Bug 54332 - [4.8 Regression] 481.wrf in SPEC CPU 2006 takes > 10GB memory to compile
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.8.0
Importance: P3 normal
Target Milestone: 4.8.0
Assignee: Diego Novillo
URL:
Keywords: memory-hog
Duplicates: 54269
Depends on:
Blocks:
 
Reported: 2012-08-20 18:06 UTC by H.J. Lu
Modified: 2012-08-22 08:59 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2012-08-21 00:00:00


Description H.J. Lu 2012-08-20 18:06:27 UTC
On Linux/x86-64, revision 190521 takes more than 10 GB of memory to compile
481.wrf in SPEC CPU 2006:

gfortran -c -o module_domain.fppized.o -I. -I./netcdf/include -O3 -funroll-loops -ffast-math -frecord-marker=4 module_domain.fppized.f90
Comment 1 H.J. Lu 2012-08-20 20:53:14 UTC
It is caused by revision 190402:

http://gcc.gnu.org/ml/gcc-cvs/2012-08/msg00379.html
Comment 2 H.J. Lu 2012-08-20 21:18:00 UTC
Revision 190401 takes 512MB of virtual memory to compile module_domain.fppized.f90,
while revision 190402 takes 10GB. This is a 20x increase.
Comment 3 H.J. Lu 2012-08-20 21:41:46 UTC
Revision 190402 memory stat:

Memory consumption before IPA
Number of expanded macros:                         0

Line Table allocations during the compilation process
Number of ordinary maps used:            3 
Ordinary map used size:                120 
Number of ordinary maps allocated:     409 
Ordinary maps allocated size:           15k
Number of macro maps used:               0 
Macro maps used size:                    0 
Macro maps locations size:               0 
Macro maps size:                         0 
Duplicated maps locations size:          0 
Total allocated maps size:              15k
Total used maps size:                  120 

Memory still allocated at the end of the compilation process
Size   Allocated        Used    Overhead
8             36k         11k       1080 
16            96k         82k       2112 
32           132k         62k       2376 
64          1460k       1338k         22k
256          440k        432k       6160 
512         8192        3584         112 
1024          16k         11k        224 
2048          12k       8192         168 
4096          84k         84k       1176 
8192          32k         32k        224 
16384         48k         48k        168 
65536         64k         64k         56 
131072        128k        128k         56 
262144        512k        512k        112 
24            68k         32k       1224 
40          3740k       1482k         58k
48          2108k        876k         32k
56          1396k        913k         21k
72            28k       2232         392 
80           104k        101k       1456 
88            16k       2200         224 
96          4252k       4140k         58k
112           16k         12k        224 
120         8192        5160         112 
184           72k         67k       1008 
128           36k         14k        504 
152         4324k       4168k         59k
168          788k        362k         10k
160          504k        487k       7056 
104         2836k       2770k         38k
312          212k        208k       2968 
136         4096        2448          56 
Total         23M         18M        331k

String pool
entries		10710
identifiers	7074 (66.05%)
slots		16384
deleted		3636
bytes		86k (17592186044415M overhead)
table size	128k
coll/search	0.6146
ins/search	0.2503
avg. entry	8.25 bytes (+/- 8.27)
longest entry	46
(No per-node statistics)
Type hash: size 2039, 996 elements, 1.024948 collisions
DECL_DEBUG_EXPR  hash: size 1021, 294 elements, 0.128814 collisions
DECL_VALUE_EXPR  hash: size 1021, 0 elements, 0.000000 collisions
No gimple statistics
No RTX statistics

Alias oracle query stats:
  refs_may_alias_p: 0 disambiguations, 0 queries
  ref_maybe_used_by_call_p: 0 disambiguations, 0 queries
  call_may_clobber_ref_p: 0 disambiguations, 0 queries

PTA query stats:
  pt_solution_includes: 0 disambiguations, 0 queries
  pt_solutions_intersect: 0 disambiguations, 0 queries
Memory consumption after IPA
Number of expanded macros:                         0

Line Table allocations during the compilation process
Number of ordinary maps used:            3 
Ordinary map used size:                120 
Number of ordinary maps allocated:     409 
Ordinary maps allocated size:           15k
Number of macro maps used:               0 
Macro maps used size:                    0 
Macro maps locations size:               0 
Macro maps size:                         0 
Duplicated maps locations size:          0 
Total allocated maps size:              15k
Total used maps size:                  120 

Memory still allocated at the end of the compilation process
Size   Allocated        Used    Overhead
8             36k         11k       1080 
16            96k         82k       2112 
32           132k         70k       2376 
64          1460k       1160k         22k
256         1724k       1722k         23k
512         8192        5120         112 
1024          28k         27k        392 
2048          12k       8192         168 
4096         116k        116k       1624 
8192          32k         32k        224 
16384       4544k       4544k         15k
65536         64k         64k         56 
131072        128k        128k         56 
262144        512k        512k        112 
524288        512k        512k         56 
24           200k         74k       3600 
40          3740k       1280k         58k
48          2108k        425k         32k
56          1396k        910k         21k
72            28k       2232         392 
80          4824k       2840k         65k
88            16k       2024         224 
96          4252k       1954k         58k
112           16k         13k        224 
120         8192        6240         112 
184           72k         67k       1008 
128           36k         14k        504 
152         4320k       3675k         59k
168          788k        362k         10k
160          504k        441k       7056 
104         2836k       2327k         38k
312          212k        209k       2968 
136         4096        2448          56 
Total         33M         23M        431k

String pool
entries		10723
identifiers	7066 (65.90%)
slots		16384
deleted		3656
bytes		86k (17592186044415M overhead)
table size	128k
coll/search	0.6149
ins/search	0.2505
avg. entry	8.23 bytes (+/- 8.27)
longest entry	46
(No per-node statistics)
Type hash: size 2039, 996 elements, 1.024948 collisions
DECL_DEBUG_EXPR  hash: size 1021, 290 elements, 0.128814 collisions
DECL_VALUE_EXPR  hash: size 1021, 0 elements, 0.000000 collisions
No gimple statistics
No RTX statistics

Alias oracle query stats:
  refs_may_alias_p: 0 disambiguations, 0 queries
  ref_maybe_used_by_call_p: 1022 disambiguations, 1 queries
  call_may_clobber_ref_p: 1022 disambiguations, 1022 queries

PTA query stats:
  pt_solution_includes: 294 disambiguations, 1048 queries
  pt_solutions_intersect: 2887 disambiguations, 289744 queries
Number of expanded macros:                         0

Line Table allocations during the compilation process
Number of ordinary maps used:            3 
Ordinary map used size:                120 
Number of ordinary maps allocated:     409 
Ordinary maps allocated size:           15k
Number of macro maps used:               0 
Macro maps used size:                    0 
Macro maps locations size:               0 
Macro maps size:                         0 
Duplicated maps locations size:          0 
Total allocated maps size:              15k
Total used maps size:                  120 

Memory still allocated at the end of the compilation process
Size   Allocated        Used    Overhead
8             24k         10k        720 
16          1132k        250k         24k
32          2648k        534k         46k
64          3884k       1108k         60k
256          504k        439k       7056 
512         4096        2048          56 
1024          12k       6144         168 
4096          88k         88k       1232 
8192          32k         32k        224 
16384         32k         32k        112 
32768         32k         32k         56 
65536        128k        128k        112 
262144        256k        256k         56 
524288        512k        512k         56 
24          7648k       1015k        134k
40          4100k       1059k         64k
48          1704k         43k         26k
56          1340k         67k         20k
72          4640k        540k         63k
80          2900k        622k         39k
88          8192        1320         112 
96          1404k        102k         19k
112           20k         12k        280 
120           12k       5160         168 
184           72k         67k       1008 
128           20k         14k        280 
152         4428k       3615k         60k
168          788k        356k         10k
160         8192         320         112 
104         3812k       1237k         52k
312          212k        209k       2968 
136           16k       2448         224 
Total         41M         12M        637k

String pool
entries		15310
identifiers	7154 (46.73%)
slots		32768
deleted		4403
bytes		89k (17592186044415M overhead)
table size	256k
coll/search	0.6751
ins/search	0.2777
avg. entry	5.97 bytes (+/- 7.90)
longest entry	46
(No per-node statistics)
Type hash: size 2039, 989 elements, 0.520436 collisions
DECL_DEBUG_EXPR  hash: size 1021, 289 elements, 0.128814 collisions
DECL_VALUE_EXPR  hash: size 1021, 0 elements, 0.000000 collisions
No gimple statistics
No RTX statistics

Alias oracle query stats:
  refs_may_alias_p: 37013 disambiguations, 46240 queries
  ref_maybe_used_by_call_p: 3806 disambiguations, 37502 queries
  call_may_clobber_ref_p: 3777 disambiguations, 3777 queries

PTA query stats:
  pt_solution_includes: 805 disambiguations, 4942 queries
  pt_solutions_intersect: 15712 disambiguations, 3621558 queries

Revision 190401 memory stat:

[hjl@gnu-ivb-1 build_peak_lnx32e-gcc.0000]$ cat nohup.out 
Memory consumption before IPA
Number of expanded macros:                         0

Line Table allocations during the compilation process
Number of ordinary maps used:            3 
Ordinary map used size:                120 
Number of ordinary maps allocated:     409 
Ordinary maps allocated size:           15k
Number of macro maps used:               0 
Macro maps used size:                    0 
Macro maps locations size:               0 
Macro maps size:                         0 
Duplicated maps locations size:          0 
Total allocated maps size:              15k
Total used maps size:                  120 

Memory still allocated at the end of the compilation process
Size   Allocated        Used    Overhead
8             36k         11k       1080 
16            96k         82k       2112 
32           132k         62k       2376 
64          1460k       1338k         22k
256          440k        432k       6160 
512         8192        3584         112 
1024          16k         11k        224 
2048          12k       8192         168 
4096          84k         84k       1176 
8192          32k         32k        224 
16384         48k         48k        168 
65536         64k         64k         56 
131072        128k        128k         56 
262144        512k        512k        112 
24            68k         32k       1224 
40          3740k       1482k         58k
48          2108k        876k         32k
56          1396k        913k         21k
72            28k       2232         392 
80           104k        101k       1456 
88            16k       2200         224 
96          4252k       4140k         58k
112           16k         12k        224 
120         8192        5160         112 
184           72k         67k       1008 
128           36k         14k        504 
152         4324k       4168k         59k
168          788k        362k         10k
160          504k        487k       7056 
104         2836k       2770k         38k
312          212k        208k       2968 
136         4096        2448          56 
Total         23M         18M        331k

String pool
entries		10710
identifiers	7074 (66.05%)
slots		16384
deleted		3636
bytes		86k (17592186044415M overhead)
table size	128k
coll/search	0.6146
ins/search	0.2503
avg. entry	8.25 bytes (+/- 8.27)
longest entry	46
(No per-node statistics)
Type hash: size 2039, 996 elements, 1.024948 collisions
DECL_DEBUG_EXPR  hash: size 1021, 294 elements, 0.128814 collisions
DECL_VALUE_EXPR  hash: size 1021, 0 elements, 0.000000 collisions
No gimple statistics
No RTX statistics

Alias oracle query stats:
  refs_may_alias_p: 0 disambiguations, 0 queries
  ref_maybe_used_by_call_p: 0 disambiguations, 0 queries
  call_may_clobber_ref_p: 0 disambiguations, 0 queries

PTA query stats:
  pt_solution_includes: 0 disambiguations, 0 queries
  pt_solutions_intersect: 0 disambiguations, 0 queries
Memory consumption after IPA
Number of expanded macros:                         0

Line Table allocations during the compilation process
Number of ordinary maps used:            3 
Ordinary map used size:                120 
Number of ordinary maps allocated:     409 
Ordinary maps allocated size:           15k
Number of macro maps used:               0 
Macro maps used size:                    0 
Macro maps locations size:               0 
Macro maps size:                         0 
Duplicated maps locations size:          0 
Total allocated maps size:              15k
Total used maps size:                  120 

Memory still allocated at the end of the compilation process
Size   Allocated        Used    Overhead
8             36k         11k       1080 
16            96k         82k       2112 
32           132k         70k       2376 
64          1460k       1160k         22k
256         1724k       1722k         23k
512         8192        5120         112 
1024          28k         27k        392 
2048          12k       8192         168 
4096         116k        116k       1624 
8192          32k         32k        224 
16384       4544k       4544k         15k
65536         64k         64k         56 
131072        128k        128k         56 
262144        512k        512k        112 
524288        512k        512k         56 
24           200k         74k       3600 
40          3740k       1280k         58k
48          2108k        425k         32k
56          1396k        910k         21k
72            28k       2232         392 
80          4824k       2840k         65k
88            16k       2024         224 
96          4252k       1954k         58k
112           16k         13k        224 
120         8192        6240         112 
184           72k         67k       1008 
128           36k         14k        504 
152         4320k       3675k         59k
168          788k        362k         10k
160          504k        441k       7056 
104         2836k       2327k         38k
312          212k        209k       2968 
136         4096        2448          56 
Total         33M         23M        431k

String pool
entries		10723
identifiers	7066 (65.90%)
slots		16384
deleted		3656
bytes		86k (17592186044415M overhead)
table size	128k
coll/search	0.6149
ins/search	0.2505
avg. entry	8.23 bytes (+/- 8.27)
longest entry	46
(No per-node statistics)
Type hash: size 2039, 996 elements, 1.024948 collisions
DECL_DEBUG_EXPR  hash: size 1021, 290 elements, 0.128814 collisions
DECL_VALUE_EXPR  hash: size 1021, 0 elements, 0.000000 collisions
No gimple statistics
No RTX statistics

Alias oracle query stats:
  refs_may_alias_p: 0 disambiguations, 0 queries
  ref_maybe_used_by_call_p: 1022 disambiguations, 1 queries
  call_may_clobber_ref_p: 1022 disambiguations, 1022 queries

PTA query stats:
  pt_solution_includes: 294 disambiguations, 1048 queries
  pt_solutions_intersect: 2887 disambiguations, 289744 queries
Number of expanded macros:                         0

Line Table allocations during the compilation process
Number of ordinary maps used:            3 
Ordinary map used size:                120 
Number of ordinary maps allocated:     409 
Ordinary maps allocated size:           15k
Number of macro maps used:               0 
Macro maps used size:                    0 
Macro maps locations size:               0 
Macro maps size:                         0 
Duplicated maps locations size:          0 
Total allocated maps size:              15k
Total used maps size:                  120 

Memory still allocated at the end of the compilation process
Size   Allocated        Used    Overhead
8             24k         10k        720 
16          1132k        250k         24k
32          2648k        534k         46k
64          3884k       1108k         60k
256          504k        439k       7056 
512         4096        2048          56 
1024          12k       6144         168 
4096          88k         88k       1232 
8192          32k         32k        224 
16384         32k         32k        112 
32768         32k         32k         56 
65536        128k        128k        112 
262144        256k        256k         56 
524288        512k        512k         56 
24          7648k       1015k        134k
40          4100k       1059k         64k
48          1704k         43k         26k
56          1340k         67k         20k
72          4640k        540k         63k
80          2900k        622k         39k
88          8192        1320         112 
96          1404k        102k         19k
112           20k         12k        280 
120           12k       5160         168 
184           72k         67k       1008 
128           20k         14k        280 
152         4428k       3615k         60k
168          788k        356k         10k
160         8192         320         112 
104         3812k       1237k         52k
312          212k        209k       2968 
136           16k       2448         224 
Total         41M         12M        637k

String pool
entries		15310
identifiers	7154 (46.73%)
slots		32768
deleted		4403
bytes		89k (17592186044415M overhead)
table size	256k
coll/search	0.6751
ins/search	0.2777
avg. entry	5.97 bytes (+/- 7.90)
longest entry	46
(No per-node statistics)
Type hash: size 2039, 989 elements, 0.520436 collisions
DECL_DEBUG_EXPR  hash: size 1021, 289 elements, 0.128814 collisions
DECL_VALUE_EXPR  hash: size 1021, 0 elements, 0.000000 collisions
No gimple statistics
No RTX statistics

Alias oracle query stats:
  refs_may_alias_p: 37013 disambiguations, 46240 queries
  ref_maybe_used_by_call_p: 3806 disambiguations, 37502 queries
  call_may_clobber_ref_p: 3777 disambiguations, 3777 queries

PTA query stats:
  pt_solution_includes: 805 disambiguations, 4942 queries
  pt_solutions_intersect: 15712 disambiguations, 3621558 queries
Comment 4 H.J. Lu 2012-08-21 02:59:15 UTC
It was introduced between revision 189101 and revision 189664
on the cxx-conversion branch.  Unfortunately, since the branch was broken
between those 2 revisions, I can't bisect further.
Comment 5 Richard Biener 2012-08-21 08:26:03 UTC
(In reply to comment #4)
> It was introduced between revision 189101 and revision 189664
> on cxx-conversion branch.  Unfortunately, since branch was broken
> between those 2 revisions, I can't bisect further.

I only see

r189664 | dnovillo | 2012-07-19 16:40:11 +0200 (Thu, 19 Jul 2012) | 3 lines


        Merge from trunk rev 189106.

in between those two revisions. r189668 is another merge from trunk,
as is r188963.  So I suspect a mismerge in r189106. r188963 was:

r188963 | dnovillo | 2012-06-25 23:10:33 +0200 (Mon, 25 Jun 2012) | 7 lines

Merge from trunk

Merged revisions 188725-188729,188731,188733,188738-188749,188751-188755,188759,
188764-188765,188771,188776,188778,188780-188789,188791-188796,188802-188808,
188814-188815,188818,188820-188824,188826-188827,188829,188833,188838,
188840-188843,188847,188849,188852-188853,188856-188860,188865-188872,
188874-188876,188880-188881,188884-188885,188891,188893,188900-188902,
188906-188909,188913,188915-188918,188922 via svnmerge from
svn+ssh://gcc.gnu.org/svn/gcc/trunk

HJ, maybe you can help narrow things down by trying different optimization
levels?  I see that using a memleak checker may not be possible with
10G memory usage.

Note that -fmem-report does not track all allocations; in particular, heap
hashtables are not tracked.
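
To illustrate the point about untracked allocations, here is a minimal sketch. It is not GCC code: `tracked_alloc` and `reported_bytes` are hypothetical names made up for the example. It shows how a report built on an allocator's own counter stays unchanged when memory is taken straight from the heap, which may be why the two memory stats above look the same.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical tracked allocator: only requests routed through it are counted.
struct tracked_alloc {
  std::size_t reported;
  tracked_alloc() : reported(0) {}
  void *alloc(std::size_t n) {
    void *p = std::malloc(n);
    assert(p != NULL);
    reported += n;
    return p;
  }
};

// Allocate `tracked` bytes through the allocator and `untracked` bytes
// directly from the heap; return what a counter-based report would show.
std::size_t reported_bytes(std::size_t tracked, std::size_t untracked) {
  tracked_alloc ta;
  void *a = ta.alloc(tracked);
  void *b = std::malloc(untracked);  // bypasses the counter (assumed analogue of a heap hash table)
  std::size_t seen = ta.reported;
  std::free(a);
  std::free(b);
  return seen;
}
```

Calling `reported_bytes(1024, 1 << 20)` reports only the 1024 tracked bytes, even though about 1 MB more was live at the time.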
Comment 6 dnovillo@google.com 2012-08-21 13:38:24 UTC
On 2012-08-20 22:59 , hjl.tools at gmail dot com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54332
>
> --- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> 2012-08-21 02:59:15 UTC ---
> It was introduced between revision 189101 and revision 189664
> on cxx-conversion branch.  Unfortunately, since branch was broken
> between those 2 revisions, I can't bisect further.
>

There was no rev 189101 in cxx-conversion.  That is a trunk revision. 
In that range of revisions, there are only merges from trunk until rev 
188129, which introduces the new hash table.

Prior to that, we have rev 188059, which makes cosmetic changes to 
configure.ac.

If it's related to the hash table, then comparing rev 188059 vs rev 
188129 may show the regression.


Diego.
Comment 7 H.J. Lu 2012-08-21 13:58:05 UTC
(In reply to comment #6)
> 
> If it's related to the hash table, then comparing rev 188059 vs rev 
> 188129 may show the regression.
> 

Neither rev 188059 nor rev 188129 will build:

../../../../gcc/gcc/graphite-sese-to-poly.c: In function ‘void build_sese_conditions_before(dom_walk_data*, basic_block)’:
../../../../gcc/gcc/graphite-sese-to-poly.c:1357:2: error: call of overloaded ‘VEC_safe_push_1(vec_t<gimple_statement_d*>**, NULL, const char [44], int, const char [29])’ is ambiguous
../../../../gcc/gcc/graphite-sese-to-poly.c:1357:2: note: candidates are: 
In file included from ../../../../gcc/gcc/basic-block.h:25:0,
                 from ../../../../gcc/gcc/tree-flow.h:27,
                 from ../../../../gcc/gcc/graphite-sese-to-poly.c:24:
../../../../gcc/gcc/vec.h:674:1: note: T& VEC_safe_push_1(vec_t<T>**, T, const char*, unsigned int, const char*) [with T = gimple_statement_d*; vec_allocation_t A = (vec_allocation_t)0u]
../../../../gcc/gcc/vec.h:682:1: note: T* VEC_safe_push_1(vec_t<T>**, const T*, const char*, unsigned int, const char*) [with T = gimple_statement_d*; vec_allocation_t A = (vec_allocation_t)0u]
make[2]: *** [graphite-sese-to-poly.o] Error 1
Comment 8 dnovillo@google.com 2012-08-21 14:06:34 UTC
On 2012-08-21 09:58 , hjl.tools at gmail dot com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54332
>
> --- Comment #7 from H.J. Lu <hjl.tools at gmail dot com> 2012-08-21 13:58:05 UTC ---
> (In reply to comment #6)
>>
>> If it's related to the hash table, then comparing rev 188059 vs rev
>> 188129 may show the regression.
>>
>
> Neither rev 188059 nor rev 188129 will build:
>
> ../../../../gcc/gcc/graphite-sese-to-poly.c: In function ‘void
> build_sese_conditions_before(dom_walk_data*, basic_block)’:
> ../../../../gcc/gcc/graphite-sese-to-poly.c:1357:2: error: call of overloaded
> ‘VEC_safe_push_1(vec_t<gimple_statement_d*>**, NULL, const char [44], int,
> const char [29])’ is ambiguous
> ../../../../gcc/gcc/graphite-sese-to-poly.c:1357:2: note: candidates are:
> In file included from ../../../../gcc/gcc/basic-block.h:25:0,
>                   from ../../../../gcc/gcc/tree-flow.h:27,
>                   from ../../../../gcc/gcc/graphite-sese-to-poly.c:24:
> ../../../../gcc/gcc/vec.h:674:1: note: T& VEC_safe_push_1(vec_t<T>**, T, const
> char*, unsigned int, const char*) [with T = gimple_statement_d*;
> vec_allocation_t A = (vec_allocation_t)0u]
> ../../../../gcc/gcc/vec.h:682:1: note: T* VEC_safe_push_1(vec_t<T>**, const T*,
> const char*, unsigned int, const char*) [with T = gimple_statement_d*;
> vec_allocation_t A = (vec_allocation_t)0u]
> make[2]: *** [graphite-sese-to-poly.o] Error 1
>

Huh, odd.  Can you try this patchlet on top of those revs?  It builds 
for me with this applied:

diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index cdabd73..5712e58 100644
--- a/gcc/graphite-sese-to-poly.c
+++ b/gcc/graphite-sese-to-poly.c
@@ -1354,7 +1354,7 @@ build_sese_conditions_before (struct dom_walk_data *dw_data,
        if (e->flags & EDGE_TRUE_VALUE)
         VEC_safe_push (gimple, heap, *cases, stmt);
        else
-       VEC_safe_push (gimple, heap, *cases, NULL);
+       VEC_safe_push (gimple, heap, *cases, (gimple) NULL);
      }

    gbb = gbb_from_bb (bb);
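
A minimal sketch of why the cast in the patchlet helps. It uses simplified, hypothetical stand-ins (`stmt`, `gimple_t`, `push`, `push_null`) rather than the real `vec.h` templates. With one overload taking the element by value and another taking a pointer to the element, a bare `NULL` converts equally well to both parameter types when the element is itself a pointer. Giving the argument the element type leaves only one viable candidate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct stmt { int id; };     // hypothetical stand-in for gimple_statement_d
typedef stmt *gimple_t;      // hypothetical stand-in for the `gimple` typedef

// Overload 1: takes the element by value (like the `T` candidate).
int push(std::vector<gimple_t> &v, gimple_t obj) {
  v.push_back(obj);
  return 1;
}

// Overload 2: takes a pointer to the element (like the `const T*` candidate).
int push(std::vector<gimple_t> &v, const gimple_t *ptr) {
  assert(ptr != NULL);
  v.push_back(*ptr);
  return 2;
}

// push(v, NULL) would not compile here: NULL converts equally well to
// `gimple_t` and to `const gimple_t *`, so the call is ambiguous.
int push_null(std::vector<gimple_t> &v) {
  return push(v, (gimple_t) NULL);  // the cast selects overload 1
}
```

In this sketch `push_null(v)` returns 1 and appends a null element, while `push(v, &g)` for some `gimple_t g` selects the pointer overload and returns 2.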
Comment 9 H.J. Lu 2012-08-21 16:20:37 UTC
Revision 188059 is bad:

f951: out of memory allocating 36872 bytes after a total of 583266304 bytes
Comment 10 dnovillo@google.com 2012-08-21 16:44:10 UTC
On Tue, Aug 21, 2012 at 12:20 PM, hjl.tools at gmail dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54332
>
> --- Comment #9 from H.J. Lu <hjl.tools at gmail dot com> 2012-08-21 16:20:37 UTC ---
> Revision 188059 is bad:
>
> f951: out of memory allocating 36872 bytes after a total of 583266304 bytes

Thanks.  Does rev 188129 show the same thing?  The next revisions to try are:

188040 (TREE_CHECK macros)
187954 (merge from trunk)
187836 (initial VEC conversion)
187735 (merge from trunk)

I now have access to SPEC2006, I'll try a build.
Comment 11 H.J. Lu 2012-08-21 16:51:24 UTC
It is caused by revision 187836:

http://gcc.gnu.org/ml/gcc-cvs/2012-05/msg00833.html

The C++ implementation of vec.[ch] has a massive memory leak.
Comment 12 Diego Novillo 2012-08-21 16:55:34 UTC
Thanks.  I'll work on a fix.
Comment 13 H.J. Lu 2012-08-21 17:10:09 UTC
It can be reproduced with -frecord-marker=4 -O -funswitch-loops.
Comment 14 H.J. Lu 2012-08-21 17:41:10 UTC
It failed even with

diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 3d650bf..30ac4b5 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -149,7 +149,7 @@ tree_ssa_loop_unswitch (void)
 static bool
 gate_tree_ssa_loop_unswitch (void)
 {
-  return flag_unswitch_loops != 0;
+  return false;
 }
 
 struct gimple_opt_pass pass_tree_unswitch =
Comment 15 H.J. Lu 2012-08-21 17:57:59 UTC
It failed with

diff --git a/gcc/passes.c b/gcc/passes.c
index b6fe18e..10174c4 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1449,7 +1449,6 @@ init_optimization_passes (void)
     NEXT_PASS (pass_lim);
     NEXT_PASS (pass_copy_prop);
     NEXT_PASS (pass_dce_loop);
-    NEXT_PASS (pass_tree_unswitch);
     NEXT_PASS (pass_scev_cprop);
     NEXT_PASS (pass_record_bounds);
     NEXT_PASS (pass_check_data_deps);

Somehow just processing the -funswitch-loops command-line option
triggers this problem.
Comment 16 H.J. Lu 2012-08-21 18:08:49 UTC
There are:

opts.c:typedef char *char_p; /* For DEF_VEC_P.  */
opts.c:DEF_VEC_P(char_p);
opts.c:DEF_VEC_ALLOC_P(char_p,heap);
opts-global.c:typedef const char *const_char_p; /* For DEF_VEC_P.  */
opts-global.c:DEF_VEC_P(const_char_p);
opts-global.c:DEF_VEC_ALLOC_P(const_char_p,heap);

Will they cause problems if other files define similar types?
Comment 17 dnovillo@google.com 2012-08-21 18:19:10 UTC
On 2012-08-21 14:08 , hjl.tools at gmail dot com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54332
>
> --- Comment #16 from H.J. Lu <hjl.tools at gmail dot com> 2012-08-21 18:08:49 UTC ---
> There are:
>
> opts.c:typedef char *char_p; /* For DEF_VEC_P.  */
> opts.c:DEF_VEC_P(char_p);
> opts.c:DEF_VEC_ALLOC_P(char_p,heap);
> opts-global.c:typedef const char *const_char_p; /* For DEF_VEC_P.  */
> opts-global.c:DEF_VEC_P(const_char_p);
> opts-global.c:DEF_VEC_ALLOC_P(const_char_p,heap);
>
> Will they cause problems if other files define similar types?
>

They shouldn't.  DEF_VEC_* expands to a NOP now.  The allocation 
strategy is only needed during the actual allocation call.  So, in the 
case of opts.c, that would be in add_comma_separated_to_vector() (the 
call to VEC_safe_push).

Those two vectors are only used for -finstrument-options..., though.  So 
that does not seem related.

The call to postpone_unknown_option_warning in opts-global.c seems also 
unrelated.  It's only used when processing unknown options.  That vector 
is popped when the unknown options are freed, so that can't be it either.
Comment 18 dnovillo@google.com 2012-08-21 18:31:51 UTC
OK, I think this is the hunk that's causing grief:

diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index 39f444f..35100d1 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -4392,7 +4392,6 @@ df_bb_verify (basic_block bb)
        if (!INSN_P (insn))
          continue;
        df_insn_refs_verify (&collection_rec, bb, insn, true);
-      df_free_collection_rec (&collection_rec);
      }

    /* Do the artificial defs and uses.  */


I remember that I ran into this during the VEC conversion
(http://gcc.gnu.org/ml/gcc/2012-05/msg00271.html) and after some
discussion I ended up convincing myself that taking it out was harmless.
Clearly, I was wrong.

I've hooked gdb to the running f951 and it's stuck in df_bb_verify().

Odd that this has not triggered anywhere else.
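
A simplified model of the effect of that hunk, as a sketch only: `collection_rec`, `collect_refs` and `peak_live_refs` are hypothetical stand-ins, not the real df-scan.c data structures. The assumption is that each insn adds its refs to a shared scratch record. If the record is not released inside the loop, live refs grow with the number of insns in the block instead of staying at the per-insn size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the collection record: scratch storage for refs.
struct collection_rec {
  std::vector<int> refs;
};

// Stand-in for the per-insn verify step: gathers the refs of one insn.
static void collect_refs(collection_rec &rec, int refs_in_insn) {
  assert(refs_in_insn >= 0);
  for (int i = 0; i < refs_in_insn; ++i)
    rec.refs.push_back(i);
}

// Walk a block's insns and return the peak number of live refs.
// `free_each_insn` models the df_free_collection_rec call in the loop.
std::size_t peak_live_refs(const std::vector<int> &insns, bool free_each_insn) {
  collection_rec rec;
  std::size_t peak = 0;
  for (std::size_t i = 0; i < insns.size(); ++i) {
    collect_refs(rec, insns[i]);
    if (rec.refs.size() > peak)
      peak = rec.refs.size();
    if (free_each_insn)
      rec.refs.clear();  // release the scratch refs before the next insn
  }
  return peak;
}
```

For a block of 1000 insns with 3 refs each, the peak in this model is 3 when the record is released per insn and 3000 when it is not.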
Comment 19 H.J. Lu 2012-08-21 18:54:45 UTC
(In reply to comment #15)
> It failed with
> 
> diff --git a/gcc/passes.c b/gcc/passes.c
> index b6fe18e..10174c4 100644
> --- a/gcc/passes.c
> +++ b/gcc/passes.c
> @@ -1449,7 +1449,6 @@ init_optimization_passes (void)
>      NEXT_PASS (pass_lim);
>      NEXT_PASS (pass_copy_prop);
>      NEXT_PASS (pass_dce_loop);
> -    NEXT_PASS (pass_tree_unswitch);
>      NEXT_PASS (pass_scev_cprop);
>      NEXT_PASS (pass_record_bounds);
>      NEXT_PASS (pass_check_data_deps);
> 
> Somehow just processing the -funswitch-loops command-line option
> triggers this problem.

With --enable-gather-detailed-mem-stats, I got

Alloc-pool Kind         Elt size  Pools  Allocated (elts)            Peak (elts)            Leak (elts)

-df_scan ref base           64         18   24808192(    387628)   11869056(    185454)          0(         0)
-df_scan ref artificial     72         18   15168528(    210674)    2044944(     28402)          0(         0)
+df_scan ref base           64         18  513091264(   8017051)  500077440(   7813710)          0(         0)
+df_scan ref artificial     72         18  599905368(   8332019)    2044944(     28402)          0(         0)
 elt_loc_list               32         27    7982112(    249441)    2399488(     74984)          0(         0)  
-df_scan ref regular        72         18   71483184(    992822)   45955584(    638272)          0(         0)
+df_scan ref regular        72         18 2091195360(  29044380) 2065579776(  28688608)          0(         0)
 df_scan insn               56         18    7681016(    137161)    3340848(     59658)          0(         0)

-Total                              15775  253131240
+Total                              16067 3345899232
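
For reading the table above, a small sketch of the arithmetic involved. The helper names `growth` and `pool_bytes` are made up for illustration. It assumes each "Allocated" byte count is the element count times the element size, which holds for the rows shown.

```cpp
#include <cassert>

// Growth factor between two byte counts taken from the table.
double growth(double before, double after) {
  assert(before > 0);
  return after / before;
}

// Bytes implied by an element count and an element size.
unsigned long long pool_bytes(unsigned long long elts, unsigned elt_size) {
  return elts * elt_size;
}
```

With the totals above, `growth(253131240, 3345899232)` comes out at roughly 13x, and the "df_scan ref regular" pool alone grows by roughly 29x.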
Comment 20 dnovillo@google.com 2012-08-21 19:07:33 UTC
On 2012-08-21 14:54 , hjl.tools at gmail dot com wrote:

> With --enable-gather-detailed-mem-stats, I got
>
> Alloc-pool Kind         Elt size  Pools  Allocated (elts)            Peak (elts)            Leak (elts)
>
> -df_scan ref base           64         18   24808192(    387628)   11869056(    185454)          0(         0)
> -df_scan ref artificial     72         18   15168528(    210674)    2044944(     28402)          0(         0)
> +df_scan ref base           64         18  513091264(   8017051)  500077440(   7813710)          0(         0)
> +df_scan ref artificial     72         18  599905368(   8332019)    2044944(     28402)          0(         0)
>  elt_loc_list               32         27    7982112(    249441)    2399488(     74984)          0(         0)
> -df_scan ref regular        72         18   71483184(    992822)   45955584(    638272)          0(         0)
> +df_scan ref regular        72         18 2091195360(  29044380) 2065579776(  28688608)          0(         0)
>  df_scan insn               56         18    7681016(    137161)    3340848(     59658)          0(         0)
>
> -Total                              15775  253131240
> +Total                              16067 3345899232
>

That agrees with what I found, thanks.  I've added a link to the 
discussion about the df verifier.  The vectors need to be cleared, but I 
can't just free the vectors:

Stack vectors must be initially allocated with VEC_stack_alloc.
gcc/df-scan.c: In function 'unsigned int df_count_refs(bool, bool, bool)':
gcc/df-scan.c:1507:1: internal compiler error: in vec_reserve, at vec.h:1111
  }
Comment 21 Steven Bosscher 2012-08-21 19:19:58 UTC
(In reply to comment #18)
> Odd that this has not triggered anywhere else.

It may have triggered elsewhere, see PR54343 ...
Comment 22 H.J. Lu 2012-08-21 19:27:50 UTC
This seems to work:

diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index 35100d1..39f444f 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -4392,6 +4392,7 @@ df_bb_verify (basic_block bb)
       if (!INSN_P (insn))
         continue;
       df_insn_refs_verify (&collection_rec, bb, insn, true);
+      df_free_collection_rec (&collection_rec);
     }
 
   /* Do the artificial defs and uses.  */
diff --git a/gcc/vec.h b/gcc/vec.h
index cc7e819..3a298ff 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -1031,21 +1031,9 @@ vec_reserve (vec_t<T> *vec_, int reserve MEM_STAT_DECL)
 					      sizeof (T), false
 					      PASS_MEM_STAT);
   else
-    {
-      /* Only allow stack vectors when re-growing them.  The initial
-	 allocation of stack vectors must be done with the
-	 VEC_stack_alloc macro, because it uses alloca() for the
-	 allocation.  */
-      if (vec_ == NULL)
-	{
-	  fprintf (stderr, "Stack vectors must be initially allocated "
-		   "with VEC_stack_alloc.\n");
-	  gcc_unreachable ();
-	}
-      return (vec_t<T> *) vec_stack_o_reserve (vec_, reserve,
-					       offsetof (vec_t<T>, vec),
-					       sizeof (T) PASS_MEM_STAT);
-    }
+    return (vec_t<T> *) vec_stack_o_reserve (vec_, reserve,
+					     offsetof (vec_t<T>, vec),
+					     sizeof (T) PASS_MEM_STAT);
 }
Comment 23 dnovillo@google.com 2012-08-21 19:50:12 UTC
On 2012-08-21 15:27 , hjl.tools at gmail dot com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54332
>
> --- Comment #22 from H.J. Lu <hjl.tools at gmail dot com> 2012-08-21 19:27:50 UTC ---
> This seems to work:
>
> diff --git a/gcc/df-scan.c b/gcc/df-scan.c
> index 35100d1..39f444f 100644
> --- a/gcc/df-scan.c
> +++ b/gcc/df-scan.c
> @@ -4392,6 +4392,7 @@ df_bb_verify (basic_block bb)
>         if (!INSN_P (insn))
>           continue;
>         df_insn_refs_verify (&collection_rec, bb, insn, true);
> +      df_free_collection_rec (&collection_rec);
>       }
>
>     /* Do the artificial defs and uses.  */
> diff --git a/gcc/vec.h b/gcc/vec.h
> index cc7e819..3a298ff 100644
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -1031,21 +1031,9 @@ vec_reserve (vec_t<T> *vec_, int reserve MEM_STAT_DECL)
>                             sizeof (T), false
>                             PASS_MEM_STAT);
>     else
> -    {
> -      /* Only allow stack vectors when re-growing them.  The initial
> -     allocation of stack vectors must be done with the
> -     VEC_stack_alloc macro, because it uses alloca() for the
> -     allocation.  */
> -      if (vec_ == NULL)
> -    {
> -      fprintf (stderr, "Stack vectors must be initially allocated "
> -           "with VEC_stack_alloc.\n");
> -      gcc_unreachable ();
> -    }
> -      return (vec_t<T> *) vec_stack_o_reserve (vec_, reserve,
> -                           offsetof (vec_t<T>, vec),
> -                           sizeof (T) PASS_MEM_STAT);
> -    }
> +    return (vec_t<T> *) vec_stack_o_reserve (vec_, reserve,
> +                         offsetof (vec_t<T>, vec),
> +                         sizeof (T) PASS_MEM_STAT);
>   }
>

The problem with this is that you are switching a stack vec into a heap 
vec.  This may not always be what the caller wanted.


The other alternative is to truncate the vectors after the call to 
df_insn_refs_verify().
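
A minimal sketch of the truncation alternative, using std::vector as a
stand-in for GCC's vectors. CollectionRec, collect_refs and peak_refs are
hypothetical names for illustration only, not functions in df-scan.c. The
point is that emptying the record after each insn keeps its peak size at
the largest single insn, while never emptying it lets the refs of every
insn in the block pile up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a df collection record: it is only meant to
// hold the refs gathered for the insn currently being verified.
struct CollectionRec {
    std::vector<int> refs;
};

// Stand-in for the per-insn collection step: appends this insn's refs.
void collect_refs(CollectionRec &rec, int refs_in_insn) {
    for (int i = 0; i < refs_in_insn; ++i)
        rec.refs.push_back(i);
}

// Walks the insns of one block and returns the largest number of refs the
// record held at any point. With reset_each_insn == false the record is
// never emptied, which models the regression.
std::size_t peak_refs(const std::vector<int> &insns, bool reset_each_insn) {
    CollectionRec rec;
    std::size_t peak = 0;
    for (int n : insns) {
        collect_refs(rec, n);
        if (rec.refs.size() > peak)
            peak = rec.refs.size();
        if (reset_each_insn)
            rec.refs.clear();  // truncate: length drops to 0, storage is kept
    }
    return peak;
}
```

For a block whose insns carry 3, 5 and 2 refs, peak_refs returns 5 with
truncation and 10 without it. Truncating keeps the existing storage, so
unlike freeing it never forces a fresh allocation on the next insn.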
Comment 24 H.J. Lu 2012-08-21 19:53:14 UTC
(In reply to comment #23)
> 
> The problem with this is that you are switching a stack vec into a heap 
> vec.  This may not always be what the caller wanted.

My patch just restores the old behavior.

> 
> The other alternative is to truncate the vectors after the call to 
> df_insn_refs_verify().

This should be a separate patch, not part of the C++ conversion.
Comment 25 dnovillo@google.com 2012-08-21 20:49:16 UTC
On 2012-08-21 15:53 , hjl.tools at gmail dot com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54332
>
> --- Comment #24 from H.J. Lu <hjl.tools at gmail dot com> 2012-08-21 19:53:14 UTC ---
> (In reply to comment #23)
>>
>> The problem with this is that you are switching a stack vec into a heap
>> vec.  This may not always be what the caller wanted.
>
> My patch just restores the old behavior.

You are right.  This was always the case.  I added the extra check to 
guard against inadvertent *initial* heap allocations for stack vectors.

But now that I look at the old code, that was always the behavior.  After 
the first iteration of the loop, subsequent operations on the stack 
vectors move them to the heap.
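
A simplified model of that behavior, assuming a vector whose storage
starts in a caller-owned buffer and falls back to the heap once it has
been freed or outgrown. StackVec, stack_alloc, vec_free and push are
hypothetical names; this is not the code in vec.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical model of a "stack" vector.
struct StackVec {
    int *data = nullptr;
    std::size_t len = 0;
    std::size_t cap = 0;
    bool on_heap = false;
};

// Initial allocation: point the vector at a caller-owned buffer.
void stack_alloc(StackVec &v, int *buf, std::size_t cap) {
    v.data = buf;
    v.len = 0;
    v.cap = cap;
    v.on_heap = false;
}

// Freeing releases heap storage (if any) and leaves an empty, null vector.
void vec_free(StackVec &v) {
    if (v.on_heap)
        std::free(v.data);
    v = StackVec();
}

// Pushing onto a full or null vector re-grows it, and re-growth always
// goes to the heap because the original buffer cannot be extended.
void push(StackVec &v, int x) {
    if (v.len == v.cap) {
        std::size_t new_cap = v.cap ? v.cap * 2 : 4;
        int *p = static_cast<int *>(std::malloc(new_cap * sizeof(int)));
        if (!p)
            std::abort();
        if (v.len)
            std::memcpy(p, v.data, v.len * sizeof(int));
        if (v.on_heap)
            std::free(v.data);
        v.data = p;
        v.cap = new_cap;
        v.on_heap = true;
    }
    v.data[v.len++] = x;
}
```

In this model, a vector set up with stack_alloc stays in the caller's
buffer for the first pass; after vec_free, the next push finds a null
vector and allocates from the heap instead.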

The patch is OK with a ChangeLog and bootstrap testing.


Thanks!  Diego.
Comment 26 hjl@gcc.gnu.org 2012-08-21 21:07:07 UTC
Author: hjl
Date: Tue Aug 21 21:07:01 2012
New Revision: 190576

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190576
Log:
Restore df_free_collection_rec call in df_bb_verify

	PR middle-end/54332
	* df-scan.c (df_bb_verify): Restore df_free_collection_rec call
	inside the insn traversal loop.

	* vec.h (vec_reserve): Remove the stack allocation check.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/df-scan.c
    trunk/gcc/vec.h
Comment 27 H.J. Lu 2012-08-21 21:11:11 UTC
Fixed.
Comment 28 Steven Bosscher 2012-08-22 08:59:04 UTC
*** Bug 54269 has been marked as a duplicate of this bug. ***