Whole program optimization and functions-only-called-once.

Richard Guenther richard.guenther@gmail.com
Sun Nov 22 18:01:00 GMT 2009


On Sun, Nov 22, 2009 at 5:18 AM, Vladimir Makarov <vmakarov@redhat.com> wrote:
> Toon Moene wrote:
>>
>> Toon Moene wrote:
>>
>>> Jan Hubicka wrote:
>>
>>>> :) It would be nice to know what caused the OOM.  Is it just one of
>>>> the passes exploding in the presence of very large bodies?
>>>
>>> I'll try to figure this out over the weekend (sorry, don't have more
>>> spare time).
>>>
>>> It's most probably a single pass, because the memory requirements kept
>>> creeping up to 12.5 Gbytes from 10, slowly increasing all the time over
>>> several minutes.
>>
>> Here are the tracebacks from gdb attached to the lto1 process, while it
>> was expanding from 7 to 12 Gb:
>>
>> (gdb) where
>> #0  0x00002b961290491e in memset () from /lib/libc.so.6
>> #1  0x0000000000530632 in create_loop_tree_nodes (loops_p=1 '\001') at
>> ../../gcc/gcc/ira-build.c:155
>> #2  ira_build (loops_p=1 '\001') at ../../gcc/gcc/ira-build.c:2773
>> #3  0x000000000052a3db in ira () at ../../gcc/gcc/ira.c:3179
>> #4  rest_of_handle_ira () at ../../gcc/gcc/ira.c:3350
>> #5  0x00000000005867ff in execute_one_pass (pass=0xd2f500) at
>> ../../gcc/gcc/passes.c:1522
>> #6  0x0000000000586a75 in execute_pass_list (pass=0xd2f500) at
>> ../../gcc/gcc/passes.c:1577
>> #7  0x0000000000586a87 in execute_pass_list (pass=0xdb0a20) at
>> ../../gcc/gcc/passes.c:1578
>> #8  0x0000000000656e1c in tree_rest_of_compilation (fndecl=0x2b961f5d6a00)
>> at ../../gcc/gcc/tree-optimize.c:407
>> #9  0x0000000000781c8c in cgraph_expand_function (node=0x2b9618367000) at
>> ../../gcc/gcc/cgraphunit.c:1178
>> #10 0x00000000007835ed in cgraph_expand_all_functions () at
>> ../../gcc/gcc/cgraphunit.c:1245
>> #11 cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1491
>> #12 0x00000000004165cf in lto_main (debug_p=<value optimized out>) at
>> ../../gcc/gcc/lto/lto.c:2054
>> #13 0x000000000061a28e in compile_file (argc=1244, argv=0x291cfb0) at
>> ../../gcc/gcc/toplev.c:1049
>> #14 do_compile (argc=1244, argv=0x291cfb0) at ../../gcc/gcc/toplev.c:2404
>> #15 toplev_main (argc=1244, argv=0x291cfb0) at ../../gcc/gcc/toplev.c:2446
>> #16 0x00002b96128a7a8d in __libc_start_main () from /lib/libc.so.6
>> #17 0x0000000000400249 in _start () at ../sysdeps/x86_64/elf/start.S:113
>>
>> (gdb) where
>> #0  0x00002b961290490f in memset () from /lib/libc.so.6
>> #1  0x0000000000530632 in create_loop_tree_nodes (loops_p=1 '\001') at
>> ../../gcc/gcc/ira-build.c:155
>> #2  ira_build (loops_p=1 '\001') at ../../gcc/gcc/ira-build.c:2773
>> #3  0x000000000052a3db in ira () at ../../gcc/gcc/ira.c:3179
>> #4  rest_of_handle_ira () at ../../gcc/gcc/ira.c:3350
>> #5  0x00000000005867ff in execute_one_pass (pass=0xd2f500) at
>> ../../gcc/gcc/passes.c:1522
>> #6  0x0000000000586a75 in execute_pass_list (pass=0xd2f500) at
>> ../../gcc/gcc/passes.c:1577
>> #7  0x0000000000586a87 in execute_pass_list (pass=0xdb0a20) at
>> ../../gcc/gcc/passes.c:1578
>> #8  0x0000000000656e1c in tree_rest_of_compilation (fndecl=0x2b961f5d6a00)
>> at ../../gcc/gcc/tree-optimize.c:407
>> #9  0x0000000000781c8c in cgraph_expand_function (node=0x2b9618367000) at
>> ../../gcc/gcc/cgraphunit.c:1178
>> #10 0x00000000007835ed in cgraph_expand_all_functions () at
>> ../../gcc/gcc/cgraphunit.c:1245
>> #11 cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1491
>> #12 0x00000000004165cf in lto_main (debug_p=<value optimized out>) at
>> ../../gcc/gcc/lto/lto.c:2054
>> #13 0x000000000061a28e in compile_file (argc=1244, argv=0x291cfb0) at
>> ../../gcc/gcc/toplev.c:1049
>> #14 do_compile (argc=1244, argv=0x291cfb0) at ../../gcc/gcc/toplev.c:2404
>> #15 toplev_main (argc=1244, argv=0x291cfb0) at ../../gcc/gcc/toplev.c:2446
>> #16 0x00002b96128a7a8d in __libc_start_main () from /lib/libc.so.6
>> #17 0x0000000000400249 in _start () at ../sysdeps/x86_64/elf/start.S:113
>>
>> (gdb) where
>> #0  0x00002b96128fc26c in ?? () from /lib/libc.so.6
>> #1  0x00002b96128fde24 in calloc () from /lib/libc.so.6
>> #2  0x0000000000a6ea7a in xcalloc (nelem=19, elsize=8) at
>> ../../gcc/libiberty/xmalloc.c:162
>> #3  0x000000000099d8c0 in get_loop_body (loop=0x2b966e38ecf0) at
>> ../../gcc/gcc/cfgloop.c:819
>> #4  0x000000000099e14c in get_loop_exit_edges (loop=0x2b966e38ecf0) at
>> ../../gcc/gcc/cfgloop.c:1157
>> #5  0x0000000000530632 in create_loop_tree_nodes (loops_p=1 '\001') at
>> ../../gcc/gcc/ira-build.c:155
>> #6  ira_build (loops_p=1 '\001') at ../../gcc/gcc/ira-build.c:2773
>> #7  0x000000000052a3db in ira () at ../../gcc/gcc/ira.c:3179
>> #8  rest_of_handle_ira () at ../../gcc/gcc/ira.c:3350
>> #9  0x00000000005867ff in execute_one_pass (pass=0xd2f500) at
>> ../../gcc/gcc/passes.c:1522
>> #10 0x0000000000586a75 in execute_pass_list (pass=0xd2f500) at
>> ../../gcc/gcc/passes.c:1577
>> #11 0x0000000000586a87 in execute_pass_list (pass=0xdb0a20) at
>> ../../gcc/gcc/passes.c:1578
>> #12 0x0000000000656e1c in tree_rest_of_compilation (fndecl=0x2b961f5d6a00)
>> at ../../gcc/gcc/tree-optimize.c:407
>> #13 0x0000000000781c8c in cgraph_expand_function (node=0x2b9618367000) at
>> ../../gcc/gcc/cgraphunit.c:1178
>> #14 0x00000000007835ed in cgraph_expand_all_functions () at
>> ../../gcc/gcc/cgraphunit.c:1245
>> #15 cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1491
>> #16 0x00000000004165cf in lto_main (debug_p=<value optimized out>) at
>> ../../gcc/gcc/lto/lto.c:2054
>> #17 0x000000000061a28e in compile_file (argc=1244, argv=0x291cfb0) at
>> ../../gcc/gcc/toplev.c:1049
>> #18 do_compile (argc=1244, argv=0x291cfb0) at ../../gcc/gcc/toplev.c:2404
>> #19 toplev_main (argc=1244, argv=0x291cfb0) at ../../gcc/gcc/toplev.c:2446
>> #20 0x00002b96128a7a8d in __libc_start_main () from /lib/libc.so.6
>> #21 0x0000000000400249 in _start () at ../sysdeps/x86_64/elf/start.S:113
>>
>> (gdb) where
>> #0  0x00002b961290491e in memset () from /lib/libc.so.6
>> #1  0x0000000000530632 in create_loop_tree_nodes (loops_p=1 '\001') at
>> ../../gcc/gcc/ira-build.c:155
>> #2  ira_build (loops_p=1 '\001') at ../../gcc/gcc/ira-build.c:2773
>> #3  0x000000000052a3db in ira () at ../../gcc/gcc/ira.c:3179
>> #4  rest_of_handle_ira () at ../../gcc/gcc/ira.c:3350
>> #5  0x00000000005867ff in execute_one_pass (pass=0xd2f500) at
>> ../../gcc/gcc/passes.c:1522
>> #6  0x0000000000586a75 in execute_pass_list (pass=0xd2f500) at
>> ../../gcc/gcc/passes.c:1577
>> #7  0x0000000000586a87 in execute_pass_list (pass=0xdb0a20) at
>> ../../gcc/gcc/passes.c:1578
>> #8  0x0000000000656e1c in tree_rest_of_compilation (fndecl=0x2b961f5d6a00)
>> at ../../gcc/gcc/tree-optimize.c:407
>> #9  0x0000000000781c8c in cgraph_expand_function (node=0x2b9618367000) at
>> ../../gcc/gcc/cgraphunit.c:1178
>> #10 0x00000000007835ed in cgraph_expand_all_functions () at
>> ../../gcc/gcc/cgraphunit.c:1245
>> #11 cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1491
>> #12 0x00000000004165cf in lto_main (debug_p=<value optimized out>) at
>> ../../gcc/gcc/lto/lto.c:2054
>> #13 0x000000000061a28e in compile_file (argc=1244, argv=0x291cfb0) at
>> ../../gcc/gcc/toplev.c:1049
>> #14 do_compile (argc=1244, argv=0x291cfb0) at ../../gcc/gcc/toplev.c:2404
>> #15 toplev_main (argc=1244, argv=0x291cfb0) at ../../gcc/gcc/toplev.c:2446
>> #16 0x00002b96128a7a8d in __libc_start_main () from /lib/libc.so.6
>> #17 0x0000000000400249 in _start () at ../sysdeps/x86_64/elf/start.S:113
>>
>> So it seems to be stuck in a part of the IRA pass ...
>>
>> Hope this helps (it's close to impossible to build a test case out of
>> this, because the program consists of around 3/4 of our ~ 1 million lines
>> of Fortran code).
>>
> I'd recommend trying -fira-region=one to see what the memory requirements
> would be.
>
> For such a big function the conflict table would be very big.  This is a
> common problem for any RA that uses a conflict table.  IRA uses a
> sophisticated algorithm for conflict table compression.  I don't even have
> an idea right now how to improve it.  IRA has a parameter,
> ira-max-conflict-table-size, which affects the decision to use the conflict
> table.  If IRA decides not to use the conflict table, the quality of RA
> worsens.  The default value is 1GB.  But I guess the conflict table in your
> case would be bigger.  So you need to play with this parameter.
>
> If -fira-region=one works for you, we could prohibit regional allocation
> for functions whose number of basic blocks exceeds some threshold.
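
To put a rough number on "very big": the sketch below estimates the size of
an uncompressed, lower-triangular conflict bit matrix as a function of the
number of pseudos.  This is only a back-of-the-envelope upper bound and not
IRA's actual (compressed) layout; the function name and the sample pseudo
counts are made up for illustration.

```python
def dense_conflict_table_bytes(num_pseudos: int) -> int:
    """Bytes needed for one bit per unordered pair of pseudos.

    Assumption: a plain triangular bit matrix with no compression,
    which is NOT how IRA stores conflicts; it only shows the
    quadratic growth that makes huge functions a problem.
    """
    pairs = num_pseudos * (num_pseudos - 1) // 2
    return (pairs + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    # Hypothetical pseudo counts, chosen only to show the growth rate.
    for n in (10_000, 100_000, 1_000_000):
        gib = dense_conflict_table_bytes(n) / 2**30
        print(f"{n:>9} pseudos -> {gib:.3f} GiB")
```

Growing the number of pseudos by 10x grows the table by about 100x, which is
why a single huge function can blow past any fixed size limit.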

Can't we split a function at points with a minimal number of live pseudos
and allocate the resulting regions independently?  Of course there would be
hard constraints on the entry of each such region, just like we have on
function entry for parameters.

No idea if that would help in practice, of course.
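
A minimal sketch of what finding such split points could look like, assuming
a heavily simplified model: the function is a straight line of instructions
and each pseudo has one live range given as (first_insn, last_insn).  Real
functions have a CFG and live ranges with holes, so this is not how GCC would
do it; all names here are hypothetical.

```python
def live_across_boundaries(num_insns, live_ranges):
    """Return crossing[i]: pseudos live across the boundary between
    instruction i and instruction i + 1."""
    delta = [0] * num_insns
    for start, end in live_ranges:
        if end > start:
            delta[start] += 1  # starts crossing at boundary `start`
            delta[end] -= 1    # last crossed boundary is `end - 1`
    crossing = []
    running = 0
    for i in range(num_insns - 1):
        running += delta[i]
        crossing.append(running)
    return crossing


def candidate_split_points(num_insns, live_ranges, max_live):
    """Boundaries where at most `max_live` pseudos are live."""
    crossing = live_across_boundaries(num_insns, live_ranges)
    return [i for i, n in enumerate(crossing) if n <= max_live]


if __name__ == "__main__":
    # Eight instructions, six pseudos; one pseudo spans the whole function.
    ranges = [(0, 3), (1, 2), (2, 3), (4, 7), (5, 6), (0, 7)]
    print(live_across_boundaries(8, ranges))
    print(candidate_split_points(8, ranges, max_live=1))
```

The pseudos still live at a chosen boundary are exactly the ones that would
need a fixed location on entry to the next region, which is the hard
constraint mentioned above.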

Richard.


