Compiling the GCC testsuite file alias3.C with -O1 -finline-functions exhausts memory on checking=all or checking=yes builds of GCC 4.1.1, on Ubuntu 5.04 with virtual memory limited to 500M. I see roughly similar behavior on Mac OS X 10.4.6 with a checking=all build of GCC 4.1.0, though ulimit is broken on the Mac, so I had to kill the process manually. A checking=all build of GCC 4.0.2 compiles without error. The symptoms are similar to my PR26774, but that one is fixed in 4.1.1 and didn't require optimization.

Here's the Delta-reduced file:

namespace A{
  struct X{};
  void f(X&);
  namespace a_very_long_namespace_name{ }
}
namespace B = A;
void B::f(A::X& x)
{
  B::f(x);
  f(x);
}

============

Here's the session; I'll attach the preprocessed file.

131> /opt/gcc411-chk-all/bin/gcc -v -save-temps -c -O1 -finline-functions ../cpp/bugfiles/GCC_bugfiles/noerror/131415_alias3_min.cpp
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /third-party_source/gcc-4.1.1/configure --enable-checking=all --prefix=/opt/gcc411-chk-all/ --enable-languages=c,c++ --with-comment=PalmSource checking=all build by Flash Sheridan 5/31/06 on Kondal
Thread model: posix
gcc version 4.1.1
 /home/opt/gcc411-chk-all/bin/../libexec/gcc/i686-pc-linux-gnu/4.1.1/cc1plus -E -quiet -v -iprefix /home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/ -D_GNU_SOURCE ../cpp/bugfiles/GCC_bugfiles/noerror/131415_alias3_min.cpp -mtune=pentiumpro -finline-functions -O1 -fpch-preprocess -o 131415_alias3_min.ii
ignoring nonexistent directory "/home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../i686-pc-linux-gnu/include"
ignoring duplicate directory "/opt/gcc411-chk-all//lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1"
ignoring duplicate directory "/opt/gcc411-chk-all//lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1/i686-pc-linux-gnu"
ignoring duplicate directory "/opt/gcc411-chk-all//lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1/backward"
ignoring duplicate directory "/opt/gcc411-chk-all//lib/gcc/i686-pc-linux-gnu/4.1.1/include"
ignoring nonexistent directory "/opt/gcc411-chk-all//lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1
 /home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1/i686-pc-linux-gnu
 /home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1/backward
 /home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/include
 /usr/local/include
 /opt/gcc411-chk-all//include
 /usr/include
End of search list.
 /home/opt/gcc411-chk-all/bin/../libexec/gcc/i686-pc-linux-gnu/4.1.1/cc1plus -fpreprocessed 131415_alias3_min.ii -quiet -dumpbase 131415_alias3_min.cpp -mtune=pentiumpro -auxbase 131415_alias3_min -O1 -version -finline-functions -o 131415_alias3_min.s
GNU C++ version 4.1.1 (i686-pc-linux-gnu)
 compiled by GNU C version 3.3.5 (Debian 1:3.3.5-8ubuntu2).
GGC heuristics: --param ggc-min-expand=0 --param ggc-min-heapsize=0
Compiler executable checksum: 0017ab69adc149664c23a10e9b6eba83

virtual memory exhausted: Cannot allocate memory

---
http://pobox.com/~flash
Quality Lead for Compilers and Debuggers
PalmSource, Inc. Tools Quality Assurance
PalmSource bug 131415
Created attachment 11583: Preprocessed Delta-reduced source file
Confirmed. Also fails for release checking.
Recursive inlining causes memory usage to grow exponentially. The current default limit is

DEFPARAM (PARAM_MAX_INLINE_RECURSIVE_DEPTH_AUTO,
          "max-inline-recursive-depth-auto",
          "The maximum depth of recursive inlining for non-inline functions",
          8, 0, 0)

and for illustration, here is the memory growth when changing this parameter:

max-inline-recursive-depth-auto    memory usage
1                                  3.2 MiB
2                                  3.2 MiB
3                                  5.5 MiB
4                                  killed after using 495 MiB

We simply create a lot of temporaries, calls, and basic blocks, which some part of GCC does not like very much (t24.fixupcfg dump):

;; Function void A::f(A::X&) (_ZN1A1fERNS_1XE)

Removing basic block 27
Removing basic block 26
Removing basic block 20
Removing basic block 13
Removing basic block 12
Removing basic block 6
Merging blocks 0 and 1
Merging blocks 0 and 2
Merging blocks 0 and 3
Merging blocks 0 and 4
Merging blocks 0 and 5
Merging blocks 0 and 7
Merging blocks 0 and 8
Merging blocks 0 and 9
Merging blocks 0 and 10
Merging blocks 0 and 11
Merging blocks 0 and 14
Merging blocks 0 and 15
Merging blocks 0 and 16
Merging blocks 0 and 17
Merging blocks 0 and 18
Merging blocks 0 and 19
Merging blocks 0 and 21
Merging blocks 0 and 22
Merging blocks 0 and 23
Merging blocks 0 and 24
Merging blocks 0 and 25
Merging blocks 0 and 28
void A::f(A::X&) (x)
{
  struct X & x;
  <repeat that line 16382(!!) times>

<bb 0>:
  x = x; x = x; x = x; f (x); f (x);
  x = x; f (x); f (x);
  x = x; x = x; f (x); f (x);
  x = x; f (x); f (x);
  x = x; x = x; x = x; f (x); f (x);
  x = x; f (x); f (x);
  x = x; x = x; f (x); f (x);
  x = x; f (x); f (x);
  return;

}
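To make the shape of that dump easier to check, here is a minimal toy model of the growth. It is only a sketch, not GCC code: the names `InlineCounts` and `model_recursive_inlining` are made up for illustration, and it assumes that each `x = x;` in the dump is the argument copy of one inlined body, which is my reading of the dump rather than something it states.

```cpp
#include <cstdint>

// Toy model, NOT GCC code: a function whose body contains `self_calls`
// direct calls to itself is recursively inlined `depth` levels deep.
struct InlineCounts {
    std::uint64_t inlined_copies;   // bodies pasted in (assumed: one `x = x;` each)
    std::uint64_t remaining_calls;  // real calls to f left after inlining
};

InlineCounts model_recursive_inlining(unsigned self_calls, unsigned depth) {
    InlineCounts c{0, self_calls};  // depth 0: the original body, untouched
    for (unsigned level = 0; level < depth; ++level) {
        // Every call still present is replaced by a copy of the body,
        // and each copy brings `self_calls` new calls with it.
        c.inlined_copies += c.remaining_calls;
        c.remaining_calls *= self_calls;
    }
    return c;
}
```

With two self-calls and depth 3 the model gives 14 inlined copies and 16 remaining calls, which matches the 14 `x = x;` statements and 16 `f (x);` calls in the dump. At the default depth of 8 it gives 510 copies and 512 calls, so the calls alone do not explain the memory use; the temporaries do.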
That was with recursion depth 3. With depth 2 we have 63 temporaries; with depth 1 there are 3. Now guess what the number would be for a depth of 4. Note that this problem is fixed in 4.2. Does anyone remember which patch could have done that?
I'm checking if it was fixed by

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=109379
Log:
2006-01-05  Richard Guenther  <rguenther@suse.de>
            Diego Novillo  <dnovillo@redhat.com>

        * tree-pass.h (TODO_remove_unused_locals): Define.
        * gimple-low.c (expand_var_p, remove_useless_vars,
        pass_remove_useless_vars): Remove.  Update all users.
        * tree-ssa-live.c (mark_all_vars_used_1): Handle SSA names.
        (remove_unused_locals): New function.
        * tree-flow.h (remove_unused_locals): Declare.
        * passes.c (execute_todo): Call remove_unused_locals
        if TODO_remove_unused_locals is set.
        * tree-into-ssa.c (pass_build_ssa): Add TODO_remove_unused_locals.
        * tree-ssa-dce.c (pass_dce): Likewise.
        * tree-outof-ssa.c (pass_del_ssa): Likewise.
No, it wasn't. Janis, can you hunt this?
A regression hunt on powerpc-linux using the testcase in the description with "ulimit -v 500000" identified this patch as the start of the failures: http://gcc.gnu.org/viewcvs?view=rev&rev=102521 r102521 | hubicka | 2005-07-28 21:45:27 +0000 (Thu, 28 Jul 2005) The test still fails on mainline with r109576 and passes with r109587, so something on 2006-01-11 in between those revisions fixes it. Builds are broken in that range, but I'm adapting my reghunt setup to handle that so I might come up with an answer.
The failures stop on mainline with this patch: http://gcc.gnu.org/viewcvs?view=rev&rev=109580 r109580 | hubicka | 2006-01-11 13:13:37 +0000 (Wed, 11 Jan 2006)
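For readers unfamiliar with regression hunts, the narrowing from "fails with r109576, passes with r109587" to a single revision can be sketched as a bisection. This is an assumption about the general technique, not a description of the actual reghunt scripts; `first_passing_revision` and the `passes` callback are hypothetical names, and the sketch assumes the result flips exactly once inside the range.

```cpp
#include <functional>

// Hypothetical helper, NOT part of any GCC reghunt script.
// Precondition: passes(failing) is false, passes(passing) is true, and the
// test result changes exactly once between the two revisions.
long first_passing_revision(long failing, long passing,
                            const std::function<bool(long)>& passes) {
    while (passing - failing > 1) {
        const long mid = failing + (passing - failing) / 2;
        if (passes(mid)) {
            passing = mid;   // the fix is at mid or earlier
        } else {
            failing = mid;   // the fix is after mid
        }
    }
    return passing;          // first revision at which the test passes
}
```

Here `passes` would build the compiler at the given revision and run the test case under the memory limit. The sketch does not handle revisions that fail to build, which is the complication mentioned above for this range; a real hunt has to skip or patch those.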
The following simpler test case is sufficient to show the same behavior:

struct X{};
void f(X& x) { f(x); f(x); }

Also, it is indeed true that --param max-inline-recursive-depth-auto=3 makes this compile instantaneously, but a value of 4 makes it run for a long time. I understand exponentials, but 2^4 isn't that big a number, so I wonder if we're hitting something else super-linear in here, perhaps something that is still present in later releases as well? I am going to downgrade this to P2, as normally -finline-functions is only used with -O3, and as the --param option provides a workaround.
It's true that the number of created calls is 2^N, but unfortunately the number of created temporaries grows super-exponentially:

--param max-inline-recursive-depth-auto    grep 'struct X' t.C.t24.fixupcfg | wc -l
1                                          3
2                                          63
3                                          16383 (!)

So it grows like n_i = (2*(n_{i-1}+1))**2 - 1 with n_1 = 3. For 4 we would have 1073741823 temporaries, and for 5 we would get 4611686018427387903 ;) Honza's patch (comment #8) fixes this on the mainline, but I guess porting that back is not really an option. We might instead lower the default value of max-inline-recursive-depth[-auto], which is currently 8. From the above numbers a limit of 2 should be appropriate. Or we can make it count the number of functions inlined, not the depth, to avoid exponential behavior with multiple calls to self.
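The recurrence is easy to evaluate mechanically. Here is a small sketch that does so; the function name `temporaries_at_depth` is made up for illustration, and the recurrence itself is only the empirical fit to the three measured values, not something derived from the inliner's code.

```cpp
#include <cstdint>

// Evaluates n_1 = 3, n_i = (2 * (n_{i-1} + 1))^2 - 1.
// Only meaningful for 1 <= depth <= 5: the value for depth 6 would be
// 2^126 - 1, which does not fit in 64 bits.
std::uint64_t temporaries_at_depth(unsigned depth) {
    std::uint64_t n = 3;                      // measured value for depth 1
    for (unsigned i = 2; i <= depth; ++i) {
        const std::uint64_t t = 2 * (n + 1);  // 2 * (n_{i-1} + 1)
        n = t * t - 1;                        // square it, subtract one
    }
    return n;
}
```

This reproduces the measured 3, 63 and 16383 for depths 1 to 3. Each value has the form 2^k - 1: depth 4 gives 2^30 - 1 = 1073741823, and depth 5 gives 2^62 - 1 = 4611686018427387903. That is why depth 4 already exhausts a 500M limit and the default of 8 is far out of reach.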
Closing 4.1 branch.