Bug 27881 - [4.1 Regression] Memory exhausted with -finline-functions on testsuite file alias3.C
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.1.1
Importance: P2 normal
Target Milestone: 4.2.0
Assignee: Not yet assigned to anyone
URL:
Keywords: ice-on-valid-code
Depends on:
Blocks:
 
Reported: 2006-06-03 02:36 UTC by Flash Sheridan
Modified: 2008-07-04 15:33 UTC

See Also:
Host:
Target:
Build:
Known to work: 4.2.0 4.0.3
Known to fail: 4.1.0 4.1.1 4.1.3
Last reconfirmed: 2006-06-03 20:32:49


Attachments
Preprocessed Delta-reduced source file (185 bytes, text/plain)
2006-06-03 02:37 UTC, Flash Sheridan

Description Flash Sheridan 2006-06-03 02:36:18 UTC
Compiling the GCC testsuite file alias3.C with -O1 -finline-functions exhausts memory on checking=all or =yes builds of GCC 4.1.1, on Ubuntu 5.04 with virtual memory limited to 500M.  Roughly similar behavior on Mac OSX 10.4.6 with a checking=all build of GCC 4.1.0, though ulimit is broken on the Mac, so I had to kill the process manually.  A checking=all build of GCC 4.0.2 compiles without error.
    The symptoms are similar to my PR26774, but that's fixed in 4.1.1, and didn't require optimization.
    Here's the Delta-reduced file:

namespace A{
  struct X{};
  void f(X&);
  namespace a_very_long_namespace_name{
  }
}
namespace B = A;
void B::f(A::X& x)
{
  B::f(x);
  f(x);
}


============
Here's the session; I'll attach the preprocessed file.

131> /opt/gcc411-chk-all/bin/gcc -v -save-temps -c -O1 -finline-functions   ../cpp/bugfiles/GCC_bugfiles/noerror/131415_alias3_min.cpp
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /third-party_source/gcc-4.1.1/configure --enable-checking=all --prefix=/opt/gcc411-chk-all/ --enable-languages=c,c++ --with-comment=PalmSource checking=all build by Flash Sheridan 5/31/06 on Kondal
Thread model: posix
gcc version 4.1.1
 /home/opt/gcc411-chk-all/bin/../libexec/gcc/i686-pc-linux-gnu/4.1.1/cc1plus -E -quiet -v -iprefix /home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/ -D_GNU_SOURCE ../cpp/bugfiles/GCC_bugfiles/noerror/131415_alias3_min.cpp -mtune=pentiumpro -finline-functions -O1 -fpch-preprocess -o 131415_alias3_min.ii
ignoring nonexistent directory "/home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../i686-pc-linux-gnu/include"
ignoring duplicate directory "/opt/gcc411-chk-all//lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1"
ignoring duplicate directory "/opt/gcc411-chk-all//lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1/i686-pc-linux-gnu"
ignoring duplicate directory "/opt/gcc411-chk-all//lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1/backward"
ignoring duplicate directory "/opt/gcc411-chk-all//lib/gcc/i686-pc-linux-gnu/4.1.1/include"
ignoring nonexistent directory "/opt/gcc411-chk-all//lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1
 /home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1/i686-pc-linux-gnu
 /home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/../../../../include/c++/4.1.1/backward
 /home/opt/gcc411-chk-all/bin/../lib/gcc/i686-pc-linux-gnu/4.1.1/include
 /usr/local/include
 /opt/gcc411-chk-all//include
 /usr/include
End of search list.
 /home/opt/gcc411-chk-all/bin/../libexec/gcc/i686-pc-linux-gnu/4.1.1/cc1plus -fpreprocessed 131415_alias3_min.ii -quiet -dumpbase 131415_alias3_min.cpp -mtune=pentiumpro -auxbase 131415_alias3_min -O1 -version -finline-functions -o 131415_alias3_min.s
GNU C++ version 4.1.1 (i686-pc-linux-gnu)
        compiled by GNU C version 3.3.5 (Debian 1:3.3.5-8ubuntu2).
GGC heuristics: --param ggc-min-expand=0 --param ggc-min-heapsize=0
Compiler executable checksum: 0017ab69adc149664c23a10e9b6eba83
virtual memory exhausted: Cannot allocate memory


---
http://pobox.com/~flash
Quality Lead for Compilers and Debuggers
PalmSource, Inc. Tools Quality Assurance
PalmSource bug 131415
Comment 1 Flash Sheridan 2006-06-03 02:37:23 UTC
Created attachment 11583
Preprocessed Delta-reduced source file
Comment 2 Richard Biener 2006-06-03 20:32:49 UTC
Confirmed.  Also fails for release checking.
Comment 3 Richard Biener 2006-06-07 13:43:54 UTC
Recursive inlining causes memory usage to grow exponentially.  The current default limit is

DEFPARAM (PARAM_MAX_INLINE_RECURSIVE_DEPTH_AUTO,
          "max-inline-recursive-depth-auto",
          "The maximum depth of recursive inlining for non-inline functions",
          8, 0, 0)

and for illustration, here is how memory usage grows as this parameter changes:

depth  memory
1      3.2 MiB
2      3.2 MiB
3      5.5 MiB
4      killed after using 495 MiB

We simply create a lot of temporaries, calls, and basic blocks, which some
part of GCC does not handle well (t24.fixupcfg dump):

;; Function void A::f(A::X&) (_ZN1A1fERNS_1XE)

Removing basic block 27
Removing basic block 26
Removing basic block 20
Removing basic block 13
Removing basic block 12
Removing basic block 6
Merging blocks 0 and 1
Merging blocks 0 and 2
Merging blocks 0 and 3
Merging blocks 0 and 4
Merging blocks 0 and 5
Merging blocks 0 and 7
Merging blocks 0 and 8
Merging blocks 0 and 9
Merging blocks 0 and 10
Merging blocks 0 and 11
Merging blocks 0 and 14
Merging blocks 0 and 15
Merging blocks 0 and 16
Merging blocks 0 and 17
Merging blocks 0 and 18
Merging blocks 0 and 19
Merging blocks 0 and 21
Merging blocks 0 and 22
Merging blocks 0 and 23
Merging blocks 0 and 24
Merging blocks 0 and 25
Merging blocks 0 and 28
void A::f(A::X&) (x)
{
  struct X & x;
<repeat that line 16382(!!) times>

<bb 0>:
  x = x;
  x = x;
  x = x;
  f (x);
  f (x);
  x = x;
  f (x);
  f (x);
  x = x;
  x = x;
  f (x);
  f (x);
  x = x;
  f (x);
  f (x);
  x = x;
  x = x;
  x = x;
  f (x);
  f (x);
  x = x;
  f (x);
  f (x);
  x = x;
  x = x;
  f (x);
  f (x);
  x = x;
  f (x);
  f (x);
  return;

}
Comment 4 Richard Biener 2006-06-07 13:49:27 UTC
That was with recursion depth 3.  With depth 2 we have 63 temporaries, with depth 1 there are 3.  Now guess what would be the number for a depth of 4.
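
The statement sequence in the depth-3 dump from comment 3 can be reproduced with a small model. This is only an illustrative sketch, not GCC code: it assumes each inlined call contributes one parameter copy (`x = x;`) followed by the two calls from the callee's body, and that inlining stops after a given depth. The names `expand_call`, `expand_body`, `emit_call`, `emit_body`, and `BodyStats` are hypothetical.

```cpp
#include <string>

// Hypothetical toy model of depth-limited recursive inlining for
//   void f(X& x) { f(x); f(x); }
// It is NOT how GCC implements inlining; it only mirrors the dump's shape.

struct BodyStats {
    unsigned long copies;  // number of "x = x;" parameter copies
    unsigned long calls;   // number of remaining "f (x);" calls
};

// Expand one call site: at depth 0 the call is left alone, otherwise it is
// replaced by a parameter copy plus the two calls of the inlined body.
constexpr BodyStats expand_call(unsigned depth) {
    if (depth == 0) return {0, 1};
    BodyStats inner = expand_call(depth - 1);
    return {1 + 2 * inner.copies, 2 * inner.calls};
}

// The function body itself contains two call sites.
constexpr BodyStats expand_body(unsigned depth) {
    BodyStats one = expand_call(depth);
    return {2 * one.copies, 2 * one.calls};
}

// Same expansion, but emitting the statements in dump order.
void emit_call(unsigned depth, std::string& out) {
    if (depth == 0) {
        out += "  f (x);\n";
        return;
    }
    out += "  x = x;\n";
    emit_call(depth - 1, out);
    emit_call(depth - 1, out);
}

std::string emit_body(unsigned depth) {
    std::string out;
    emit_call(depth, out);
    emit_call(depth, out);
    return out;
}
```

Under these assumptions, `expand_body(3)` gives 14 copies and 16 calls, which matches the number of `x = x;` and `f (x);` lines in the `<bb 0>` block of the dump. The model says nothing about the declared temporaries, which grow much faster.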

Note this problem is fixed in 4.2.  Does anyone remember which patch could have done that?
Comment 5 Richard Biener 2006-06-07 14:01:15 UTC
I'm checking if it was fixed by

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=109379
Log:
2006-01-05  Richard Guenther  <rguenther@suse.de>
	    Diego Novillo  <dnovillo@redhat.com>

	* tree-pass.h (TODO_remove_unused_locals): Define.
	* gimple-low.c (expand_var_p, remove_useless_vars,
	pass_remove_useless_vars): Remove.  Update all users.
	* tree-ssa-live.c (mark_all_vars_used_1): Handle SSA names.
	(remove_unused_locals): New function.
	* tree-flow.h (remove_unused_locals): Declare.
	* passes.c (execute_todo): Call remove_unused_locals if
	TODO_remove_unused_locals is set.
	* tree-into-ssa.c (pass_build_ssa): Add TODO_remove_unused_locals.
	* tree-ssa-dce.c (pass_dce): Likewise.
	* tree-outof-ssa.c (pass_del_ssa): Likewise.
Comment 6 Richard Biener 2006-06-07 14:23:38 UTC
No, it wasn't.  Janis, can you hunt this?
Comment 7 Janis Johnson 2006-06-07 22:44:19 UTC
A regression hunt on powerpc-linux using the testcase in the description with "ulimit -v 500000" identified this patch as the start of the failures:

    http://gcc.gnu.org/viewcvs?view=rev&rev=102521

    r102521 | hubicka | 2005-07-28 21:45:27 +0000 (Thu, 28 Jul 2005)

The test still fails on mainline with r109576 and passes with r109587, so something on 2006-01-11 in between those revisions fixes it.  Builds are broken in that range, but I'm adapting my reghunt setup to handle that so I might come up with an answer.
Comment 8 Janis Johnson 2006-06-07 23:19:06 UTC
The failures stop on mainline with this patch:

    http://gcc.gnu.org/viewcvs?view=rev&rev=109580

    r109580 | hubicka | 2006-01-11 13:13:37 +0000 (Wed, 11 Jan 2006)
Comment 9 Mark Mitchell 2006-11-14 17:25:42 UTC
The following simpler test case is sufficient to show the same behavior:

struct X{};
void f(X& x)
{
  f(x);
  f(x);
}

Also, it is indeed true that --param max-inline-recursive-depth-auto=3 makes this compile instantaneously, but a value of 4 makes it run for a long time.  I understand exponentials, but 2^4 isn't that big a number, so I wonder if we're hitting something else super-linear in here -- perhaps something that is still present in later releases as well?

I am going to downgrade this to P2, as normally -finline-functions is only used with -O3, and as the --param option provides a work-around.
Comment 10 Richard Biener 2006-11-14 17:42:01 UTC
It's true that the number of created calls is 2^N, but unfortunately the number
of created temporaries grows super-exponentially:

 --param max-inline-recursive-depth-auto      grep 'struct X' t.C.t24.fixupcfg  | wc -l
   1       3
   2      63
   3   16383 (!)

So it grows like n_i = (2*(n_{i-1}+1))**2 - 1 with n_1 = 3.
For 4 we would have 1073741823 temporaries, and for 5 we would get
4611686018427387903 ;)
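
The recurrence can be checked mechanically. The sketch below simply iterates the formula as written; the function name `temporaries` is made up for illustration. Substituting m_i = n_i + 1 gives m_i = 4 * m_{i-1}**2, so the closed form is n_i = 2**(2**(i+1) - 2) - 1, which is why a 64-bit integer only holds the values up to depth 5.

```cpp
#include <cstdint>

// Iterates n_1 = 3, n_i = (2 * (n_{i-1} + 1))^2 - 1.
// Only meaningful for depth 1..5: n_6 = 2^126 - 1 does not fit in 64 bits.
constexpr std::uint64_t temporaries(unsigned depth) {
    std::uint64_t n = 3;
    for (unsigned i = 2; i <= depth; ++i) {
        std::uint64_t t = 2 * (n + 1);
        n = t * t - 1;
    }
    return n;
}

// Compile-time check against the measured counts for depths 1 to 3.
static_assert(temporaries(1) == 3, "depth 1");
static_assert(temporaries(2) == 63, "depth 2");
static_assert(temporaries(3) == 16383, "depth 3");
```

For depths 4 and 5 the function yields 1073741823 (2**30 - 1) and 4611686018427387903 (2**62 - 1).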

Honza's patch (comment #8) fixes this on the mainline, but I guess porting that
back is not really an option.  We might instead lower the default value of
max-inline-recursive-depth[-auto], which is currently 8.

From the above numbers a limit of 2 should be appropriate.  Or we can make
it count the number of functions inlined, not the depth, to avoid exponential
behavior with multiple calls to self.
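
To illustrate the difference between the two kinds of limit, here is a toy model, not GCC code and not a proposed patch. It assumes a function whose body contains `calls_per_body` calls to itself and counts how many call sites get inlined under a depth limit versus under a limit on the total number of inlined calls. Both function names are hypothetical.

```cpp
#include <cstdint>

// Depth limit: every call site is expanded until `depth` levels are reached,
// so the number of inlined calls is a geometric sum.
constexpr std::uint64_t inlined_with_depth_limit(unsigned depth,
                                                 unsigned calls_per_body) {
    std::uint64_t total = 0;
    std::uint64_t level = calls_per_body;  // call sites at the current level
    for (unsigned d = 0; d < depth; ++d) {
        total += level;           // all call sites at this level are inlined
        level *= calls_per_body;  // each inlined body exposes new call sites
    }
    return total;
}

// Count limit: stop as soon as `budget` call sites have been inlined,
// regardless of how deep the expansion has gone.
constexpr std::uint64_t inlined_with_count_limit(std::uint64_t budget,
                                                 unsigned calls_per_body) {
    std::uint64_t inlined = 0;
    std::uint64_t pending = calls_per_body;  // call sites not yet expanded
    while (pending > 0 && inlined < budget) {
        --pending;
        ++inlined;
        pending += calls_per_body;  // the inlined body adds its own calls
    }
    return inlined;
}
```

With two self-calls per body, a depth limit of 3 inlines 14 calls and the default of 8 would inline 510, while a count limit of 8 inlines exactly 8. In this model the count limit keeps the expansion linear in the limit however many self-calls the body contains.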
Comment 11 Joseph S. Myers 2008-07-04 15:33:47 UTC
Closing 4.1 branch.