Bug 14179 - [8/9/10 Regression] out of memory while parsing array with many initializers
Summary: [8/9/10 Regression] out of memory while parsing array with many initializers
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: c++
Version: 3.3.3
Importance: P4 minor
Target Milestone: 8.4
Assignee: Not yet assigned to anyone
URL:
Keywords: memory-hog
Duplicates: 36516 44066
Depends on: 12245 17596
Blocks: 47344
 
Reported: 2004-02-17 17:10 UTC by Debora Estey
Modified: 2019-11-14 07:55 UTC
14 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail: 4.0.4
Last reconfirmed: 2004-02-17 18:00:05


Attachments
file created with the -save-temps (33 bytes, text/plain)
2004-02-17 17:15 UTC, Debora Estey
file created with -save-temps (128.27 KB, application/octet-stream)
2004-02-17 17:25 UTC, Debora Estey
patch for precedence parsing (5.70 KB, patch)
2004-09-22 13:43 UTC, Paolo Bonzini
Preliminar patch (557 bytes, patch)
2005-05-04 00:12 UTC, Giovanni Bajo
Testcase with just the character array (76.08 KB, application/x-bzip)
2012-01-13 21:39 UTC, Jason Merrill

Description Debora Estey 2004-02-17 17:10:45 UTC
We are trying to move from gcc 2.95 to 3.3.1. We are using Cygwin, and the code
that will not compile is in a library generated by another group; we must
use it as is. The other group also uses the library, but with a Green Hills
compiler. The code compiles with gcc 2.95, but gcc 3.3.1 gives us this error:

cc1plus: out of memory allocating 65536 bytes after a total of 402620416 bytes

The gcc -v produces this information : 
Reading specs from /bin/../lib/gcc-lib/i686-pc-cygwin/3.3.1/specs
Configured with: /netrel/src/gcc-3.3.1-2/configure --enable-
languages=c,c++,f77,java --enable-libgcj --enable-threads=posix --with-system-
zlib --enable-nls --without-included-gettext --enable-interpreter --enable-sjlj-
exceptions --disable-version-specific-runtime-libs --enable-shared --build=i686-
pc-linux --host=i686-pc-cygwin --target=i686-pc-cygwin --prefix=/usr --exec-
prefix=/usr --sysconfdir=/etc --libdir=/usr/lib --
includedir=/nonexistent/include --libexecdir=/usr/sbin
Thread model: posix
gcc version 3.3.1 (cygming special)
Comment 1 Debora Estey 2004-02-17 17:15:31 UTC
Created attachment 5760 [details]
file created with the -save-temps

the command line that was run to produce this error was: 
g++ -I/usr/include/GL -c -I/RhapsodyCustomizations/Share/LangCpp
-fno-schedule-insns -fno-schedule-insns2
-/RapsodyCustomizations/Share/LangCpp/osconfig/Cygwin
-I/RhapsodyCustomizations/Share -I/RhapsodyCustomizations/Share/LangCpp/oxf
-I/uaenfs_ASE/base/uae_dev/b2v6_1_0/uae//models/src/cds/include/
-I/uaenfs_ASE/base/uae_dev/b2v6_1_0/uae//models/src/cds/include/mfdSym/
-I/uaenfs_ASE/base/uae_dev/b2v6_1_0/uae//include
-I/uaenfs_ASE/base/uae_dev/b2v6_1_0/universal/include
-I/uaenfs_ASE/base/uae_dev/b2v6_1_0/universal/include/i386-pc-cygwin32
-I/uaenfs_ASE/base/uae_dev/b2v6_1_0universal/include/i386-pc-cygwin32
-I/uaenfs_ASE/base/uae_dev/b2v6_1_0//universal/lib/src/libsdn/include
-I/usr/local/mysql/include/mysql -I/Exceed/xdk/motif21/include
-I/Exceed/xdk/include -I/usr/include/GL -I/Exceed/xdk/motif21/include
-DARCH_X86 -DOS_CYGWIN -DOS_VERSION=1322 -DWIN32 -DNT -DMOTIFAPP
-DBO_LITTLE_ENDIAN -DOM_USE_STL -DDBTYPE_MYSQL -DUSE_IOSTREAM -DARCH_X86
-DCYGWIN -DOS_VERSION=1322 -DWIN32 -DNT -DXMSTATIC -c ../SymSymbolGwb.cpp -o
SymSymbolGwb.o
Comment 2 Debora Estey 2004-02-17 17:25:46 UTC
Created attachment 5763 [details]
file created with -save-temps

file was too large, so it was compressed using bzip2
Comment 3 Giovanni Bajo 2004-02-17 17:52:51 UTC
Debora,

Your problem is that G++ needs more RAM than you have to compile that file.

Lately there has been a lot of work on making the C++ compiler consume less
memory and run faster, but this work cannot be backported to the 3.3 stable
release series. Currently G++ 3.4.0 compiles faster than 2.95 on most tests,
and consumes around the same amount of memory.

I would suggest trying again after upgrading your compiler to 3.4.0. You will
have to pull it from CVS and recompile it. Also, you may want to take a look at
http://gcc.gnu.org/gcc-3.4/changes.html, since the new compiler is much
stricter about the C++ standard, so old code might need to be modified.

If this is feasible for you, I will look forward to a new report on how 3.4.0
behaves with your code. Alas, it is probably too late to improve the 3.3
series that much. If you really need to use it, you could try adding more RAM
to your computer; that should allow the compilation to finish.
Comment 4 Andrew Pinski 2004-02-17 18:00:04 UTC
Confirmed. The problem is the large array: for some reason GCC now takes a huge
amount of memory for the array, mega_TextureSymbolData.
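
For reference, the failing pattern is simply a very large brace-enclosed array
initializer. A hypothetical miniature (the real attachment has millions of
entries and is not reproduced here; names below are illustrative, not from the
testcase):

```c
#include <stddef.h>

/* Hypothetical miniature of the reported pattern: a global array with
   one literal initializer per element.  The real mega_TextureSymbolData
   has millions of entries; each element used to cost the C++ front end
   several allocations. */
const int texture_data[] = {
    0, 1, 2, 3, 4, 5, 6, 7,
};
const size_t texture_data_len = sizeof texture_data / sizeof texture_data[0];

/* Walk every element, standing in for the compiler traversing the
   initializer list. */
long texture_sum(void) {
    long sum = 0;
    for (size_t i = 0; i < texture_data_len; i++)
        sum += texture_data[i];
    return sum;
}
```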
Comment 5 Andrew Pinski 2004-02-17 18:10:56 UTC
Related to bug 12245.
Comment 6 Mark Mitchell 2004-03-08 23:08:41 UTC
I've investigated this problem.

To some extent, this problem is inevitable.  In the old days, we used to output
assembly code for a global array element-by-element as we saw it.  That's not
what we want to do: we want to store up the array so that we can optimize loads
from fixed indices, etc.

However, we do waste a ton of memory.  We allocate a separate INTEGER_CST for
each occurrence of the same integer constant, even though there are only a few
of them.  We allocate an entirely new list of constants when we refactor the
brace-enclosed form (essentially adding braces that C++ says you can omit).  We
allocate an integer constant for every possible index into the array, which has
8 million entries.

We should be representing the initializer with a structure like this:

  struct init_group { 
    struct init_group *next;
    tree designator; /* The designator for the first element in the array.  */
    tree elts[];
  };

rather than a linked-list per element.  That would be far more efficient for
large arrays and a win even for small arrays.

None of this is going to get fixed until 3.5, however.
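
Mark's sketch can be filled out into compilable C. The `count` field and the
`new_group` allocator below are illustrative additions (with plain `int`
standing in for GCC's `tree`), not GCC code:

```c
#include <stdlib.h>

/* Grouped initializer storage: one allocation covers a whole run of
   elements, instead of one list node per element.  `int` stands in for
   `tree`; the `count` field is an illustrative addition. */
struct init_group {
    struct init_group *next;
    long designator;   /* index of the first element in this group */
    size_t count;      /* number of elements in elts[] */
    int elts[];        /* C99 flexible array member */
};

struct init_group *new_group(long designator, size_t count) {
    /* One malloc sized for the header plus the whole element run. */
    struct init_group *g = malloc(sizeof *g + count * sizeof g->elts[0]);
    if (!g)
        return NULL;
    g->next = NULL;
    g->designator = designator;
    g->count = count;
    return g;
}
```

The win is per-element overhead: a linked list pays a pointer (plus allocator
bookkeeping) per element, while the grouped form pays it once per run.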
Comment 7 Andrew Pinski 2004-06-07 03:25:52 UTC
I will take a look at this next week as part of the compile-time improvements
that are needed.
Comment 8 Giovanni Bajo 2004-09-15 14:26:55 UTC
Mark, one question: are you suggesting the special structure for the
constructor only before reshape_init, or also after it? We need to build
a CONSTRUCTOR sooner or later for the gimplifier.

Anyway, would such a change be acceptable for Stage 3, being a bugfix for
memory allocation (and fixing this regression)?
Comment 9 Mark Mitchell 2004-09-15 16:29:55 UTC
Subject: Re:  [3.3/3.4/4.0 Regression] out of memory

giovannibajo at libero dot it wrote:

>Mark, one question: are you suggesting the special structure for the 
>constructor only before reshape_init or also after it? Because we need to build 
>a CONSTRUCTOR sooner or later for the gimplifier.
>  
>
Both.  After reshape_init is even more critical because that memory 
cannot be collected.

>Anyway, would such a change acceptable for Stage 3, being a bugfix towards 
>better memory allocation (and fixing this regression)?
>
Perhaps, but it seems unlikely.  I've thought about it, but I'm scared 
of how much impact there would be through the compiler.  There are some 
smaller things that we could do more locally, though, like reusing the 
TREE_LISTs from before reshape_init after reshape_init.  Also, I suspect 
that Nathan's integer-sharing work has already reduced this problem 
somewhat; part of the problem was that we made multiple copies of every 
INTEGER_CST from zero up to the upper bound of the array.  Now, at least we 
should have only one.

Comment 10 Giovanni Bajo 2004-09-17 18:50:26 UTC
It looks like we do not destroy and recreate initializers in reshape_init; 
elements are moved from the old CONSTRUCTOR to the new one.

Instead, while investigating the code, I noticed this in reshape_init:

          /* Loop through the array elements, gathering initializers.  */
          for (index = size_zero_node;
               *initp && (!max_index || !tree_int_cst_lt (max_index, index));
               index = size_binop (PLUS_EXPR, index, size_one_node))
            {

We are constructing a *different* INTEGER_CST for each index, and we never use 
it. This generates a lot of garbage.

I do not know whether switching to HOST_WIDE_INT alone is enough: we may want 
to handle arrays larger than a HWI (e.g. when cross-compiling from 16-bit to 
32-bit). My solution for mainline is to use a HWI whenever possible, falling 
back to trees when the indices get too high. Mark, does this make sense?

Dunno if this will be acceptable for 3.3 and 3.4 too, but let's have this fixed 
in mainline, as a start.
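
The waste in the quoted loop is easy to model outside of GCC: the old code
heap-allocated a boxed index object per element, while the fix keeps a plain
machine integer. A rough, hypothetical analog (`box_int` stands in for
building an INTEGER_CST; none of this is GCC API):

```c
#include <stdlib.h>

long boxed_allocations;   /* counts boxed-index objects created */

/* Stand-in for building an INTEGER_CST: one heap object per call. */
long *box_int(long v) {
    long *t = malloc(sizeof *t);
    *t = v;
    boxed_allocations++;
    return t;
}

/* Old style: materialize a boxed index on every iteration, producing
   one piece of garbage per array element. */
long walk_boxed(long n) {
    long seen = 0;
    for (long i = 0; i < n; i++) {
        long *idx = box_int(i);
        seen += (*idx >= 0);   /* pretend to use the index */
        free(idx);             /* in GCC this would linger as GC garbage */
    }
    return seen;
}

/* Fixed style: a plain integer induction variable, no allocation. */
long walk_plain(long n) {
    long seen = 0;
    for (long i = 0; i < n; i++)
        seen++;
    return seen;
}
```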
Comment 11 Giovanni Bajo 2004-09-18 13:10:33 UTC
Two patches posted, waiting for review:
http://gcc.gnu.org/ml/gcc-patches/2004-09/msg01839.html
http://gcc.gnu.org/ml/gcc-patches/2004-09/msg01840.html
Comment 12 CVS Commits 2004-09-20 23:05:53 UTC
Subject: Bug 14179

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	giovannibajo@gcc.gnu.org	2004-09-20 23:05:43

Modified files:
	gcc/cp         : ChangeLog decl.c 

Log message:
	PR c++/14179
	* decl.c (reshape_init): Extract array handling into...
	(reshape_init_array): New function. Use integers instead of trees
	for indices. Handle out-of-range designated initializers.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cp/ChangeLog.diff?cvsroot=gcc&r1=1.4366&r2=1.4367
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cp/decl.c.diff?cvsroot=gcc&r1=1.1296&r2=1.1297

Comment 13 CVS Commits 2004-09-21 21:12:54 UTC
Subject: Bug 14179

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-3_3-branch
Changes by:	giovannibajo@gcc.gnu.org	2004-09-21 21:12:51

Modified files:
	gcc/cp         : ChangeLog decl.c 

Log message:
	PR c++/14179
	* decl.c (reshape_init): Extract array handling into...
	(reshape_init_array): New function. Use integers instead of trees
	for indices.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cp/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.3076.2.274&r2=1.3076.2.275
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cp/decl.c.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.965.2.84&r2=1.965.2.85

Comment 14 CVS Commits 2004-09-21 22:50:03 UTC
Subject: Bug 14179

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-3_4-branch
Changes by:	giovannibajo@gcc.gnu.org	2004-09-21 22:49:56

Modified files:
	gcc/cp         : ChangeLog decl.c 

Log message:
	PR c++/14179
	* decl.c (reshape_init): Extract array handling into...
	(reshape_init_array): New function. Use integers instead of trees
	for indices. Handle out-of-range designated initializers.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cp/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.3892.2.158&r2=1.3892.2.159
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cp/decl.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.1174.2.23&r2=1.1174.2.24

Comment 15 CVS Commits 2004-09-21 23:46:11 UTC
Subject: Bug 14179

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-3_4-branch
Changes by:	giovannibajo@gcc.gnu.org	2004-09-21 23:46:08

Modified files:
	gcc/cp         : ChangeLog parser.c 

Log message:
	PR c++/14179
	* parser.c (cp_parser_initializer): Speed up parsing of simple
	literals as initializers.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cp/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.3892.2.159&r2=1.3892.2.160
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cp/parser.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.157.2.40&r2=1.157.2.41

Comment 16 Giovanni Bajo 2004-09-22 00:36:30 UTC
OK, let me update the situation. I'm using a testcase with an array of 4 
million initializers (about 1/4th of the original testcase). These are the 
three patches I consider important for this PR so far:

(1) Nathan's INTEGER_CST sharing
(2) my patch to reshape_init_array using HOST_WIDE_INT
(3) my patch to speedup initializer list parsing

(1) and (2) affect memory use; (3) affects compilation time. This is 
the situation for our supported compilers:

GCC-3.3.5
---------
(1) not present
(2) applied
(3) not required (old parser)

Testcase: killed after 0m31s (420MB is the kill threshold).



GCC-3.4.3
---------
(1) not present
(2) applied
(3) applied

Testcase: killed after 0m23s (420MB is the kill threshold).


GCC-4.0.0
---------
(1) applied
(2) applied
(3) will be applied (or rewritten in a better form).

Testcase: 0m26s, 220MB
(this is with (3) applied, and checking disabled)


For comparison, GCC-2.95: 0m5s, 175MB.
Comment 17 Giovanni Bajo 2004-09-22 00:38:54 UTC
So, the situation is getting better, but we are not there yet. My next target 
is process_init_constructor, where we are being very dumb.
Comment 18 Mark Mitchell 2004-09-22 00:54:37 UTC
Subject: Re:  [3.3/3.4/4.0 Regression] out of memory

giovannibajo at libero dot it wrote:

>Testcase: 0m26s, 220MB
>(this is with (3) applied, and checking disabled)
>
>
>For comparison, GCC-2.95: 0m5s, 175MB.
>  
>
For the record, I don't expect we will get all the way back there.  I 
consider it a design feature that we are keeping the initializer around 
in the compiler: that allows us to (in theory) pull elements out of it 
if they are referenced.  With that capability comes a cost.

That is not to say that we should not keep going with your work of course!

FYI, Jeffrey Oldham is working on operator-precedence parsing; hopefully 
he'll have a patch soon.

Comment 19 Paolo Bonzini 2004-09-22 13:43:41 UTC
Created attachment 7196 [details]
patch for precedence parsing

Here is a patch for precedence parsing, being regtested currently.
Comment 20 Paolo Bonzini 2004-09-22 13:51:47 UTC
Actually, precedence parsing is not a big job; incrementally, what I did for my
patch was just a matter of:
1) making a single map of all the productions instead of the per-function maps
we have currently
2) making cp_parser_binary_expression use it, still keeping the recursive call
and passing around the precedence we are interested in
3) turn recursion into an explicit stack
4) optimize to eliminate unneeded (simulated) recursion
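
The four steps condense into the classic precedence-climbing shape. A minimal,
hypothetical sketch (a tiny expression evaluator, not the cp_parser code; all
names below are invented for illustration):

```c
#include <ctype.h>

const char *pc_cursor;   /* parse position (illustrative global) */

/* Step 1: one precedence table instead of per-function productions. */
int pc_prec(char op) {
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:  return 0;   /* not a binary operator */
    }
}

/* Parse an unsigned decimal literal (the only primary here). */
long pc_primary(void) {
    long v = 0;
    while (isdigit((unsigned char)*pc_cursor))
        v = v * 10 + (*pc_cursor++ - '0');
    return v;
}

/* Steps 2-3: a single binary-expression routine that carries the minimum
   precedence it will accept, instead of one routine per level. */
long pc_binary(int min_prec) {
    long lhs = pc_primary();
    while (pc_prec(*pc_cursor) >= min_prec && pc_prec(*pc_cursor) > 0) {
        char op = *pc_cursor++;
        long rhs = pc_binary(pc_prec(op) + 1);   /* higher prec binds tighter */
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs /= rhs; break;
        }
    }
    return lhs;
}

long pc_eval(const char *expr) {
    pc_cursor = expr;
    return pc_binary(1);
}
```

Step 4 (turning the remaining recursion into an explicit stack) is omitted
here for brevity.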

It would be nice if the wiki were used to track in-house CodeSourcery projects.
It looks like Jeffrey and I duplicated work, and we almost did the same on the
tree class codes project (I'm not mocking CodeSourcery's work, I'm just
reading the wiki and using common sense about where to optimize the compiler).

Paolo
Comment 21 Mark Mitchell 2004-09-22 15:33:27 UTC
Subject: Re:  [3.3/3.4/4.0 Regression] out of memory

bonzini at gnu dot org wrote:

>Actually precedence parsing is not a big job, incrementally what I did for my
>patch was just a matter of
>1) making a single map of all the productions instead of the per-function maps
>we have currently
>2) making cp_parser_binary_expression use it, still keeping the recursive call
>and passing around the precedence we are interested in
>3) turn recursion into an explicit stack
>4) optimize to eliminate unneeded (simulated) recursion
>
>It would be nice if the wiki were used to track in-house CodeSourcery projects.
> It looks like Jeffrey and I duplicated work, and we almost did the same on the
>tree class codes projects (and I'm not mocking CodeSourcery's work, I'm just
>reading the wiki and using common sense on where to optimize the compiler).
>  
>
I'm sorry that happened, but we did mention the operator-precedence 
parsing and tree-code conversion work publicly as we started on it.  
We'll try to be louder.  I am certainly no happier than you that we are 
duplicating each other's work!

Comment 22 Mark Mitchell 2004-09-22 15:39:05 UTC
Paolo --

This patch is great.  My only comment is that I would like the grammar entries
that used to be above the various functions (pm-expression: ..., etc.) to be
preserved above the rewritten binary_expression function.  Please check in!

Thanks,

-- Mark
Comment 23 paolo.bonzini@polimi.it 2004-09-22 15:43:48 UTC
Subject: Re:  [3.3/3.4/4.0 Regression] out of memory

> I'm sorry that happened, but we did mention the operator-precedence 
> parsing and tree-code conversion work publicly as we started on it.  

Though I only knew of tree-code conversion from private mail by Zack 
(and you said you were about to finish it when I said I'd do that on 
faster-compiler-branch); and the operator-precedence was placed on the 
wiki a week ago as a generic "speedup area", not a project that is 
actively worked on unlike Matt's lexer overhaul:

http://www.dberlin.org/gccwiki/index.php/Speedup%20areas

Since that page also mentions Nathan's "Add contains-repeated-base and 
is diamond-shaped flags to classes" project, maybe it is *that* page 
that needs to be louder.

 > We'll try to be louder.

No problem.  You're doing good work.

Paolo


Comment 24 Mark Mitchell 2004-09-22 15:54:10 UTC
Subject: Re:  [3.3/3.4/4.0 Regression] out of memory

paolo dot bonzini at polimi dot it wrote:

> and the operator-precedence was placed on the 
>wiki a week ago as a generic "speedup area", not a project that is 
>actively worked on unlike Matt's lexer overhaul:
>  
>
Honestly, we didn't know we'd be working on it until the early part of this 
week.  We didn't decide until Monday around noon in California.  But the 
fundamental point is that it's in nobody's best interest to duplicate work, 
so we will try to make as much noise as possible about projects.

>http://www.dberlin.org/gccwiki/index.php/Speedup%20areas
>
>Since that page also mentions Nathan's "Add contains-repeated-base and 
>is diamond-shaped flags to classes" project, maybe it is *that* page 
>that needs to be louder.
>  
>
Just for the record, I do not know if Nathan is presently working on 
that or not.  I think he may have concluded there is not enough win there.

> > We'll try to be louder.
>
>No problem.  You're doing a good work.
>  
>
Thanks for your understanding -- and you, likewise, are doing good work!

After you get timing numbers for 14179, it would be interesting to 
consider whether or not we should try to extend the operator-precedence 
parser, or do other short-circuiting tricks, when getting down to the 
bottom of binary expressions.  For example, if a unary expression is an 
integer literal or identifier followed by ")" or "," or ";" we know that 
it's just a primary expression.  In other words, we could use two tokens 
of lookahead to zip straight to cp_parser_primary_expression.  Would you 
like to take a look at that as well?

Comment 25 paolo.bonzini@polimi.it 2004-09-22 15:56:43 UTC
Subject: Re:  [3.3/3.4/4.0 Regression] out of memory

> In other words, we could use two tokens 
> of lookahead to zip straight to cp_parser_primary_expression.  Would you 
> like to take a look at that as well?

I had mailed a prototype of the patch to Giovanni as well, hoping that 
he got to that before me.  We'll see after the regtest has concluded, 
though.

Paolo

Comment 26 Giovanni Bajo 2004-09-23 00:01:50 UTC
Mark: process_init_constructor builds new TREE_LISTs for each new initializer. 
This is pretty easy to get rid of, at least for arrays, and will be taken care 
of by a patch I will be testing soon. The mainline version will likely extract 
array handling into a separate function and clean it up.

But process_init_constructor also calls digest_init for each and every 
initializer, which makes the initializer go through the conversion machinery. 
For the example in this PR, we build an IDENTITY_CONV and a couple of 
STANDARD_CONVs for each initializer.

For mainline, I could try to modify the conversion engine to not use trees to 
keep track of conversions. I think a specific struct kept in a local Vec would 
be good enough.

For 3.4/3.3, is there a way to avoid calling digest_init when we detect that we 
can just fold_convert (or similar) the initializer? Or maybe put such a 
fast-path check within digest_init directly? I am thinking of simple default 
promotions, for which building 3-4 trees and throwing them away doesn't look 
too smart. I am no expert in this kind of type conversion, so I can't devise a 
correct check for this without making it too specific to the case in this PR. 
Can you suggest something to get me started?
Comment 27 Mark Mitchell 2004-09-23 00:16:29 UTC
Subject: Re:  [3.3/3.4/4.0 Regression] out of memory

giovannibajo at libero dot it wrote:

>Mark: process_init_constructor builds new TREE_LISTs for each new initializer. 
>This is pretty easy to get rid of, at least for arrays, and will be taken of 
>with a patch I will be testing soon. The mainline version will likely extract 
>and cleanup array handling into a separate function.
>
>But process_init_constructor also calls digest_init for each and every 
>initializer, which makes the initializer goes through the conversion machinery. 
>For the example in this PR, we build an IDENTITY_CONV and a couple of 
>STANDARD_CONV for each initializer.
>  
>
On the mainline, this should be much cheaper because we do not build any 
trees for conversions.  We build "struct conversion" instead, and those 
are allocated on an obstack.  So, you should confirm that this is still 
a bottleneck on the mainline.

>For 3.4/3.3, is there a way to avoid calling digest_init if we detect that we 
>can just fold_convert (or similar) the initializer? Or maybe put such a speedup 
>check within digest_init directly? I am thinking of simple default promotions, 
>for which building 3-4 trees and throwing them away doesn't look too smart. I 
>am not expert in this kind of type conversions stuff, so I can't devise what a 
>correct check for this would be, without making it too specific for the case in 
>this PR. Can you suggest me something to get me started?
>  
>
I think it would be better to try to do this in a way that could be used 
on the mainline too.  If conversions are still a bottleneck,  then we 
could try to optimize.  The most common case is probably that the "from" 
and "to" types are the same.  So, you could try having 
implicit_conversion do "if same_type_p (to, from) && !class_type return 
identity conversion".  (Might even be better just to check pointer 
equality of "to" and "from", so as to avoid the cost of same_type_p if 
they are *not* the same.)  That would short-circuit a lot of the work, 
and might win for other test cases as well, because you save not only on 
digest_init, but with function calls like:

  void f(int);
  void g() { f(3); }
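
The suggested short-circuit is just an identity check ahead of the expensive
comparison. A hypothetical stand-alone analog (`type_desc` and
`deep_type_equal` are stand-ins invented here, not same_type_p itself):

```c
#include <string.h>

/* Stand-in for a type node; the name plays the role of the structural
   identity that same_type_p would compare. */
struct type_desc {
    const char *name;
};

/* Stand-in for the expensive structural walk done by same_type_p. */
int deep_type_equal(const struct type_desc *a, const struct type_desc *b) {
    return strcmp(a->name, b->name) == 0;
}

/* The suggested fast path: try pointer identity first, so the common
   identical-node case stays cheap, while distinct-but-equal type nodes
   still match via the deep comparison. */
int types_match(const struct type_desc *to, const struct type_desc *from) {
    if (to == from)
        return 1;
    return deep_type_equal(to, from);
}
```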

Comment 28 Giovanni Bajo 2004-09-23 01:00:12 UTC
(In reply to comment #27)

> On the mainline, this should be much cheaper because we do not build any 
> trees for conversions.  We build "struct conversion" instead, and those 
> are allocated on an obstack.  So, you should confirm that this is still 
> a bottleneck on the mainline.

Ah, right. I forgot that this cleanup was already done. I can confirm this 
is no longer a bottleneck on mainline.

BTW, preliminary testing of my patch to process_init_constructor is *very* 
promising: on mainline, compared to comment #16, we now save an additional 
100MB of RAM. We can compile the quarter-size testcase with 120MB of RAM 
(and GCC 2.95 uses 175MB)!



>>For 3.4/3.3, is there a way to avoid calling digest_init if we detect that we 
>>can just fold_convert (or similar) the initializer? 

> I think it would be better to try to do this in a way that could be used 
> on the mainline too.  If conversions are still a bottleneck,  then we 
> could try to optimize.  

It turned out I was wrong, and we don't need to do this on mainline.

> The most common case is probably that the "from" 
> and "to" types are the same.  So, you could try having 
> implicit_conversion do "if same_type_p (to, from) && !class_type return 
> identity conversion".  (Might even be better just to check pointer 
> equality of "to" and "from", so as to avoid the cost of same_type_p if 
> they are *not* the same.)  That would short-circuit a lot of the work, 
> and might win for other test cases as well, because you save not only on 
> digest_init, but with function calls like:
>   void f(int);
>   void g() { f(3); }

Yes, but the problem is that default promotions are also very common:

void f(char);
void g() { f(3); }

and this is what we need to short-circuit for the testcase to start saving 
memory. I tried something like:

  if (INTEGRAL_TYPE_P (to) && INTEGRAL_TYPE_P (from)
      && same_type_p (type_promotes_to (to), type_promotes_to (from)))
      return ocp_convert (to, expr, CONV_IMPLICIT, flags);

but I'm not sure about those type_promotes_to, plus it segfaults for some 
reason I'm investigating...

Comment 29 Mark Mitchell 2004-09-23 01:37:00 UTC
Subject: Re:  [3.3/3.4/4.0 Regression] out of memory

giovannibajo at libero dot it wrote:

>>The most common case is probably that the "from" 
>>and "to" types are the same.  So, you could try having 
>>implicit_conversion do "if same_type_p (to, from) && !class_type return 
>>identity conversion".  (Might even be better just to check pointer 
>>equality of "to" and "from", so as to avoid the cost of same_type_p if 
>>they are *not* the same.)  That would short-circuit a lot of the work, 
>>and might win for other test cases as well, because you save not only on 
>>digest_init, but with function calls like:
>>  void f(int);
>>  void g() { f(3); }
>>    
>>
>
>Yes, but the problem is that also default promotions are very common:
>
>void f(char);
>void g() { f(3); }
>
>and this is what we need to short-circuit for the testcase to start saving 
>memory. I tried something like:
>
>  if (INTEGRAL_TYPE_P (to) && INTEGRAL_TYPE_P (from)
>      && same_type_p (type_promotes_to (to), type_promotes_to (from)))
>      return ocp_convert (to, expr, CONV_IMPLICIT, flags);
>
>but I'm not sure about those type_promotes_to, plus it segfaults for some 
>reason I'm investigating...
>  
>
I don't know about the segfault, but I'd worry that you might not win 
much once the tests get that complex, at least for code other than this 
one test case.  Giant arrays with huge initializers are not the typical 
case, thankfully.

Comment 30 Paolo Bonzini 2004-09-24 16:05:47 UTC
Great news.  Thanks to the fix for PR 17596, we now outperform 3.3.4 by 25% on
a reduced testcase (an array of 240000 elements).  Every additional element
costs us a mere 44 instructions in cp_parser_binary_expression, according to
cachegrind.

I'm not closing this because Giovanni's patch to look ahead for a comma may
still make some difference, but I'm downgrading its priority.

Paolo
Comment 31 Mark Mitchell 2004-09-24 16:53:38 UTC
Subject: Re:  [3.3/3.4/4.0 Regression] out of memory

bonzini at gcc dot gnu dot org wrote:

>Great news.  Thanks to fixing PR17596, we now outperform 3.3.4 by 25% for a
>reduced testcase (with an array of 240000 elements).  Every additional element
>costs us a mere 44 instructions in cp_parser_binary_expression according to
>cachegrind.
>  
>
That is fabulous news!

>I'm not closing this because Giovanni's patch to lookahead for a comma may still
>make some difference, but I'm degrading its priority.
>
Please also remove any regression tags, and remove any release target 
markings.  (This is now an opportunity for improvement, not a regression.)

Comment 32 Paolo Bonzini 2004-09-24 18:53:29 UTC
Subject: Re:  [3.3/3.4/4.0 Regression] out of memory

 >>Every additional element costs us a mere 44 instructions
>>in cp_parser_binary_expression according to cachegrind.
> 
> That is fabulous news!

(Of course it does not count instructions elsewhere).

> Please also remove any regression tags, and remove any release target 
> markings.  (This is now an opportunity for improvement, not a regression.)

Done.

Comment 33 Giovanni Bajo 2004-09-25 00:50:28 UTC
No, sorry, this is wrong. This bug still shows a big memory regression. As I 
explained in comment #26 and comment #28, I am working on a patch to 
process_init_constructor to fix it, but we are not there yet.

(When I said "I confirm this is not a bottleneck on mainline anymore", I meant 
that the standard conversion code was not eating too much memory -- the 
testcase still cannot be compiled on my computer because of the useless 
TREE_LISTs we build in process_init_constructor.)
Comment 34 Debora Estey 2004-10-26 16:11:25 UTC
Now that this is an enhancement, is there any chance of getting it fixed?
Comment 35 Andrew Pinski 2004-10-26 16:14:32 UTC
Yes, because this is still a regression.  Note that mainline is already better 
than 3.3.3 in terms of memory usage.
Comment 36 Giovanni Bajo 2004-10-27 15:43:50 UTC
Debora, I'm working on a patch which should definitely fix this bug. I hope to 
be able to finish it before 4.0 gets out and/or 3.3 is definitely closed.
Comment 37 Steven Bosscher 2004-12-23 12:35:29 UTC
Giovanni, any news?
Comment 38 Giovanni Bajo 2004-12-24 02:23:53 UTC
Subject: Re:  [3.3/3.4/4.0 Regression] out of memory while parsing array with many initializers

steven at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote:

> Giovanni, any news?

I have a patch around for a long time already, but I cannot find the time to do
much GCC work right now (as you may have noticed). I would still like to tackle
this bug unless somebody is in a hurry, so I guess you'll have to wait a little
bit more.

Giovanni Bajo

Comment 39 Steven Bosscher 2004-12-24 07:07:54 UTC
Everyone is in a hurry; this is a regression ;-)
Can you attach the patch so someone can have a look and maybe
finish it for you?
Comment 40 Debora Estey 2005-05-03 22:44:32 UTC
any word on this?
Comment 41 Giovanni Bajo 2005-05-04 00:12:47 UTC
Created attachment 8814 [details]
Preliminar patch

Sorry about this.
This is a preliminary patch which I did months ago and never got around to
testing and posting. It greatly reduces memory use for the testcase in this
PR, and basically fixes the only remaining memory hog.
Comment 42 Andrew Pinski 2005-07-25 01:34:29 UTC
Using the testcase from PR 12245:
source location                                     Garbage            Freed             Leak         Overhead            Times
cp/parser.c:285 (cp_lexer_new_main)                       0: 0.0%  372302336:88.5%          0: 0.0%  104391168:79.7%          9
cp/parser.c:270 (cp_lexer_new_main)                       0: 0.0%     364288: 0.1%          0: 0.0%     102144: 0.1%          1
cp/parser.c:12407 (cp_parser_initializer_list)     22770552:24.8%   23955160: 5.7%          0: 0.0%   13171408:10.1%         19
cp/decl.c:4182 (reshape_init_array_1)                     0: 0.0%   23955160: 5.7%   22770552:24.6%   13171408:10.1%         19
ggc-common.c:193 (ggc_calloc)                      16770368:18.3%          0: 0.0%   16805272:18.1%        612: 0.0%         51
tree.c:828 (build_int_cst_wide)                         480: 0.0%          0: 0.0%   52180672:56.3%          0: 0.0%    1630661
convert.c:671 (convert_to_integer)                 52187584:56.8%          0: 0.0%          0: 0.0%          0: 0.0%    1630862


This is worse than the C front-end.
Comment 43 Ian Lance Taylor 2005-10-12 05:44:57 UTC
I see that Giovanni checked in a significant patch here:

2005-07-20  Giovanni Bajo  <giovannibajo@libero.it>

	Make CONSTRUCTOR use VEC to store initializers.

Is this PR still a significant regression from an earlier release?
Comment 44 Mark Mitchell 2005-10-30 22:13:32 UTC
We don't have clear evidence that this is worse, let alone substantially worse, than previous releases.  Until and unless we do, I've downgraded this to P4. However, if this is fixed, let's just mark it so, and move on.  In fact, if it's even in the ballpark, let's mark it fixed and move on.  It's usually not very useful to chase the last few percent on a compile-time/memory testcase, as there are other places where we know we can have bigger impact.
Comment 45 Paolo Bonzini 2006-07-28 12:20:05 UTC
Comment on attachment 8814 [details]
Preliminar patch

The patch is not relevant anymore after the commit that Ian pointed out.
Comment 46 Gabriel Dos Reis 2007-01-18 02:51:02 UTC
Won't fix for 4.0.x
Comment 47 Andrew Pinski 2008-06-13 04:57:32 UTC
*** Bug 36516 has been marked as a duplicate of this bug. ***
Comment 48 Joseph S. Myers 2008-07-04 16:30:04 UTC
Closing 4.1 branch.
Comment 49 Jan Hubicka 2009-02-22 13:23:06 UTC
As in PR c/12245, we build tons of unnecessary CONVERT_EXPRs.  Avoiding this with the same patch as attached to PR c/12245 brings garbage down
by 54%, from:

source location                                     Garbage            Freed             Leak         Overhead            Times
cp/lex.c:511 (build_lang_decl)                        94176: 0.0%     116432: 0.0%     826264: 0.1%      98952: 0.0%       4247
toplev.c:1538 (realloc_for_line_map)                      0: 0.0%    1310720: 0.1%    1316864: 0.1%     555008: 0.1%          7
ggc-common.c:187 (ggc_calloc)                     134478488:12.0%     188112: 0.0%  134356240:14.6%      18504: 0.0%       2913
cp/decl.c:4683 (reshape_init_array_1)                     0: 0.0%  374786120:20.5%  373090856:40.6%  211006352:43.0%         22
cp/parser.c:14709 (cp_parser_initializer_list)    373090856:33.3%  374786120:20.5%         88: 0.0%  211006360:43.0%         23
tree.c:1004 (build_int_cst_wide)                       8640: 0.0%          0: 0.0%  402626592:43.8%          0: 0.0%    8388234
convert.c:752 (convert_to_integer)                603950328:54.0%          0: 0.0%          0: 0.0%   67105592:13.7%    8388199
Total                                            1118959027       1826757514        919671455        491189724         16977745

to:

source location                                     Garbage            Freed             Leak         Overhead            Times
cp/lex.c:511 (build_lang_decl)                        94176: 0.0%     116432: 0.0%     826264: 0.1%      98952: 0.0%       4247
toplev.c:1538 (realloc_for_line_map)                      0: 0.0%    1310720: 0.1%    1316864: 0.1%     555008: 0.1%          7
ggc-common.c:187 (ggc_calloc)                     134478488:26.1%     188112: 0.0%  134356240:14.6%      18504: 0.0%       2913
cp/decl.c:4683 (reshape_init_array_1)                     0: 0.0%  374786120:20.5%  373090856:40.6%  211006352:49.8%         22
cp/parser.c:14709 (cp_parser_initializer_list)    373090856:72.4%  374786120:20.5%         88: 0.0%  211006360:49.8%         23
tree.c:1004 (build_int_cst_wide)                       8640: 0.0%          0: 0.0%  402626592:43.8%          0: 0.0%    8388234
Total                                             515008771       1826757514        919671455        424084140          8589547

so saving about 0.5GB of RAM and speeding up correspondingly too.  We can still improve but this seems low hanging fruit.

Honza
Comment 50 Steven Bosscher 2009-02-22 13:39:17 UTC
Honza, you realize that the numbers are completely unreadable in bugzilla, right?
Comment 51 Jan Hubicka 2009-02-22 14:47:07 UTC
Subject: Re:  [4.2/4.3/4.4 Regression] out of memory while parsing array with many initializers

> Honza, you realize that the numbers are completely unreadable in bugzilla,
> right?
They need some care to read; the columns are still intact, just
interleaved... I wonder why bugzilla insists on the line breaks?

Honza
Comment 52 Jan Hubicka 2009-02-23 16:06:29 UTC
With the patches proposed for c/12245 we now produce 377MB of garbage (down from over 1GB originally) and 920MB of IL.
Pretty much all the garbage comes from the temporary list constructed here:

      /* Add it to the vector.  */
      CONSTRUCTOR_APPEND_ELT(v, identifier, initializer);

in cp_parser_initializer_list.
Perhaps explicitly freeing it would be a good idea? 

Honza
Comment 53 Mark Mitchell 2009-02-23 16:11:49 UTC
Subject: Re:  [4.2/4.3/4.4 Regression] out of memory while
 parsing array with many initializers

hubicka at gcc dot gnu dot org wrote:

> Perhaps explicitly freeing it would be a good idea? 

I certainly have no objection to explicitly freeing storage if we know
we don't need it anymore.

Comment 54 Jan Hubicka 2009-02-23 16:51:38 UTC
Subject: Re:  [4.2/4.3/4.4 Regression] out of memory while parsing array with many initializers

> > Perhaps explicitly freeing it would be a good idea? 
> 
> I certainly have no objection to explicitly freeing storage if we know
> we don't need it anymore.

The problem is that I don't know the C++ parser well enough to be sure
where we can safely free this vector.

Honza
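[The idea discussed in comments 52-54 — releasing a temporary initializer vector as soon as its contents have been copied into the final CONSTRUCTOR, instead of leaving it for the garbage collector — can be illustrated outside GCC with a plain malloc-backed vector. This is a hypothetical sketch; `elt_vec`, `elt_vec_push`, and `reshape_and_free` are illustrative names, not GCC API.]

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable vector of ints, standing in for the
   CONSTRUCTOR element vector built by cp_parser_initializer_list.  */
typedef struct {
    int *data;
    size_t len, cap;
} elt_vec;

static void elt_vec_push(elt_vec *v, int x) {
    if (v->len == v->cap) {
        v->cap = v->cap ? v->cap * 2 : 16;
        v->data = realloc(v->data, v->cap * sizeof *v->data);
    }
    v->data[v->len++] = x;
}

/* Once the initializer has been reshaped into its final form, the
   temporary vector can be released immediately instead of lingering
   as garbage until the next collection.  */
static int *reshape_and_free(elt_vec *v, size_t *out_len) {
    int *final = malloc(v->len * sizeof *final);
    memcpy(final, v->data, v->len * sizeof *final);
    *out_len = v->len;
    free(v->data);          /* explicit free: no garbage left behind */
    v->data = NULL;
    v->len = v->cap = 0;
    return final;
}
```

With a GC-managed vector the `free` would become the corresponding explicit-release call; the hard part, as noted above, is knowing the point in the parser after which the vector can never be referenced again.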
Comment 55 Joseph S. Myers 2009-03-31 16:14:01 UTC
Closing 4.2 branch.
Comment 56 Richard Biener 2009-08-04 12:25:57 UTC
GCC 4.3.4 is being released, adjusting target milestone.
Comment 57 Andrew Pinski 2010-05-10 22:06:36 UTC
*** Bug 44066 has been marked as a duplicate of this bug. ***
Comment 58 Richard Biener 2010-05-22 18:09:58 UTC
GCC 4.3.5 is being released, adjusting target milestone.
Comment 59 Richard Biener 2011-06-27 12:13:14 UTC
4.3 branch is being closed, moving to 4.4.7 target.
Comment 60 Jason Merrill 2012-01-13 20:27:06 UTC
Giovanni hasn't touched this bug since 2004, so I'm unassigning him.  It seems to me that the best way to avoid the garbage from cp_parser_initializer_list would be to rewrite reshape_init to avoid copying initializer lists that need no reshaping.  But that isn't a project for stage 4.
Comment 61 Jason Merrill 2012-01-13 21:39:10 UTC
Created attachment 26317 [details]
Testcase with just the character array

I couldn't compile the original testcase with 2.95, so I've stripped out the important part, which is just a massive char array.

For reference, compiling this on my Core i7 laptop (time and VM usage):

2.95   6s  717M
3.0   16s 1764M
3.2   17s 1813M
3.3   20s 2028M
3.4   15s 1803M
4.0   18s 1900M
4.1    9s 1635M
4.2   10s 1636M
4.3    9s 1158M
4.4   xxx 1161M (no time; non-optimized build)
4.5   11s 1097M
4.6   xxx 1258M (ditto)
4.7   14s 1704M (r183161, optimized, --enable-checking=release)

So there was certainly a big jump from 2.95 to 3.0.  4.3 improved memory use quite a bit, but now it's gone up again.
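[Since the stripped testcase is just a massive char array, an input of the same shape (not the attached file's actual contents) can be generated with a few lines of C. The array name, element values, and size below are all hypothetical; the real attachment has far more initializers.]

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Emit a translation unit containing one char array with n explicit
   initializers, comparable in shape to the attached testcase.  */
static void emit_testcase(FILE *out, unsigned long n) {
    fprintf(out, "const char big[] = {");
    for (unsigned long i = 0; i < n; ++i)
        fprintf(out, "%s%d", i ? "," : "", (int)(i % 100));
    fprintf(out, "};\n");
}
```

Running this with n in the millions reproduces the per-initializer token, tree-node, and CONSTRUCTOR-element costs that the timings above measure.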
Comment 62 Jason Merrill 2012-01-13 22:08:38 UTC
(In reply to comment #61)
> 4.7   14s 1704M (r183161, optimized, --enable-checking=release)

Making the change to convert_to_integer mentioned in 12245 reduces VM size to 1509M; there's another 190M of garbage from cp_parser_initializer_list, but that still doesn't account for all the increase in VM size.
Comment 63 Jan Hubicka 2012-01-14 14:18:27 UTC
> Making the change to convert_to_integer mentioned in 12245 reduces VM size to
> 1509M; there's another 190M of garbage from cp_parser_initializer_list, but
> that still doesn't account for all the increase in VM size.
A --enable-gather-detailed-mem-stats dump should pinpoint this quite easily...

Honza
Comment 64 Jason Merrill 2012-01-14 17:06:46 UTC
Yep, it turned out that there was a lot of allocation overhead from vector allocation in the token buffer.  After fixing that as well with the patch at

http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00732.html

this testcase is down to 967MB VM size.  The only obvious area of improvement left is the 67MB of garbage from unnecessary reshape_init copying, which seems like more work than it's worth for this testcase, and definitely not something for 4.7.
Comment 65 Jason Merrill 2012-01-16 16:40:36 UTC
Author: jason
Date: Mon Jan 16 16:40:26 2012
New Revision: 183213

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183213
Log:
	PR c++/14179
	* vec.c (vec_gc_o_reserve_1): Use ggc_round_alloc_size.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/vec.c
Comment 66 Jason Merrill 2012-01-16 16:40:49 UTC
Author: jason
Date: Mon Jan 16 16:40:38 2012
New Revision: 183214

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183214
Log:
	PR c/12245
	PR c++/14179
	* convert.c (convert_to_integer): Use fold_convert for
	converting an INTEGER_CST to integer type.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/convert.c
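[The r183214 change above folds a constant-to-integer conversion immediately instead of allocating a CONVERT_EXPR wrapper that is later folded away as garbage. A miniature model of that before/after difference; the node types and names here are illustrative, not GCC's tree API:]

```c
#include <assert.h>
#include <stdlib.h>

enum kind { INT_CST, CONVERT };

typedef struct node {
    enum kind k;
    long value;            /* for INT_CST */
    struct node *operand;  /* for CONVERT */
} node;

static int nodes_allocated;

static node *new_node(enum kind k) {
    node *n = calloc(1, sizeof *n);
    n->k = k;
    ++nodes_allocated;
    return n;
}

/* Before the fix: always wrap in a CONVERT node.  After: fold
   constants on the spot, so a million-element initializer allocates
   one node per element instead of two (with one thrown away).  */
static node *convert_to_char(node *expr) {
    if (expr->k == INT_CST) {
        node *n = new_node(INT_CST);
        n->value = (signed char)expr->value;   /* truncate like a cast */
        return n;
    }
    node *n = new_node(CONVERT);
    n->operand = expr;
    return n;
}
```

The ~8 million `convert_to_integer` entries in the stats above each produced such a wrapper before the fix, which matches the 54% garbage reduction Honza measured.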
Comment 67 Jakub Jelinek 2012-03-13 12:46:06 UTC
4.4 branch is being closed, moving to 4.5.4 target.
Comment 68 Jakub Jelinek 2013-04-12 15:16:18 UTC
GCC 4.6.4 has been released and the branch has been closed.
Comment 69 Richard Biener 2014-06-12 13:43:15 UTC
The 4.7 branch is being closed, moving target milestone to 4.8.4.
Comment 70 Jakub Jelinek 2014-12-19 13:33:18 UTC
GCC 4.8.4 has been released.
Comment 71 Richard Biener 2015-06-23 08:15:47 UTC
The gcc-4_8-branch is being closed, re-targeting regressions to 4.9.3.
Comment 72 Jakub Jelinek 2015-06-26 19:57:27 UTC
GCC 4.9.3 has been released.
Comment 73 Richard Biener 2016-08-03 08:37:27 UTC
GCC 4.9 branch is being closed
Comment 74 Richard Biener 2017-02-02 08:56:22 UTC
Author: rguenth
Date: Thu Feb  2 08:55:44 2017
New Revision: 245118

URL: https://gcc.gnu.org/viewcvs?rev=245118&root=gcc&view=rev
Log:
2017-02-02  Richard Biener  <rguenther@suse.de>

	PR cp/14179
	* cp-gimplify.c (cp_fold): When folding a CONSTRUCTOR copy
	it lazily on the first changed element only and copy it
	fully upfront, only storing changed elements.

Modified:
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/cp-gimplify.c
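[The r245118 change above is a lazy copy-on-write: when folding a large CONSTRUCTOR, don't copy the element vector up front, only copy it the first time folding actually changes an element. A standalone sketch of that pattern over a plain int array; `fold_elt` stands in for cp_fold and everything here is illustrative:]

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int fold_elt(int x) { return x < 0 ? 0 : x; }  /* hypothetical fold */

/* Fold every element; return the original array untouched if nothing
   changed, otherwise a fresh copy with the folded values.  *copied
   reports whether a copy was made.  */
static int *fold_ctor(int *elts, size_t n, int *copied) {
    int *out = elts;
    *copied = 0;
    for (size_t i = 0; i < n; ++i) {
        int folded = fold_elt(elts[i]);
        if (folded != elts[i] && out == elts) {
            /* First change: copy everything up front, then keep
               storing folded elements into the copy.  */
            out = malloc(n * sizeof *out);
            memcpy(out, elts, n * sizeof *out);
            *copied = 1;
        }
        if (out != elts)
            out[i] = folded;
    }
    return out;
}
```

For the testcase here, where almost every element is already a folded INTEGER_CST, the common path allocates nothing at all.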
Comment 75 Richard Biener 2017-02-02 09:20:09 UTC
For C++, another inefficiency is that we call cxx_eval_outermost_constant_expr 1630776 times (for the testcase from PR12245),
and it always allocates a hash map.
All but one of the calls are with t == INTEGER_CST.  It is called via maybe_constant_init, which has the same issue as maybe_constant_value
(see the thread starting at https://gcc.gnu.org/ml/gcc-patches/2017-02/msg00046.html).
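[The pattern behind this inefficiency — allocating expensive per-call state before checking whether the input is already a trivial constant — is fixed by a cheap early-out. A toy model of that shape; `eval_ctx`, the predicate, and the "evaluation" below are illustrative placeholders, not GCC's constexpr machinery:]

```c
#include <assert.h>
#include <stdlib.h>

static int ctx_allocations;                /* counts expensive setups */

typedef struct { int dummy; } eval_ctx;    /* stands in for the hash map */

static int is_integer_cst(long expr) { return expr >= 0; }  /* toy predicate */

static long eval_constant(long expr) {
    /* Early-out: a constant evaluates to itself, no context needed.
       Without this check, every one of the 1630776 calls above pays
       for a hash-map allocation it never uses.  */
    if (is_integer_cst(expr))
        return expr;
    eval_ctx *ctx = malloc(sizeof *ctx);   /* expensive per-call state */
    ++ctx_allocations;
    long result = -expr;                   /* placeholder "evaluation" */
    free(ctx);
    return result;
}
```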
Comment 76 Richard Biener 2017-02-07 12:44:36 UTC
Top VM usage update:

4.7.2   14s    1660M  (-O0)
7.0.1   20s    1100M  (-O0 -fno-checking but checking enabled)
Comment 77 Richard Biener 2017-02-07 12:49:01 UTC
So the "low hanging fruit" remaining is reshape_init_array copying the whole array even when not necessary.

INTEGER_CSTs still account for most of the memory use (200MB) apart from C++
preprocessor tokens (530MB) and the actual array of tree pointers for the
constructors (2x 130MB at peak).
Comment 78 Jakub Jelinek 2018-10-26 10:20:07 UTC
GCC 6 branch is being closed
Comment 79 Richard Biener 2019-11-14 07:55:52 UTC
The GCC 7 branch is being closed, re-targeting to GCC 8.4.