Bug 12245 - [12/13/14/15 regression] Uses lots of memory when compiling large initialized arrays
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: c
Version: 3.3.1
Importance: P4 normal
Target Milestone: 12.5
Assignee: Not yet assigned to anyone
URL:
Keywords: memory-hog
Duplicates: 39142
Depends on:
Blocks: 14179 47344 79266
Reported: 2003-09-11 06:56 UTC by Rajstennaj Barrabas
Modified: 2024-12-18 18:41 UTC (History)
CC List: 17 users

See Also:
Host:
Target: i686-pc-linux-gnu
Build:
Known to work: 2.95.3
Known to fail: 3.3.3, 3.4.3, 4.0.0, 4.0.4, 4.3.0
Last reconfirmed: 2008-01-05 14:07:50


Attachments
Simple test case (215 bytes, application/x-gzip)
2004-12-02 15:31 UTC, Ian Lance Taylor
Description Rajstennaj Barrabas 2003-09-11 06:56:16 UTC
This is a simple program with large initialized static arrays (20 arrays of
200,000 elements each, plus a single array with 500,000 elements).

Compiling crashes the compiler after about an hour (900 MHz system).
Reproduce by typing "Make" (Makefile included).
Actual results are listed below.
Compiling with -Wall shows no warnings.

I really hate to do this to you, but I believe that the problem is related to
initializing very large arrays. Consequently, I cannot generate a small program
which illustrates the problem (and it takes an hour for each experiment, which
doesn't help either). My minimal solution is a C source file of 152 lines
(inconsequential), and two include files of 200,000 lines and 500,000 lines
each. The include files contain initialization data for 21 arrays, and are
nothing more than a list of numbers and commas.
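
Since the original archive is only linked, not attached, a stand-in input of the same shape can be generated. The following is a minimal sketch, assuming filler values are good enough to reproduce the memory growth; the function name write_big_initializer and the value formula are hypothetical and not taken from the original test case.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes "static const int NAME[N] = { ... };" to OUT and returns the
   number of elements written.  The values are arbitrary filler (an
   assumption); only the element count matters here.  N must be > 0.  */
size_t write_big_initializer(FILE *out, const char *name, size_t n)
{
    size_t i;

    fprintf(out, "static const int %s[%lu] = {\n", name, (unsigned long)n);
    for (i = 0; i < n; i++) {
        /* One value and one comma per line, like the include files. */
        fprintf(out, "%lu,\n", (unsigned long)((i * 7919u) % 65536u));
    }
    fprintf(out, "};\n");
    return i;
}
```

Calling it with 200,000 for twenty arrays and 500,000 for one more would give a file of roughly the size described above.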

The total source and intermediate files are so BIG that I have not included them
here. I know I'm not supposed to send archives and I'm *really* not supposed to
ask you to download the test cases from the net, but in this instance I think
it's appropriate.

You can get the complete test set (1 source, 2 includes, Makefile, and saved
intermediate file) at www.OkianWarrior.com/gccBug.tar.gz

/home/kibaro/tmp: make
gcc -v -save-temps -o CSolv CSolv.c
Reading specs from /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/specs
Configured with: ./configure
Thread model: posix
gcc version 3.3.1
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/cc1 -E -quiet -v -D__GNUC__=3
-D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=1 CSolv.c CSolv.i
ignoring nonexistent directory "NONE/include"
ignoring nonexistent directory "/usr/local/i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/include
 /usr/include
End of search list.
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/cc1 -fpreprocessed CSolv.i
-quiet -dumpbase CSolv.c -auxbase CSolv -version -o CSolv.s
GNU C version 3.3.1 (i686-pc-linux-gnu)
        compiled by GNU C version 3.2 (Mandrake Linux 9.0 3.2-1mdk).
GGC heuristics: --param ggc-min-expand=47 --param ggc-min-heapsize=32119
gcc: Internal error: Killed (program cc1)
Please submit a full bug report.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
make: *** [CSolv] Error 1
Comment 1 Andrew Pinski 2003-09-11 15:53:49 UTC
Please attach the preprocessed files to this bug.
Comment 2 barrabas@barrabas.mv.com 2003-09-11 19:26:06 UTC
Subject: Re:  Crashes when compiling large initialized arrays (gccBug: message 3 of 9)

> Please attach the preprocessed files to this bug.

	The file is too big to send to Bugzilla. As mentioned in the bug posting, the 
file is available for http download here:

www.OkianWarrior.com/gccBug.tar.gz

	I've read the guidelines for posting bugs. I know I'm not supposed to post 
archives, or links to archives, or multiple file examples. I know all this. I 
believe that this is an exception for reasons stated in the bug description, 
and I ask that you bear with me.

																R. Barrabas

==================================================

My younger brother asked me what happens after we die. I told
  him we get buried under a bunch of dirt and worms eat our bodies.
  I guess I should have told him the truth-that most of us go to hell
  and burn eternally - but I didn't want to upset him.


Comment 3 barrabas@barrabas.mv.com 2003-09-11 19:28:39 UTC
Subject: Re:  Crashes when compiling large initialized arrays (gccBug: message 3 of 9)

> ------- Additional Comments From pinskia at gcc dot gnu dot org  2003-09-11 15:53 -------
> Please attach the preprocessed files to this bug.

	I would guess that the bug is caused by:

	1) The compiler allocates lots of storage for intermediate results.
	2) Virtual memory gets used up, and the next allocation fails.
	3) The allocation is not checked, leading to eventual failure.

																R. Barrabas

==================================================

Dictatorship (n): a form of government under which everything
                  which is not prohibited is compulsory.


Comment 4 Andrew Pinski 2003-09-17 04:07:34 UTC
How much memory do you have?
Comment 5 barrabas@barrabas.mv.com 2003-09-17 04:33:10 UTC
Subject: Re:  Crashes when compiling large initialized arrays (gccBug: message 7 of 9)

> How much memory do you have?

256MB of ram + 256MB of swap.


==================================================

Those who live by the sword get shot by those who don't.


Comment 6 Andrew Pinski 2003-09-17 05:21:11 UTC
This must be memory intensive, as I cannot reproduce it on a system with 1GB of memory.
Comment 7 Andrew Pinski 2003-10-01 03:33:39 UTC
It is not a GCC problem that the OS returns a non-zero pointer when memory is full.
Comment 8 Andrew Pinski 2003-10-01 03:35:03 UTC
Though on the other hand GCC should not be such a hog of memory.
Comment 9 Andrew Pinski 2003-10-05 06:32:37 UTC
Note that one day the web server will be down and "we" (meaning GCC developers) 
will not be able to access the testcase, so we will ask you to attach it anyway. Can you 
just attach the preprocessed source?
Comment 10 Andrew Pinski 2003-11-22 20:39:52 UTC
The testcase takes about 445M on i686-pc-linux-gnu and more than 500M on
powerpc-apple-darwin7.0.0.  Will attach testcase.
Comment 11 Andrew Pinski 2004-01-19 12:57:35 UTC
Still is a problem on the mainline, targeting 3.5.0 for now.
Comment 12 Falk Hueffner 2004-09-12 15:36:00 UTC
The URL of the test case doesn't seem to work anymore. Does anybody still have
the test case?
Comment 13 Ian Lance Taylor 2004-12-02 15:31:30 UTC
Created attachment 7660 [details]
Simple test case
Comment 14 Andrew Pinski 2004-12-02 15:42:48 UTC
With the attached testcase we use about 300M with the C front-end but a huge amount more with the 
C++ front-end. Why?
Comment 15 Ian Lance Taylor 2004-12-02 15:53:42 UTC
I attached a simple test case.  This is based on real existing code, although I
changed all the values to hide potentially proprietary information.  When I
compile this file without optimization, it uses some 200M, and garbage collects
while compiling this file.  The compilation takes 1 minute, 45 seconds.  (This
is much better than gcc 3.4.3, actually, which used all available memory,
garbage collected twice, and wound up swapping for 10 minutes or so before
completing).

When compiling with 2.95.3, the compiler uses 20M and completes in 37 seconds.

The compiler used to work fine when processing very large initializers.  As it
read the initializer, gcc would output the initializer to the assembler file
directly.  This capability was removed here:
    http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00933.html
The followups to that message mention this type of problem.
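
The streaming behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not the code that was removed from gcc: the function name stream_initializer and the ".long" output format are assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads comma-separated integers from SRC and writes one ".long"
   directive per value to OUT as soon as it is parsed.  Nothing is kept
   in memory between elements.  Returns the number of values emitted.  */
size_t stream_initializer(const char *src, FILE *out)
{
    size_t count = 0;
    const char *p = src;

    for (;;) {
        char *end;
        long value = strtol(p, &end, 10);

        if (end == p)           /* no more digits: stop */
            break;
        fprintf(out, "\t.long\t%ld\n", value);
        count++;
        p = end;
        while (*p == ',' || *p == ' ' || *p == '\n')
            p++;                /* skip the separator */
    }
    return count;
}
```

Memory use stays constant no matter how long the list is, which matches the 20M figure for 2.95.3 in spirit, but an element that has been written out can no longer be changed.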
Comment 16 Andrew Pinski 2004-12-02 16:10:26 UTC
PR 14179 is for the C++ problem.
Comment 17 Richard Henderson 2005-01-06 01:22:09 UTC
No chance this is getting done for 4.0.
Comment 18 Andrew Pinski 2005-07-22 18:36:06 UTC
Does anyone have the current numbers for this bug?
I know for C, the memory usage has gone down but I don't know by how much.
Comment 19 Andrew Pinski 2005-07-25 01:27:22 UTC
c-typeck.c:5987 (output_init_element)                     0: 0.0%   23955160:100.0%   22770552:20.9%   13171408:99.1%         19
convert.c:671 (convert_to_integer)                 52184768:37.8%          0: 0.0%          0: 0.0%          0: 0.0%    1630774
ggc-common.c:193 (ggc_calloc)                      33547596:24.3%          0: 0.0%   33577352:30.7%        544: 0.0%         45
tree.c:828 (build_int_cst_wide)                    52176864:37.8%          0: 0.0%   52177536:47.8%          0: 0.0%    3261075
source location                                     Garbage            Freed             Leak         Overhead            Times
Comment 20 Andrew Pinski 2005-07-25 01:30:38 UTC
There must be a better way to add on to celt in output_init_element.
Comment 21 Richard Biener 2005-09-12 08:55:23 UTC
Max memory usage on (checking-disabled) mainline is now 253149kB (on a machine
with 1GB of RAM) for C and 403669kB for C++ (!)
Comment 22 Richard Biener 2005-09-12 10:03:01 UTC
One problem is that we use integer tree nodes for counting from zero to N, which
is just stupid and wastes RAM (because we do not collect during building the
initializer).  Of course we also store that "index" in the initializer element
list.

This whole mess asks for a (less general) rewrite.  Minimally invasive surgery
is impossible.
Comment 23 Giovanni Bajo 2005-09-12 10:08:58 UTC
The problem is that the gimplifier always wants the index field of the 
constructor element to be filled. If you fix that in the obvious way (so 
that "no index" means "previous index + 1"), it should be quite easy to fix, 
for C++. In C, I have no clue how this interacts with designated initializers 
though.
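
To make the "no index means previous index + 1" rule concrete, here is a small sketch. The struct init_elt and the function effective_index are hypothetical and are not GCC's constructor representation; they only model the rule.

```c
#include <stddef.h>

/* Hypothetical element record: HAS_INDEX is zero for the common case
   of an element that simply follows its predecessor.  */
struct init_elt {
    int has_index;
    size_t index;   /* meaningful only when has_index is nonzero */
    long value;
};

/* Returns the effective array index of element POS in ELTS by applying
   the rule "no index means previous index + 1".  */
size_t effective_index(const struct init_elt *elts, size_t pos)
{
    size_t i, idx = 0;

    for (i = 0; i <= pos; i++) {
        if (elts[i].has_index)
            idx = elts[i].index;    /* explicit designator */
        else if (i > 0)
            idx = idx + 1;          /* implicit: follow the predecessor */
    }
    return idx;
}
```

With this scheme a plain list of N values needs no index objects at all; only designated elements carry one.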
Comment 24 Richard Biener 2005-10-03 17:54:32 UTC
It somehow works (partially), but there's a lot of fallout.  Ugh.  I don't like it very much.  Preliminary patch:

http://gcc.gnu.org/ml/gcc-patches/2005-10/msg00091.html
Comment 25 Richard Henderson 2005-10-11 19:24:04 UTC
I don't think we can reasonably attack this for 4.1.  This is something
that should be done during a stage 1.
Comment 26 Ian Lance Taylor 2005-10-11 19:30:08 UTC
Regression bugs should have target milestones.
Comment 27 niemayer 2006-12-13 11:37:19 UTC
I would like to mention that this problem seems to have worsened a lot for the current snapshots of gcc-4.2 (currently testing with 4.2.0 20061205 (prerelease)) when compiling with at least -O1 - maybe due to the static constant elimination?

I tried to compile a Unicode normalization test C++ source that took gcc about 300MB of RAM before to compile with -O1 - now with gcc 4.2 I cannot compile this source anymore on a machine with 1 GB of physical + 1 GB of virtual RAM before the kernel OOM killer is killing cc1plus.

If somebody would like the source of my test-case, I can supply it.
Comment 28 Mark Mitchell 2007-05-14 22:25:25 UTC
Will not be fixed in 4.2.0; retargeting at 4.2.1.
Comment 29 niemayer 2007-05-15 16:54:26 UTC
That's sad - while memory gets cheaper, it has still not become cheap enough to cope with that huge increase in memory usage imposed by gcc 4.2. Seems I have to stick with 4.1 until that problem is fixed...
Comment 30 Pawel Sikora 2007-05-15 17:04:43 UTC
Looks related to PR30052.
Comment 31 Mark Mitchell 2007-10-09 19:20:52 UTC
Change target milestone to 4.2.3, as 4.2.2 has been released.
Comment 32 Richard Biener 2008-01-05 14:07:49 UTC
The difference between using gcc and g++ for the testcase seems to be gone on
the trunk, where gcc peaks at 480MB and g++ at 530MB.  For 4.1 g++ used 780MB.
Comment 33 Joseph S. Myers 2008-01-17 15:42:50 UTC
This memory use regression has been present since at least 3.3; at least part of it may be an unavoidable consequence of supporting C99 overriding in designated initializers; a proper fix would likely involve major changes to the datastructures for initializers (as RTH notes in comment#25, it's not suitable for a stage 3 fix); the priority seems to have been P2 from the start rather than having been set by an RM.  In view of these (but especially the likely unsuitability of a fix for stage 3), downgrading to P4 (the same as the corresponding C++ bug, bug 14179).
Comment 34 niemayer 2008-01-17 17:02:05 UTC
Can you suggest any kind of work-around? Any alternative to represent constant arrays in C/C++?

The problem with leaving this bug open indefinitely is that there are existing programs (as the Unicode-test-case I mentioned above) which will simply not compile on any reasonably equipped machine anymore.

I wouldn't mind changing the source code to represent the constant arrays in a different way, but I have not found a method yet (other than using platform-dependent methods like generating assembler source).
Comment 35 Ian Lance Taylor 2008-01-18 06:37:37 UTC
The bug should certainly be fixed.  But it's unfortunately a lot of work for a small payoff--most people are not in your situation.  I think Joseph is correct in lowering the priority.  It's pointless for us to describe this bug as release-blocking, when it clearly is not.

The core problem is C99 designated initializers.  Those require us to read the entire array into memory before we emit any of it.  Otherwise we could generate the wrong code, and there is no way to recover.

So the only plausible fix is to optimize the memory representation used for large array initializers.
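
A tiny example shows why the whole initializer has to be read first. In C99 a later designated initializer overrides an earlier one for the same element, so a compiler that had already emitted element 1 would have emitted the wrong value. The names table and table_at are only for illustration; GCC may warn about the override under -Woverride-init.

```c
/* Element 1 is first set to 2 by position, then overridden to 20 by
   the designator at the end of the list.  */
static const int table[4] = { 1, 2, 3, 4, [1] = 20 };

/* Returns element I of the table above.  */
int table_at(int i)
{
    return table[i];
}
```

Here table_at(1) yields 20, not 2, even though the 2 appears first in the source.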
Comment 36 Joseph S. Myers 2008-02-01 16:52:34 UTC
4.2.3 is being released now, changing milestones of open bugs to 4.2.4.
Comment 37 Joseph S. Myers 2008-05-19 20:22:29 UTC
4.2.4 is being released, changing milestones to 4.2.5.
Comment 38 Joseph S. Myers 2008-07-04 22:44:00 UTC
Closing 4.1 branch.
Comment 39 Richard Biener 2009-02-10 10:12:22 UTC
*** Bug 39142 has been marked as a duplicate of this bug. ***
Comment 40 Jan Hubicka 2009-02-21 12:40:31 UTC
I happen to have a compiler with statistics around:
We still need about 400MB, mostly integer constants:
c-decl.c:473 (bind)                                  125040: 0.0%          0: 0.0%          0: 0.0%          0: 0.0%       2605
tree.c:5905 (build_function_type)                     13000: 0.0%          0: 0.0%     113400: 0.1%       5056: 0.0%        632
stringpool.c:73 (alloc_node)                           6032: 0.0%          0: 0.0%     174096: 0.1%      13856: 0.0%       1732
langhooks.c:543 (add_builtin_function_common)             0: 0.0%          0: 0.0%     442224: 0.2%      59760: 0.2%       1494
c-typeck.c:6472 (output_init_element)                     0: 0.0%   47910400:100.0%   45541112:23.7%   26342936:66.6%         19
convert.c:752 (convert_to_integer)                117415728:44.6%          0: 0.0%          0: 0.0%   13046192:33.0%    1630774
ggc-common.c:187 (ggc_calloc)                      67094608:25.5%          0: 0.0%   67162736:34.9%       1088: 0.0%         58
tree.c:1004 (build_int_cst_wide)                   78264768:29.8%          0: 0.0%   78266496:40.7%          0: 0.0%    3261068
Total                                             262986355         47910416        192171521         39527780          4905807
source location                                     Garbage            Freed             Leak         Overhead            Times


It seems that we produce an awful amount of garbage during the initializer construction.  Perhaps by forcing ggc_collect there we can get down to the 200MB that we need to represent it at the end?

Honza
Comment 41 rguenther@suse.de 2009-02-21 12:50:41 UTC
Subject: Re:  [4.2/4.3/4.4 regression] Uses lots of memory when
 compiling large initialized arrays

On Sat, 21 Feb 2009, hubicka at gcc dot gnu dot org wrote:

> ------- Comment #40 from hubicka at gcc dot gnu dot org  2009-02-21 12:40 -------
> I happen to have compiler with statistics around:
> We still need about 400MB, mostly integer constants:
> c-decl.c:473 (bind)                                  125040: 0.0%          0: 0.0%          0: 0.0%          0: 0.0%       2605
> tree.c:5905 (build_function_type)                     13000: 0.0%          0: 0.0%     113400: 0.1%       5056: 0.0%        632
> stringpool.c:73 (alloc_node)                           6032: 0.0%          0: 0.0%     174096: 0.1%      13856: 0.0%       1732
> langhooks.c:543 (add_builtin_function_common)             0: 0.0%          0: 0.0%     442224: 0.2%      59760: 0.2%       1494
> c-typeck.c:6472 (output_init_element)                     0: 0.0%   47910400:100.0%   45541112:23.7%   26342936:66.6%         19
> convert.c:752 (convert_to_integer)                117415728:44.6%          0: 0.0%          0: 0.0%   13046192:33.0%    1630774
> ggc-common.c:187 (ggc_calloc)                      67094608:25.5%          0: 0.0%   67162736:34.9%       1088: 0.0%         58
> tree.c:1004 (build_int_cst_wide)                   78264768:29.8%          0: 0.0%   78266496:40.7%          0: 0.0%    3261068
> Total                                             262986355         47910416        192171521         39527780          4905807
> source location                                     Garbage            Freed             Leak         Overhead            Times
> 
> 
> It seems that we produce awful amount of garbage during the initializer
> construction.  Perhaps by forcing ggc_collect there we can get down to 200MB
> that we need to reprezent it at the end?

We need the integer csts in the constructor lists.  I have a patch
somewhere (or is it even attached?) that tries to do index compression
and not use the integer csts for counting.  Didn't work out too much
though.

Richard.
Comment 42 Jan Hubicka 2009-02-22 11:21:08 UTC
The actual representation of the constructor doesn't seem to be a major problem here.

We seem to build _a lot_ (117MB) of CONVERT exprs just to call fold on them and convert the integer to the proper type, so counting in INTEGER_CSTs should be just slightly less than half of the memory needed.  This seems quite silly.
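
The effect can be illustrated with a toy model. Everything in it (struct node, make_int, convert_to_int, allocated) is hypothetical and far simpler than GCC's trees; it only shows that folding a constant directly allocates one node where wrapping and then folding would allocate two. Value truncation to the new width is deliberately ignored.

```c
#include <stdlib.h>

enum kind { INT_CST, CONVERT };

struct node {
    enum kind kind;
    int type_bits;          /* width of the node's integer type */
    long value;             /* used by INT_CST */
    struct node *operand;   /* used by CONVERT */
};

static size_t nodes_allocated;

static struct node *new_node(enum kind kind, int type_bits)
{
    struct node *n = calloc(1, sizeof *n);
    if (n == NULL)
        abort();
    n->kind = kind;
    n->type_bits = type_bits;
    nodes_allocated++;
    return n;
}

struct node *make_int(long value, int type_bits)
{
    struct node *n = new_node(INT_CST, type_bits);
    n->value = value;
    return n;
}

/* Converts EXPR to an integer type of TYPE_BITS bits.  A constant is
   folded straight into a new constant; anything else gets a wrapper
   node that a later fold step would have to look through.  */
struct node *convert_to_int(struct node *expr, int type_bits)
{
    struct node *n;

    if (expr->kind == INT_CST)
        return make_int(expr->value, type_bits);   /* shortcut */
    n = new_node(CONVERT, type_bits);
    n->operand = expr;
    return n;
}

/* Number of nodes allocated so far, for comparing strategies.  */
size_t allocated(void)
{
    return nodes_allocated;
}
```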

The patch to use HOST_WIDE_INT or similar (rather than INTEGER_CSTs) for counting should save another 70MB of garbage (and speed up compilation), so perhaps you could dig it out? :))

Following patch:
Index: convert.c
===================================================================
--- convert.c   (revision 144352)
+++ convert.c   (working copy)
@@ -749,6 +749,11 @@ convert_to_integer (tree type, tree expr
          break;
        }
 
+      /* When parsing long initializers, we might end up with a lot of casts.
+         Shortcut this.  */
+      if (TREE_CODE (expr) == INTEGER_CST)
+       return fold_unary (CONVERT_EXPR, type, expr);
+
       return build1 (CONVERT_EXPR, type, expr);
 
     case REAL_TYPE:

Cuts garbage production in half:
c-typeck.c:6472 (output_init_element)                     0: 0.0%   47910400:100.0%   45541112:23.7%   26342936:99.5%         19
ggc-common.c:187 (ggc_calloc)                      67094608:46.1%          0: 0.0%   67162736:34.9%       1088: 0.0%         58
tree.c:1004 (build_int_cst_wide)                   78264768:53.8%          0: 0.0%   78266496:40.7%          0: 0.0%    3261068
Total                                             145570627         47910416        192171521         26481588          3275033
source location                                     Garbage            Freed             Leak         Overhead            Times

I will test the patch, but I am not too hopeful it will just work. ;)

Honza
Comment 43 rguenther@suse.de 2009-02-22 19:03:54 UTC
Subject: Re:  [4.2/4.3/4.4 regression] Uses lots of memory when
 compiling large initialized arrays

On Sun, 22 Feb 2009, hubicka at gcc dot gnu dot org wrote:

> Actual representation of constructor don't seem to be major problem here.
> 
> We seem to build _a lot_ (117MB) of CONVERT exprs just to call fold on it and
> convert integer to proper type, so counting in INTEGER_CSTs should be just
> slightly less than half of memory needed.  This seems quite silly.
> 
> The patch to not use HOST_WIDE_INT or similar for counting should save another
> 70MB of garbage (and speed up compilation), so perhaps you could dig it out?
> :))
> 
> Following patch:
> Index: convert.c
> ===================================================================
> --- convert.c   (revision 144352)
> +++ convert.c   (working copy)
> @@ -749,6 +749,11 @@ convert_to_integer (tree type, tree expr
>           break;
>         }
> 
> +      /* When parsing long initializers, we might end up with a lot of casts.
> +         Shortcut this.  */
> +      if (TREE_CODE (expr) == INTEGER_CST)
> +       return fold_unary (CONVERT_EXPR, type, expr);

fold_convert ().  But maybe not valid to do here for C std reasons, who 
knows.

> +
>        return build1 (CONVERT_EXPR, type, expr);

And probably just generally using fold_convert () would be ok as well.
Maybe they are there to make sure to build rvalues.

>      case REAL_TYPE:
> 
> Cuts gabrage production in half:
> c-typeck.c:6472 (output_init_element)                     0: 0.0%   47910400:100.0%   45541112:23.7%   26342936:99.5%         19
> ggc-common.c:187 (ggc_calloc)                      67094608:46.1%          0: 0.0%   67162736:34.9%       1088: 0.0%         58
> tree.c:1004 (build_int_cst_wide)                   78264768:53.8%          0: 0.0%   78266496:40.7%          0: 0.0%    3261068
> Total                                             145570627         47910416        192171521         26481588          3275033
> source location                                     Garbage            Freed             Leak         Overhead            Times
> 
Comment 44 Jan Hubicka 2009-02-23 13:41:53 UTC
Hi,
I believe that using fold_convert instead of fold_build1 means that we would bypass the folding done in fold_unary, which handles stuff like two conversions in a row, while fold_convert is primarily about returning a constant when the result is constant.

Since I want to avoid the wrapping fold calls that all frontends except for C++ consistently put around convert_to_* calls, I want to do this kind of folding.

I believe the only reason to avoid folding is C++ template stuff.
Comment 45 Jan Hubicka 2009-02-23 16:46:48 UTC
Subject: Bug 12245

Author: hubicka
Date: Mon Feb 23 16:46:32 2009
New Revision: 144384

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144384
Log:
	PR c/12245
	* ggc.h (htab_create_ggc): Use ggc_free to free hashtable when resizing.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ggc.h

Comment 46 Joseph S. Myers 2009-03-31 16:13:26 UTC
Closing 4.2 branch.
Comment 47 Richard Biener 2009-08-04 12:25:56 UTC
GCC 4.3.4 is being released, adjusting target milestone.
Comment 48 Richard Biener 2010-05-22 18:09:57 UTC
GCC 4.3.5 is being released, adjusting target milestone.
Comment 49 Richard Biener 2011-06-27 12:12:05 UTC
4.3 branch is being closed, moving to 4.4.7 target.
Comment 50 Jason Merrill 2012-01-13 20:19:41 UTC
I can't think of any reason not to fold conversion of an INTEGER_CST to a different integer type.
Comment 51 Jason Merrill 2012-01-16 16:40:48 UTC
Author: jason
Date: Mon Jan 16 16:40:38 2012
New Revision: 183214

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183214
Log:
	PR c/12245
	PR c++/14179
	* convert.c (convert_to_integer): Use fold_convert for
	converting an INTEGER_CST to integer type.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/convert.c
Comment 52 Jakub Jelinek 2012-03-13 12:44:53 UTC
4.4 branch is being closed, moving to 4.5.4 target.
Comment 53 Jakub Jelinek 2013-04-12 15:15:36 UTC
GCC 4.6.4 has been released and the branch has been closed.
Comment 54 Richard Biener 2014-06-12 13:41:22 UTC
The 4.7 branch is being closed, moving target milestone to 4.8.4.
Comment 55 Jakub Jelinek 2014-12-19 13:24:27 UTC
GCC 4.8.4 has been released.
Comment 56 Richard Biener 2015-03-06 13:37:03 UTC
With GCC 5, using C requires 210MB of RAM and using C++ requires 430MB.
Comment 57 Richard Biener 2015-06-23 08:13:14 UTC
The gcc-4_8-branch is being closed, re-targeting regressions to 4.9.3.
Comment 58 Jakub Jelinek 2015-06-26 19:51:40 UTC
GCC 4.9.3 has been released.
Comment 59 Franc[e]sco 2016-04-25 17:44:07 UTC
I can confirm this still happens in gcc 4.9.3 (gentoo linux, amd64), here's an example: https://ideone.com/J699eJ
Comment 60 Richard Biener 2016-04-26 09:24:46 UTC
GCC 5 should have improved things a bit by wide-ints as small INTEGER_CSTs now
use 8 bytes less memory.  A quick check with GCC 6 shows around ~200MB memory
use for the attached testcase.

Note that your last example uses C++ initializer lists, which have their own
issue (quite expensive wrapping) and separate bug reports.
Comment 61 Richard Biener 2016-08-03 08:37:18 UTC
GCC 4.9 branch is being closed
Comment 62 Richard Biener 2017-02-01 13:14:53 UTC
Main issue is still for GCC:

Kind                   Nodes      Bytes
----------------------------------------
constants            1630852   39140573

integer_cst                      1630844


c/c-typeck.c:9020 (output_init_element)                   0:  0.0%  33554552: 50.0%  33554440: 31.2%       152:  0.2%        20

and for G++:

Kind                   Nodes      Bytes
----------------------------------------
constants            1630864   39140861

integer_cst                      1630856


cp/constexpr.c:4814 (maybe_constant_value)         67108816:100.0% 100663104        17:  0.0%       ggc

(huh!)

cp/parser.c:21811 (cp_parser_initializer_list)     33554440: 99.8%  33554552:  8.3%         0:  0.0%       152:  0.1%        20


that maybe_constant_value can be improved to

cp/constexpr.c:4817 (maybe_constant_value)             2032: 13.6%      2144         2:  0.0%       ggc

with a simple patch.
Comment 63 Richard Biener 2017-02-01 13:29:58 UTC
Something that could pay off with other testcases (nested CONSTRUCTORs) is to truncate the size of the CONSTRUCTOR_ELTS vec<> to the exact final size after parsing it,
as it will never grow again and we over-allocate during safe-pushing to it.

vec:: has no suitable function to do that (yet) though.

It won't help this particular testcase of course.
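
The idea can be sketched with a plain growable array. This is not GCC's vec<> API; struct elt_vec, vec_push and vec_shrink_to_fit are hypothetical names, and the doubling policy is an assumption chosen to show the over-allocation.

```c
#include <stdlib.h>

struct elt_vec {
    long *data;
    size_t len;     /* elements in use */
    size_t cap;     /* elements allocated */
};

/* Appends VALUE, doubling the capacity when full.  Returns 0 on
   success and -1 if memory runs out.  */
int vec_push(struct elt_vec *v, long value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        long *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* Trims the unused tail once parsing is finished and the vector will
   never grow again.  On failure the vector is left unchanged.  */
void vec_shrink_to_fit(struct elt_vec *v)
{
    if (v->len > 0 && v->len < v->cap) {
        long *p = realloc(v->data, v->len * sizeof *p);
        if (p != NULL) {
            v->data = p;
            v->cap = v->len;
        }
    }
}
```

With doubling, up to half of the allocation can be unused at the end of parsing, which adds up when many nested constructors are live at once.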
Comment 64 Jason Merrill 2017-02-03 19:44:59 UTC
Author: jason
Date: Fri Feb  3 19:44:27 2017
New Revision: 245169

URL: https://gcc.gnu.org/viewcvs?rev=245169&root=gcc&view=rev
Log:
	PR c++/12245 - excessive memory use

	* constexpr.c (maybe_constant_value): Fold maybe_constant_value_1
	back in.  Don't cache constants.
	(maybe_constant_init): Don't cache constants.

Modified:
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/constexpr.c
Comment 65 Jakub Jelinek 2018-10-26 10:18:17 UTC
GCC 6 branch is being closed
Comment 66 Frank Ch. Eigler 2019-02-27 01:11:39 UTC
Just in case it helps, we are encountering this problem with Fedora 29's gcc 8.2.1,
when compiling a 24-million unsigned-char initialized array:

% gcc -c -Q -v foo.i
[...]

Time variable                                   usr           sys          wall               GGC
 phase setup                        :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)    1243 kB (  0%)
 phase parsing                      :  25.26 ( 83%)  26.12 (100%)  51.47 ( 90%) 2592523 kB (100%)
 phase opt and generate             :   5.32 ( 17%)   0.08 (  0%)   5.42 ( 10%)       7 kB (  0%)
 phase finalize                     :   0.00 (  0%)   0.02 (  0%)   0.13 (  0%)       0 kB (  0%)
 garbage collection                 :   1.27 (  4%)   0.00 (  0%)   1.27 (  2%)       0 kB (  0%)
 callgraph construction             :   4.05 ( 13%)   0.08 (  0%)   4.15 (  7%)       5 kB (  0%)
 preprocessing                      :   5.99 ( 20%)   6.39 ( 24%)  12.20 ( 21%)  524289 kB ( 20%)
 lexical analysis                   :   7.34 ( 24%)   8.90 ( 34%)  16.18 ( 28%)       0 kB (  0%)
 parser (global)                    :  11.93 ( 39%)  10.83 ( 41%)  23.09 ( 40%) 2068233 kB ( 80%)
 TOTAL                              :  30.58         26.24         57.05        2593783 kB
Comment 67 Jakub Jelinek 2019-02-27 19:36:41 UTC
Are the values completely random or are there big chunks with the same values?
Recently in some cases we use RANGE_EXPR to shrink the CONSTRUCTOR sizes if values are repeated.
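
As a rough picture of what collapsing repeated values into ranges buys, consider the sketch below. It is not the RANGE_EXPR implementation; struct range_elt and compress_runs are hypothetical names for a plain run-length pass.

```c
#include <stddef.h>

struct range_elt {
    size_t lo, hi;   /* inclusive index range */
    long value;
};

/* Collapses runs of equal values in VALS[0..N-1] into OUT, which must
   have room for N entries.  Returns the number of ranges written.  */
size_t compress_runs(const long *vals, size_t n, struct range_elt *out)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (count > 0 && out[count - 1].value == vals[i]) {
            out[count - 1].hi = i;      /* extend the current run */
        } else {
            out[count].lo = i;          /* start a new run */
            out[count].hi = i;
            out[count].value = vals[i];
            count++;
        }
    }
    return count;
}
```

For data with long runs the element count drops sharply; for data with no repeats the function returns N and nothing is saved.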
Comment 68 Frank Ch. Eigler 2019-02-27 19:52:12 UTC
(In reply to Jakub Jelinek from comment #67)
> Are the values completely random or are there big chunks with the same
> values?

I'd suspect pretty random, considering that gzip of the 
generated source code compresses by only 80%.  In the case
of the systemtap example, it's approximately a byte dump of the
.debug_line section, which is relatively efficiently encoded,
ergo incompressible.
Comment 69 rguenther@suse.de 2019-02-27 20:27:54 UTC
On February 27, 2019 8:52:12 PM GMT+01:00, fche at redhat dot com <gcc-bugzilla@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12245
>
>--- Comment #68 from Frank Ch. Eigler <fche at redhat dot com> ---
>(In reply to Jakub Jelinek from comment #67)
>> Are the values completely random or are there big chunks with the
>same
>> values?
>
>I'd suspect pretty random, considering that gzip of the 
>generated source code compresses by only 80%.  In the case
>of the systemtap example, it's approximately a byte dump of the
>.debug_line section, which is relatively efficiently encoded,
>ergo incompressible.

We could add a NATIVE_ENCODE_RANGE_EXPR that encodes a contiguous range of bytes in native target representation. Of course that has to be kept throughout GIMPLE.
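
A minimal sketch of what such a byte-wise encoding could look like is below. NATIVE_ENCODE_RANGE_EXPR is only a proposal in this thread, so the functions encode_block and decode_elt are hypothetical, and the sketch uses the host's byte order where a compiler would have to use the target's.

```c
#include <stdint.h>
#include <string.h>

/* Packs N 32-bit values into BUF in the host's own byte order.  BUF
   must hold at least 4 * N bytes.  */
void encode_block(const int32_t *vals, size_t n, unsigned char *buf)
{
    memcpy(buf, vals, n * sizeof *vals);
}

/* Reads element I back out of an encoded block, for example when
   constant folding needs a single value.  */
int32_t decode_elt(const unsigned char *buf, size_t i)
{
    int32_t v;

    memcpy(&v, buf + i * sizeof v, sizeof v);
    return v;
}
```

Stored this way each element costs exactly its own size, with no per-element node or index.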
Comment 70 Frank Ch. Eigler 2019-02-27 21:04:18 UTC
> We could add a NATIVE_ENCODE_RANGE_EXPR that encodes a contiguous range of
> bytes in native target representation. Of course that has to be kept
> throughout GIMPLE.

(Just some silly spitballing here ... but if such a native target representation is
not processed again before being sent to the assembler, it could even be stored compressed.)
Comment 71 Richard Biener 2019-03-01 08:35:59 UTC
(In reply to Frank Ch. Eigler from comment #70)
> > We could add a NATIVE_ENCODE_RANGE_EXPR that encodes a contiguous range of
> > bytes in native target representation. Of course that has to be kept
> > throughout GIMPLE.
> 
> (Just a silly spitballing here ... but if such a native target
> representation is
> not processed again before being sent to the assembler, it could even be
> stored compressed.)

One step at a time - but sure.  Note that we _do_ inspect the data for
constant folding so whether to compress needs to be evaluated on a case-by-case
basis (only initializers to non-constant objects for example?)
Comment 72 Jakub Jelinek 2019-03-01 08:48:48 UTC
(In reply to Richard Biener from comment #71)
> (In reply to Frank Ch. Eigler from comment #70)
> > > We could add a NATIVE_ENCODE_RANGE_EXPR that encodes a contiguous range of
> > > bytes in native target representation. Of course that has to be kept
> > > throughout GIMPLE.
> > 
> > (Just a silly spitballing here ... but if such a native target
> > representation is
> > not processed again before being sent to the assembler, it could even be
> > stored compressed.)
> 
> One step at a time - but sure.  Note that we _do_ inspect the data for
> constant folding so whether to compress needs to be evaluated on a
> case-by-case
> basis (only initializers to non-constant objects for example?)

For anything we need to be able to access it easily, say if you have
int a[2][100000000] = { { huge NATIVE_ENCODE_RANGE_EXPR initializer here }, [0][42] = 42 };

For the non-compressed target-dependent initializer we actually have a tree already, STRING_CST, and since PR71625 we use it for char/signed char/unsigned char array initializers. However, we decide to use it and convert to it only after the initializer parsing is done, while to avoid using lots of memory we'd need to decide that already during parsing, say after parsing a couple of hundred or a few thousand elements.  And we might consider using it for other types as well and just natively encode/decode stuff from/to the STRING_CST as needed.
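
On the source side, a byte array can already be written as a string literal instead of a brace list, which may be a usable workaround for char-typed data. That a literal avoids the per-element cost during parsing is an assumption here and is not measured; the sketch below only shows that the two spellings hold the same bytes. The names as_list, as_string and same_bytes are for illustration.

```c
#include <string.h>

/* The same three bytes written two ways.  Octal escapes are used
   because they stop after three digits, unlike \x escapes.  */
static const unsigned char as_list[] = { 1, 2, 255 };
static const unsigned char as_string[] = "\001\002\377";

/* Returns 1 if both arrays hold the same data.  The string form has
   one extra byte, the terminating NUL, which is not compared.  */
int same_bytes(void)
{
    return sizeof as_string == sizeof as_list + 1
        && memcmp(as_list, as_string, sizeof as_list) == 0;
}
```

Code that uses sizeof on the array has to account for the extra terminating byte.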
Comment 73 rguenther@suse.de 2019-03-01 08:52:51 UTC
On Fri, 1 Mar 2019, jakub at gcc dot gnu.org wrote:

> Whatever representation we choose, we need to be able to access individual
> elements easily, say if you have
> int a[2][100000000] = { { huge NATIVE_ENCODE_RANGE_EXPR initializer here },
> [0][42] = 42 };
> 
> For the non-compressed target-dependent initializer we already have a tree,
> STRING_CST, and since PR71625 we use it for char/signed char/unsigned char
> array initializers.  However, we decide to use it and convert to it only
> after the initializer parsing is done; to avoid using lots of memory we'd
> need to make that decision during parsing, say after parsing a few hundred
> or a few thousand elements.  We might also consider using it for other types
> and just natively encode/decode values from/to the STRING_CST as needed.

Yes, we'd usually not end up with a single NATIVE_ENCODE_RANGE_EXPR; we'd
need to create them "block-wise" to have any savings.  IIRC part of the
reason for the bloat was that we require constructor indices to be
present even for contiguous elements, which means having INTEGER_CSTs
counting from zero up to a very large value.  IIRC I had some partial
patches that tried to delay actual constructor element creation for
contiguous elements, but somehow it didn't work out - and it would break
(not save anything) once you start using designated initializers...
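
As a rough illustration of the index overhead: with one explicit index per element, a constructor for n contiguous elements carries n index nodes, whereas a block-wise form would need only one (start, length) pair per contiguous run. The sketch below counts those runs for a sorted index list. count_index_runs is a hypothetical helper for illustration, not a GCC function, and it says nothing about how the real constructor code is organized.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many contiguous runs a sorted, duplicate-free list of element
   indices forms.  One run corresponds to one block in a block-wise
   encoding; a per-element encoding needs N index entries instead.  */
size_t count_index_runs(const size_t *idx, size_t n)
{
  assert(idx != NULL || n == 0);
  if (n == 0)
    return 0;

  size_t runs = 1;
  for (size_t i = 1; i < n; i++)
    if (idx[i] != idx[i - 1] + 1)   /* gap: a new run starts here */
      runs++;
  return runs;
}
```

For the indices 0,1,2,3 this gives one run; for 0,1,5,6,9 it gives three. An index list with gaps, as designated initializers can produce, yields more runs and therefore less saving.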
Comment 74 Richard Biener 2019-11-14 07:48:33 UTC
The GCC 7 branch is being closed, re-targeting to GCC 8.4.
Comment 75 JeanHeyd Meneide 2019-12-29 04:28:25 UTC
I would like to add to this report. I see severe memory usage and compile times that ramp up sharply when dealing with binary data. I documented much of my struggles here:

https://thephd.github.io/embed-the-details

I am being told that the functionality I am developing is more suited for a bug report and that this should be compiler QoI. Upon attempting to file this bug, I decided to throw my own data and woes into the ring here.

Is there a place I should start looking to help out with this? I would like to start getting closer to the theoretical near-perfect overhead of dealing with what essentially ends up being a large binary payload, without resorting to #embed or any special builtins.
Comment 76 Jakub Jelinek 2020-03-04 09:40:13 UTC
GCC 8.4.0 has been released, adjusting target milestone.
Comment 77 Trass3r 2020-05-26 19:42:12 UTC
This kind of code is also heavily used by Qt's resource system, so any compile-time improvements are welcome.
Comment 78 Jakub Jelinek 2021-05-14 09:45:10 UTC
GCC 8 branch is being closed.
Comment 79 Richard Biener 2021-06-01 08:03:30 UTC
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
Comment 80 Richard Biener 2022-05-27 09:33:03 UTC
GCC 9 branch is being closed.
Comment 81 Jakub Jelinek 2022-06-28 10:28:44 UTC
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
Comment 82 Carlos Galvez 2022-07-05 13:07:40 UTC
Hi,

This bug is still present in GCC 11.3.0. My use case is using large std::arrays. NOTE: the problem immediately goes away if the arrays are not initialized, but naturally we want to always initialize our variables to prevent accessing uninitialized data:

-std::array<Foo, 1000000> data{};
+std::array<Foo, 1000000> data;
Comment 83 Andrew Pinski 2023-06-05 21:25:41 UTC
(In reply to Carlos Galvez from comment #82)
> Hi,
> 
> This bug is still present in GCC 11.3.0. My use case is using large
> std::arrays. NOTE: the problem immediately goes away if the arrays are not
> initialized, but naturally we want to always initialize our variables to
> prevent accessing uninitialized data:
> 
> -std::array<Foo, 1000000> data{};
> +std::array<Foo, 1000000> data;

Note that the C++ issue in comment #82 is a different issue from this bug, and I think it was improved for GCC 13.
Comment 84 Richard Biener 2023-07-07 10:28:08 UTC
GCC 10 branch is being closed.
Comment 85 Richard Biener 2024-07-19 12:52:45 UTC
GCC 11 branch is being closed.
Comment 86 Jakub Jelinek 2024-12-18 17:54:21 UTC
The #c13 testcase should be fixed on the trunk (first gcc 14 C and C++ times, then current trunk C and C++ times):
for i in /usr/src/gcc-14/obj02/gcc/cc1{,plus} /usr/src/gcc/obj14/cc1{,plus}; do time $i -quiet -O2 foo.i ; done

real	0m2.180s
user	0m2.088s
sys	0m0.078s

real	0m2.798s
user	0m2.717s
sys	0m0.076s

real	0m0.401s
user	0m0.371s
sys	0m0.020s

real	0m0.351s
user	0m0.337s
sys	0m0.013s
Comment 87 Ian Lance Taylor 2024-12-18 18:02:03 UTC
Nice, thanks.
Comment 88 Jakub Jelinek 2024-12-18 18:41:51 UTC
In particular, this holds without #embed in the source, with the
r15-4377-gf9bac238840155e1539aa68daf1507ea63c9ed80
change for C and the
r15-6339-g40f243e91796671701ded90919d1ca32ba9076ad
change for C++.