This is GCC Bugzilla Version 2.20+
This is a simple program with large initialized static arrays (20 arrays of 200,000 elements each, plus a single array with 500,000 elements). Compiling crashes the compiler after about an hour (900 MHz system). Reproduce by typing "make" (Makefile included). Actual results are listed below. Compiling with -Wall shows no warnings.
I really hate to do this to you, but I believe that the problem is related to initializing very large arrays. Consequently, I cannot generate a small program which illustrates the problem (and it takes an hour for each experiment, which doesn't help either).
My minimal solution is a C source file of 152 lines (inconsequential), and two include files of 200,000 lines and 500,000 lines each. The include files contain initialization data for 21 arrays, and are nothing more than a list of numbers and commas. The total source and intermediate files are so BIG that I have not included them here.
I know I'm not supposed to send archives and I'm *really* not supposed to ask you to download the test cases from the net, but in this instance I think it's appropriate. You can get the complete test set (1 source, 2 includes, Makefile, and saved intermediate file) at www.OkianWarrior.com/gccBug.tar.gz
/home/kibaro/tmp: make
gcc -v -save-temps -o CSolv CSolv.c
Reading specs from /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/specs
Configured with: ./configure
Thread model: posix
gcc version 3.3.1
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/cc1 -E -quiet -v -D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=1 CSolv.c CSolv.i
ignoring nonexistent directory "NONE/include"
ignoring nonexistent directory "/usr/local/i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/include
 /usr/include
End of search list.
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/cc1 -fpreprocessed CSolv.i -quiet -dumpbase CSolv.c -auxbase CSolv -version -o CSolv.s
GNU C version 3.3.1 (i686-pc-linux-gnu)
 compiled by GNU C version 3.2 (Mandrake Linux 9.0 3.2-1mdk).
GGC heuristics: --param ggc-min-expand=47 --param ggc-min-heapsize=32119
gcc: Internal error: Killed (program cc1)
Please submit a full bug report.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
make: *** [CSolv] Error 1
Please attach the preprocessed files to this bug.
Subject: Re: Crashes when compiling large initialized arrays (gccBug: message 3 of 9)
> Please attach the preprocessed files to this bug.
The file is too big to send to Bugzilla. As mentioned in the bug posting, the file is available for http download here:
www.OkianWarrior.com/gccBug.tar.gz
I've read the guidelines for posting bugs. I know I'm not supposed to post archives, or links to archives, or multiple file examples. I know all this. I believe that this is an exception for reasons stated in the bug description, and I ask that you bear with me.
R. Barrabas
==================================================
My younger brother asked me what happens after we die. I told him we get buried under a bunch of dirt and worms eat our bodies. I guess I should have told him the truth-that most of us go to hell and burn eternally - but I didn't want to upset him.
Subject: Re: Crashes when compiling large initialized arrays (gccBug: message 3 of 9)
> ------- Additional Comments From pinskia at gcc dot gnu dot org 2003-09-11 15:53 -------
> Please attach the preprocessed files to this bug.
I would guess that the bug is caused by:
1) The compiler allocates lots of storage for intermediate results.
2) Virtual memory gets used up, and the next allocation fails.
3) The allocation is not checked, leading to eventual failure.
R. Barrabas
==================================================
Dictatorship (n): a form of government under which everything which is not prohibited is compulsory.
How much memory do you have?
Subject: Re: Crashes when compiling large initialized arrays (gccBug: message 7 of 9)
> How much memory do you have?
256MB of RAM + 256MB of swap.
==================================================
Those who live by the sword get shot by those who don't.
This must be memory-intensive, as I cannot reproduce it on a system with 1GB of memory.
It is not a GCC problem that the OS returns a non-null pointer when memory is full.
Though on the other hand GCC should not be such a hog of memory.
Note that one day the web server will be down and "we" (meaning GCC developers) will not be able to access the testcase, so we will have to ask you to attach it anyway. Can you just attach the preprocessed source?
The testcase takes about 445M on i686-pc-linux-gnu and more than 500M on powerpc-apple-darwin7.0.0. Will attach testcase.
Still is a problem on the mainline, targeting 3.5.0 for now.
The URL of the test case doesn't seem to work anymore. Does anybody still have the test case?
Created an attachment (id=7660) Simple test case
With the attached testcase we use about 300M with the C front-end but a huge amount more with the C++ front-end. Why?
I attached a simple test case. This is based on real existing code, although I changed all the values to hide potentially proprietary information.
When I compile this file without optimization, it uses some 200M, and garbage collects while compiling this file. The compilation takes 1 minute, 45 seconds. (This is much better than gcc 3.4.3, actually, which used all available memory, garbage collected twice, and wound up swapping for 10 minutes or so before completing.) When compiling with 2.95.3, the compiler uses 20M and completes in 37 seconds.
The compiler used to work fine when processing very large initializers. As it read the initializer, gcc would output the initializer to the assembler file directly. This capability was removed here: http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00933.html
The followups to that message mention this type of problem.
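The difference between emitting an initializer as it is read and holding all of it in memory can be sketched in a few lines of C. This is only an illustration of the streaming idea, not the code that gcc 2.95 used: the function name stream_initializer, the comma-separated input format, and the choice of one ".long" directive per element are all assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read comma-separated integers from `in` and write
   one ".long" directive per value to `out` as soon as it is parsed.
   Memory use stays constant no matter how many elements there are.
   Returns the number of elements written, or -1 if a comma is missing. */
long stream_initializer(FILE *in, FILE *out)
{
    long count = 0;
    long value;
    int c;

    assert(in != NULL && out != NULL);
    while (fscanf(in, " %ld", &value) == 1) {
        /* Emit this element immediately; nothing is kept for later. */
        fprintf(out, "\t.long\t%ld\n", value);
        count++;
        /* Skip whitespace, then expect a comma or end of input. */
        do {
            c = fgetc(in);
        } while (c == ' ' || c == '\t' || c == '\n' || c == '\r');
        if (c == EOF)
            break;
        if (c != ',')
            return -1;
    }
    return count;
}
```

A scheme like this cannot go back and change an element it has already written, which is why it sits badly with initializers that may override earlier elements.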
PR 14179 is for the C++ problem.
No chance this is getting done for 4.0.
Does anyone have the current numbers for this bug? I know for C, the memory usage has gone down but I don't know by how much.
source location                        Garbage           Freed             Leak              Overhead          Times
c-typeck.c:5987 (output_init_element)  0: 0.0%           23955160:100.0%   22770552:20.9%    13171408:99.1%    19
convert.c:671 (convert_to_integer)     52184768:37.8%    0: 0.0%           0: 0.0%           0: 0.0%           1630774
ggc-common.c:193 (ggc_calloc)          33547596:24.3%    0: 0.0%           33577352:30.7%    544: 0.0%         45
tree.c:828 (build_int_cst_wide)        52176864:37.8%    0: 0.0%           52177536:47.8%    0: 0.0%           3261075
There must be a better way to add on to celt in output_init_element.
Max memory usage on (checking-disabled) mainline is now 253149kB (on a machine with 1GB of RAM) for C and 403669kB for C++ (!)
One problem is that we use integer tree nodes for counting from zero to N, which is just stupid and wastes RAM (because we do not collect during building the initializer). Of course we also store that "index" in the initializer element list. This whole mess asks for a (less general) rewrite. Minimally invasive surgery is impossible.
The problem is that the gimplifier always wants the index field of the constructor element to be filled. If you fix that in the obvious way (so that "no index" means "previous index + 1"), it should be quite easy to fix, for C++. In C, I have no clue how this interacts with designated initializers though.
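To make the "no index means previous index + 1" convention concrete, here is a minimal sketch in C. The init_elt record and the expand_initializer function are hypothetical names for this illustration only; they are not GCC's constructor representation or the gimplifier's interface. The point is just that a consumer can recover every position from a running counter, so sequential elements do not need a stored index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element record: has_index == 0 means "previous index + 1". */
struct init_elt {
    int has_index;   /* nonzero if `index` was given explicitly */
    size_t index;    /* only meaningful when has_index is nonzero */
    long value;
};

/* Expand a compact element list into a zero-filled array of `len` slots.
   Returns 0 on success, -1 if an element falls outside the array. */
int expand_initializer(const struct init_elt *elts, size_t n_elts,
                       long *out, size_t len)
{
    size_t next = 0;   /* position an unindexed element will occupy */
    size_t i;

    assert(elts != NULL && out != NULL);
    for (i = 0; i < len; i++)
        out[i] = 0;
    for (i = 0; i < n_elts; i++) {
        size_t idx = elts[i].has_index ? elts[i].index : next;
        if (idx >= len)
            return -1;
        out[idx] = elts[i].value;   /* a later element overrides an earlier one */
        next = idx + 1;
    }
    return 0;
}
```

With this layout an explicit index is only needed where a designator jumps; a plain list of a million values carries no index data at all.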
It somehow works (partially), but there's a lot of fallout. Ugh. I don't like it very much. Preliminary patch: http://gcc.gnu.org/ml/gcc-patches/2005-10/msg00091.html
I don't think we can reasonably attack this for 4.1. This is something that should be done during a stage 1.
Regression bugs should have target milestones.
I would like to mention that this problem seems to have worsened a lot for the current snapshots of gcc-4.2 (currently testing with 4.2.0 20061205 (prerelease)) when compiling with at least -O1, maybe due to the static constant elimination? I tried to compile a Unicode normalization test C++ source that previously took gcc about 300MB of RAM to compile with -O1. Now with gcc 4.2 I cannot compile this source anymore on a machine with 1 GB of physical + 1 GB of virtual RAM; the kernel OOM killer kills cc1plus before the compilation finishes. If somebody would like the source of my test case, I can supply it.
Will not be fixed in 4.2.0; retargeting at 4.2.1.
That's sad - while memory gets cheaper, it has still not become cheap enough to cope with that huge increase in memory usage imposed by gcc 4.2. Seems I have to stick with 4.1 until that problem is fixed...
Looks related to PR30052.
Change target milestone to 4.2.3, as 4.2.2 has been released.
The difference between using gcc and g++ for the testcase seems to be gone on the trunk, where gcc peaks at 480MB and g++ at 530MB. For 4.1 g++ used 780MB.
This memory use regression has been present since at least 3.3. At least part of it may be an unavoidable consequence of supporting C99 overriding in designated initializers. A proper fix would likely involve major changes to the data structures for initializers (as RTH notes in comment#25, it's not suitable for a stage 3 fix). The priority seems to have been P2 from the start rather than having been set by an RM. In view of these (but especially the likely unsuitability of a fix for stage 3), downgrading to P4 (the same as the corresponding C++ bug, bug 14179).
Can you suggest any kind of work-around? Any alternative way to represent constant arrays in C/C++? The problem with leaving this bug open indefinitely is that there are existing programs (such as the Unicode test case I mentioned above) which will simply not compile on any reasonably equipped machine anymore. I wouldn't mind changing the source code to represent the constant arrays in a different way, but I have not found a method yet (other than using platform-dependent methods like generating assembler source).
The bug should certainly be fixed. But it's unfortunately a lot of work for a small payoff--most people are not in your situation. I think Joseph is correct in lowering the priority. It's pointless for us to describe this bug as release-blocking, when it clearly is not. The core problem is C99 designated initializers. Those require us to read the entire array into memory before we emit any of it. Otherwise we could generate the wrong code, and there is no way to recover. So the only plausible fix is to optimize the memory representation used for large array initializers.
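A small C99 example shows why designated initializers get in the way of emitting elements as they are parsed. The array name table and the accessor table_at are made up for this illustration. The designator at the end of the list replaces a value that appeared earlier, so a compiler that had already written out element 1 would have emitted the wrong data.

```c
#include <assert.h>

/* C99: a later initializer for the same element overrides an earlier one.
   Element 1 is first given the value 2, then replaced by 20. */
static const int table[4] = { 1, 2, 3, 4, [1] = 20 };

/* Bounds-checked accessor, only here so the example has something to call. */
int table_at(int i)
{
    assert(i >= 0 && i < 4);
    return table[i];
}
```

GCC accepts this and can warn about it with -Woverride-init; the final contents are 1, 20, 3, 4.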
4.2.3 is being released now, changing milestones of open bugs to 4.2.4.
4.2.4 is being released, changing milestones to 4.2.5.
Closing 4.1 branch.
*** Bug 39142 has been marked as a duplicate of this bug. ***
I happen to have a compiler with statistics around. We still need about 400MB, mostly integer constants:
source location                                Garbage            Freed              Leak               Overhead           Times
c-decl.c:473 (bind)                            125040: 0.0%       0: 0.0%            0: 0.0%            0: 0.0%            2605
tree.c:5905 (build_function_type)              13000: 0.0%        0: 0.0%            113400: 0.1%       5056: 0.0%         632
stringpool.c:73 (alloc_node)                   6032: 0.0%         0: 0.0%            174096: 0.1%       13856: 0.0%        1732
langhooks.c:543 (add_builtin_function_common)  0: 0.0%            0: 0.0%            442224: 0.2%       59760: 0.2%        1494
c-typeck.c:6472 (output_init_element)          0: 0.0%            47910400:100.0%    45541112:23.7%     26342936:66.6%     19
convert.c:752 (convert_to_integer)             117415728:44.6%    0: 0.0%            0: 0.0%            13046192:33.0%     1630774
ggc-common.c:187 (ggc_calloc)                  67094608:25.5%     0: 0.0%            67162736:34.9%     1088: 0.0%         58
tree.c:1004 (build_int_cst_wide)               78264768:29.8%     0: 0.0%            78266496:40.7%     0: 0.0%            3261068
Total                                          262986355          47910416           192171521          39527780           4905807
It seems that we produce an awful amount of garbage during the initializer construction. Perhaps by forcing ggc_collect there we can get down to the 200MB that we need to represent it at the end?
Honza
Subject: Re: [4.2/4.3/4.4 regression] Uses lots of memory when compiling large initialized arrays
On Sat, 21 Feb 2009, hubicka at gcc dot gnu dot org wrote:
> ------- Comment #40 from hubicka at gcc dot gnu dot org 2009-02-21 12:40 -------
> [...]
> It seems that we produce awful amount of garbage during the initializer
> construction. Perhaps by forcing ggc_collect there we can get down to 200MB
> that we need to represent it at the end?
We need the integer csts in the constructor lists. I have a patch somewhere (or is it even attached?) that tries to do index compression and not use the integer csts for counting. Didn't work out too much though.
Richard.
The actual representation of the constructor doesn't seem to be the major problem here.
We seem to build _a lot_ (117MB) of CONVERT exprs just to call fold on them and convert the integer to the proper type, so counting in INTEGER_CSTs should be just slightly less than half of the memory needed. This seems quite silly.
The patch to use HOST_WIDE_INT or similar, rather than INTEGER_CSTs, for counting should save another 70MB of garbage (and speed up compilation), so perhaps you could dig it out? :))
Following patch:
Index: convert.c
===================================================================
--- convert.c (revision 144352)
+++ convert.c (working copy)
@@ -749,6 +749,11 @@ convert_to_integer (tree type, tree expr
           break;
         }
 
+      /* When parsing long initializers, we might end up with a lot of casts.
+         Shortcut this.  */
+      if (TREE_CODE (expr) == INTEGER_CST)
+        return fold_unary (CONVERT_EXPR, type, expr);
+
       return build1 (CONVERT_EXPR, type, expr);
 
     case REAL_TYPE:
Cuts garbage production in half:
source location                        Garbage           Freed             Leak              Overhead          Times
c-typeck.c:6472 (output_init_element)  0: 0.0%           47910400:100.0%   45541112:23.7%    26342936:99.5%    19
ggc-common.c:187 (ggc_calloc)          67094608:46.1%    0: 0.0%           67162736:34.9%    1088: 0.0%        58
tree.c:1004 (build_int_cst_wide)       78264768:53.8%    0: 0.0%           78266496:40.7%    0: 0.0%           3261068
Total                                  145570627         47910416          192171521         26481588          3275033
I will give the patch testing, but I am not too hopeful it will just work. ;)
Honza
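The effect of folding a constant before building a conversion node can be shown with a toy model in C. None of this is GCC code: the node structure, the nodes_allocated counter, and the two convert_* functions are invented for the illustration, and the "type" is reduced to an unsigned bit width. The sketch only demonstrates the allocation pattern, where the build-then-fold order creates a wrapper node that is garbage as soon as the constant is folded.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Toy expression nodes; these are NOT GCC's tree structures. */
enum node_code { NODE_INT_CST, NODE_CONVERT };

struct node {
    enum node_code code;
    unsigned bits;          /* width of the node's (unsigned) integer type */
    unsigned long value;    /* used by NODE_INT_CST */
    struct node *operand;   /* used by NODE_CONVERT */
};

/* Counts every node ever allocated, as a stand-in for memory statistics. */
static unsigned long nodes_allocated;

static struct node *new_node(enum node_code code, unsigned bits)
{
    struct node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->code = code;
    n->bits = bits;
    nodes_allocated++;
    return n;
}

struct node *make_int_cst(unsigned long value, unsigned bits)
{
    struct node *n = new_node(NODE_INT_CST, bits);
    /* Keep only the low `bits` bits, as an unsigned narrowing would. */
    if (bits < sizeof(unsigned long) * CHAR_BIT)
        value &= (1UL << bits) - 1;
    n->value = value;
    return n;
}

/* Build-then-fold: the wrapper is allocated first and is garbage
   immediately when the operand turns out to be a constant. */
struct node *convert_build_then_fold(struct node *expr, unsigned bits)
{
    struct node *wrapper = new_node(NODE_CONVERT, bits);
    wrapper->operand = expr;
    if (expr->code == NODE_INT_CST)
        return make_int_cst(expr->value, bits);   /* wrapper is now unused */
    return wrapper;
}

/* Fold-first: a constant operand never gets a wrapper node at all. */
struct node *convert_fold_first(struct node *expr, unsigned bits)
{
    struct node *wrapper;

    if (expr->code == NODE_INT_CST)
        return make_int_cst(expr->value, bits);
    wrapper = new_node(NODE_CONVERT, bits);
    wrapper->operand = expr;
    return wrapper;
}
```

In this model each constant conversion costs two allocations in the first function and one in the second. Repeated over millions of initializer elements, that is the kind of saving the statistics above point to.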
Subject: Re: [4.2/4.3/4.4 regression] Uses lots of memory when compiling large initialized arrays
On Sun, 22 Feb 2009, hubicka at gcc dot gnu dot org wrote:
> [...]
> Following patch:
> Index: convert.c
> ===================================================================
> --- convert.c (revision 144352)
> +++ convert.c (working copy)
> @@ -749,6 +749,11 @@ convert_to_integer (tree type, tree expr
>            break;
>          }
>
> +      /* When parsing long initializers, we might end up with a lot of casts.
> +         Shortcut this.  */
> +      if (TREE_CODE (expr) == INTEGER_CST)
> +        return fold_unary (CONVERT_EXPR, type, expr);
fold_convert (). But maybe not valid to do here for C std reasons, who knows.
> +
>        return build1 (CONVERT_EXPR, type, expr);
And probably just generally using fold_convert () would be ok as well. Maybe they are there to make sure to build rvalues.
>      case REAL_TYPE:
> [...]
Hi, I believe that using fold_convert instead of fold_build1 means that we would bypass the folding done in fold_unary, which handles things like two conversions in a row, while fold_convert is primarily about returning a constant when the result is constant. Since I want to avoid the wrapping fold calls that all front ends except C++ consistently put around convert_to_* calls, I want to do this kind of folding. I believe the only reason to avoid folding is C++ template stuff.
Subject: Bug 12245
Author: hubicka
Date: Mon Feb 23 16:46:32 2009
New Revision: 144384
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144384
Log:
PR c/12245
* ggc.h (htab_create_ggc): Use ggc_free to free hashtable when resizing.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ggc.h
Closing 4.2 branch.
GCC 4.3.4 is being released, adjusting target milestone.