Bug#: 12245
Status: NEW
Assigned To: Not yet assigned to anyone <unassigned@gcc.gnu.org>
Reporter: Rajstennaj Barrabas <gccBug.9.OkianWarrior@SpamGourmet.com>

Attachment   Description        Type                 Created            Size
foo.i.gz     Simple test case   application/x-gzip   2004-12-02 15:31   215 bytes

Bug 12245 depends on: 14179


Description:   Last confirmed: 2008-01-05 14:07 Opened: 2003-09-11 06:56
This is a simple program with large initialized static arrays (20 arrays of
200,000 elements each, plus a single array with 500,000 elements).

Compiling crashes the compiler after about an hour (900 MHz system).
Reproduce by typing "make" (Makefile included).
Actual results are listed below.
Compiling with -Wall shows no warnings.

I really hate to do this to you, but I believe that the problem is related to
initializing very large arrays. Consequently, I cannot generate a small program
which illustrates the problem (and it takes an hour for each experiment, which
doesn't help either). My minimal solution is a C source file of 152 lines
(inconsequential), and two include files of 200,000 lines and 500,000 lines
each. The include files contain initialization data for 21 arrays, and are
nothing more than a list of numbers and commas.

The total source and intermediate files are so BIG that I have not included them
here. I know I'm not supposed to send archives and I'm *really* not supposed to
ask you to download the test cases from the net, but in this instance I think
it's appropriate.

You can get the complete test set (1 source, 2 includes, Makefile, and saved
intermediate file) at www.OkianWarrior.com/gccBug.tar.gz

/home/kibaro/tmp: make
gcc -v -save-temps -o CSolv CSolv.c
Reading specs from /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/specs
Configured with: ./configure
Thread model: posix
gcc version 3.3.1
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/cc1 -E -quiet -v -D__GNUC__=3
-D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=1 CSolv.c CSolv.i
ignoring nonexistent directory "NONE/include"
ignoring nonexistent directory "/usr/local/i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/include
 /usr/include
End of search list.
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3.1/cc1 -fpreprocessed CSolv.i
-quiet -dumpbase CSolv.c -auxbase CSolv -version -o CSolv.s
GNU C version 3.3.1 (i686-pc-linux-gnu)
        compiled by GNU C version 3.2 (Mandrake Linux 9.0 3.2-1mdk).
GGC heuristics: --param ggc-min-expand=47 --param ggc-min-heapsize=32119
gcc: Internal error: Killed (program cc1)
Please submit a full bug report.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
make: *** [CSolv] Error 1

------- Comment #1 From Andrew Pinski 2003-09-11 15:53 -------
Please attach the preprocessed files to this bug.

------- Comment #2 From barrabas@barrabas.mv.com 2003-09-11 19:26 -------
Subject: Re:  Crashes when compiling large initialized arrays (gccBug: message
3 of 9)

> Please attach the preprocessed files to this bug.

        The file is too big to send to Bugzilla. As mentioned in the bug posting,
the file is available for HTTP download here:

www.OkianWarrior.com/gccBug.tar.gz

        I've read the guidelines for posting bugs. I know I'm not supposed to post
archives, or links to archives, or multiple file examples. I know all this. I
believe that this is an exception for reasons stated in the bug description,
and I ask that you bear with me.

                                                                               
                                                R. Barrabas

==================================================

My younger brother asked me what happens after we die. I told
  him we get buried under a bunch of dirt and worms eat our bodies.
  I guess I should have told him the truth-that most of us go to hell
  and burn eternally - but I didn't want to upset him.

------- Comment #3 From barrabas@barrabas.mv.com 2003-09-11 19:28 -------
Subject: Re:  Crashes when compiling large initialized arrays (gccBug: message
3 of 9)

> ------- Additional Comments From pinskia at gcc dot gnu dot org  2003-09-11 15:53 -------
> Please attach the preprocessed files to this bug.

        I would guess that the bug is caused by:

        1) The compiler allocates lots of storage for intermediate results.
        2) Virtual memory gets used up, and the next allocation fails.
        3) The allocation is not checked, leading to eventual failure.
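
The check that step 3 assumes is missing could look roughly like the
following. This is a minimal sketch, not GCC's allocator: `checked_alloc` is
a hypothetical helper name, and on systems that overcommit memory `malloc`
may return a non-null pointer even when the memory cannot be backed, so the
check is necessary but may not be sufficient.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `count` elements of `size` bytes each and
   report failure instead of handing back an unchecked pointer.  */
void *checked_alloc(size_t count, size_t size)
{
    assert(count != 0 && size != 0);

    /* Reject requests whose total byte count would overflow size_t.  */
    if (count > SIZE_MAX / size) {
        fprintf(stderr, "allocation request too large\n");
        return NULL;
    }

    void *p = malloc(count * size);
    if (p == NULL)
        fprintf(stderr, "out of memory allocating %zu bytes\n", count * size);
    return p;
}
```

A caller would test the result before use, for example
`int *a = checked_alloc(n, sizeof *a); if (a == NULL) return -1;`.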

                                                                               
                                                R. Barrabas

==================================================

Dictatorship (n): a form of government under which everything
                  which is not prohibited is compulsory.

------- Comment #4 From Andrew Pinski 2003-09-17 04:07 -------
How much memory do you have?

------- Comment #5 From barrabas@barrabas.mv.com 2003-09-17 04:33 -------
Subject: Re:  Crashes when compiling large initialized arrays (gccBug: message
7 of 9)

> How much memory do you have?

256MB of ram + 256MB of swap.


==================================================

Those who live by the sword get shot by those who don't.

------- Comment #6 From Andrew Pinski 2003-09-17 05:21 -------
Must be memory intensive, as I cannot reproduce it on a system with 1GB of
memory.

------- Comment #7 From Andrew Pinski 2003-10-01 03:33 -------
It is not a GCC problem that the OS returns a non-null pointer when memory is
full.

------- Comment #8 From Andrew Pinski 2003-10-01 03:35 -------
Though on the other hand GCC should not be such a hog of memory.

------- Comment #9 From Andrew Pinski 2003-10-05 06:32 -------
Note that one day the web server will be down and "we" (meaning GCC developers)
will not be able to access the testcase, so we will ask you to attach it.
Can you just attach the preprocessed source?

------- Comment #10 From Andrew Pinski 2003-11-22 20:39 -------
The testcase takes about 445M on i686-pc-linux-gnu and more than 500M on
powerpc-apple-darwin7.0.0.  Will attach testcase.

------- Comment #11 From Andrew Pinski 2004-01-19 12:57 -------
Still is a problem on the mainline, targeting 3.5.0 for now.

------- Comment #12 From Falk Hueffner 2004-09-12 15:36 -------
The URL of the test case doesn't seem to work anymore. Does anybody still have
the test case?

------- Comment #13 From Ian Lance Taylor 2004-12-02 15:31 -------
Created an attachment (id=7660) [edit]
Simple test case

------- Comment #14 From Andrew Pinski 2004-12-02 15:42 -------
With the attached testcase we use about 300M with the C front-end but a huge
amount more with the C++ front-end. Why?

------- Comment #15 From Ian Lance Taylor 2004-12-02 15:53 -------
I attached a simple test case.  This is based on real existing code, although I
changed all the values to hide potentially proprietary information.  When I
compile this file without optimization, it uses some 200M, and garbage collects
while compiling this file.  The compilation takes 1 minute, 45 seconds.  (This
is much better than gcc 3.4.3, actually, which used all available memory,
garbage collected twice, and wound up swapping for 10 minutes or so before
completing).

When compiling with 2.95.3, the compiler uses 20M and completes in 37 seconds.

The compiler used to work fine when processing very large initializers.  As it
read the initializer, gcc would output the initializer to the assembler file
directly.  This capability was removed here:
    http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00933.html
The followups to that message mention this type of problem.
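
The streaming approach described above can be sketched as follows. This is
an illustration only, not the code that was removed from gcc:
`stream_initializer` is a hypothetical name, and the input is assumed to be
a plain comma-separated list of integers like the include files in the
original report. Each element is written out as a `.long` directive as soon
as it is read, so memory use stays constant regardless of array length.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical sketch: copy a comma-separated list of integers from `in`
   to `out` as one assembler `.long` directive per element.  Only the
   current element is held in memory.  Returns the number of elements
   written.  */
long stream_initializer(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);

    long count = 0;
    int value;

    while (fscanf(in, " %d", &value) == 1) {
        fprintf(out, "\t.long\t%d\n", value);
        count++;

        /* Skip whitespace up to the separator; stop at end of input or
           at anything that is not a comma.  */
        int c;
        do {
            c = getc(in);
        } while (c == ' ' || c == '\t' || c == '\n');
        if (c != ',')
            break;
    }
    return count;
}
```

Note that this only works when elements arrive in final order, which is
exactly what C99 designated initializers do not guarantee.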

------- Comment #16 From Andrew Pinski 2004-12-02 16:10 -------
PR 14179 is for the C++ problem.

------- Comment #17 From Richard Henderson 2005-01-06 01:22 -------
No chance this is getting done for 4.0.

------- Comment #18 From Andrew Pinski 2005-07-22 18:36 -------
Does anyone have the current numbers for this bug?
I know for C, the memory usage has gone down but I don't know by how much.

------- Comment #19 From Andrew Pinski 2005-07-25 01:27 -------
source location                                 Garbage              Freed               Leak           Overhead      Times
c-typeck.c:5987 (output_init_element)                0: 0.0%    23955160:100.0%    22770552:20.9%    13171408:99.1%         19
convert.c:671 (convert_to_integer)            52184768:37.8%           0: 0.0%            0: 0.0%           0: 0.0%    1630774
ggc-common.c:193 (ggc_calloc)                 33547596:24.3%           0: 0.0%     33577352:30.7%         544: 0.0%         45
tree.c:828 (build_int_cst_wide)               52176864:37.8%           0: 0.0%     52177536:47.8%           0: 0.0%    3261075

------- Comment #20 From Andrew Pinski 2005-07-25 01:30 -------
There must be a better way to add on to celt in output_init_element.

------- Comment #21 From Richard Guenther 2005-09-12 08:55 -------
Max memory usage on (checking-disabled) mainline is now 253149kB (on a machine
with 1GB of RAM) for C and 403669kB for C++ (!)

------- Comment #22 From Richard Guenther 2005-09-12 10:03 -------
One problem is that we use integer tree nodes for counting from zero to N,
which is just stupid and wastes RAM (because we do not collect during building
the initializer).  Of course we also store that "index" in the initializer
element list.

This whole mess asks for a (less general) rewrite.  Minimal-invasive surgery
is impossible.

------- Comment #23 From Giovanni Bajo 2005-09-12 10:08 -------
The problem is that the gimplifier always wants the index field of the
constructor element to be filled. If you fix that in the obvious way (so
that "no index" means "previous index + 1"), it should be quite easy to fix
for C++. In C, I have no clue how this interacts with designated initializers
though.
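
The "no index means previous index + 1" convention can be sketched with a
toy element list. Everything here is an assumption made for illustration:
`struct init_elt`, `NO_INDEX` and `resolve_indices` are hypothetical names
and do not reflect GCC's constructor representation. The sketch also shows
one plausible way an explicit C designator could interact with the rule: it
resets the running position.

```c
#include <assert.h>
#include <stddef.h>

#define NO_INDEX (-1L)

/* Hypothetical constructor element: `index` is NO_INDEX unless the source
   used an explicit designator such as [5] = ...  */
struct init_elt {
    long index;
    int value;
};

/* Store each element's value at its effective position in `out`.
   An element without an index goes right after the previous element.
   Later elements override earlier ones at the same position.
   Returns the number of slots covered (highest position + 1), or -1 if
   an element falls outside `out`.  */
long resolve_indices(const struct init_elt *elts, size_t n,
                     int *out, long out_len)
{
    assert(elts != NULL && out != NULL);

    long next = 0;      /* position the next unindexed element will take */
    long covered = 0;

    for (size_t i = 0; i < n; i++) {
        long pos = (elts[i].index == NO_INDEX) ? next : elts[i].index;
        if (pos < 0 || pos >= out_len)
            return -1;
        out[pos] = elts[i].value;
        next = pos + 1;
        if (next > covered)
            covered = next;
    }
    return covered;
}
```

With this layout, a long run of plain elements needs no per-element index
object at all; only designated elements carry one.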

------- Comment #24 From Richard Guenther 2005-10-03 17:54 -------
It somehow works (partially), but there's a lot of fallout.  Ugh.  I don't like
it very much.  Preliminary patch:

http://gcc.gnu.org/ml/gcc-patches/2005-10/msg00091.html

------- Comment #25 From Richard Henderson 2005-10-11 19:24 -------
I don't think we can reasonably attack this for 4.1.  This is something
that should be done during a stage 1.

------- Comment #26 From Ian Lance Taylor 2005-10-11 19:30 -------
Regression bugs should have target milestones.

------- Comment #27 From niemayer@isg.de 2006-12-13 11:37 -------
I would like to mention that this problem seems to have worsened a lot for the
current snapshots of gcc-4.2 (currently testing with 4.2.0 20061205
(prerelease)) when compiling with at least -O1, maybe due to the static
constant elimination?

I tried to compile a Unicode normalization test C++ source that previously
needed about 300MB of RAM to compile with -O1. Now, with gcc 4.2, I cannot
compile this source anymore on a machine with 1 GB of physical + 1 GB of
virtual RAM before the kernel OOM killer kills cc1plus.

If somebody would like the source of my test-case, I can supply it.

------- Comment #28 From Mark Mitchell 2007-05-14 22:25 -------
Will not be fixed in 4.2.0; retargeting at 4.2.1.

------- Comment #29 From niemayer@isg.de 2007-05-15 16:54 -------
That's sad - while memory gets cheaper, it has still not become cheap enough to
cope with that huge increase in memory usage imposed by gcc 4.2. Seems I have
to stick with 4.1 until that problem is fixed...

------- Comment #30 From Pawel Sikora 2007-05-15 17:04 -------
Looks related to PR30052.

------- Comment #31 From Mark Mitchell 2007-10-09 19:20 -------
Change target milestone to 4.2.3, as 4.2.2 has been released.

------- Comment #32 From Richard Guenther 2008-01-05 14:07 -------
The difference between using gcc and g++ for the testcase seems to be gone on
the trunk, where gcc peaks at 480MB and g++ at 530MB.  For 4.1 g++ used 780MB.

------- Comment #33 From Joseph S. Myers 2008-01-17 15:42 -------
This memory use regression has been present since at least 3.3; at least part
of it may be an unavoidable consequence of supporting C99 overriding in
designated initializers; a proper fix would likely involve major changes to the
data structures for initializers (as RTH notes in comment#25, it's not suitable
for a stage 3 fix); the priority seems to have been P2 from the start rather
than having been set by an RM.  In view of these (but especially the likely
unsuitability of a fix for stage 3), downgrading to P4 (the same as the
corresponding C++ bug, bug 14179).

------- Comment #34 From niemayer@isg.de 2008-01-17 17:02 -------
Can you suggest any kind of work-around? Any alternative to represent constant
arrays in C/C++?

The problem with leaving this bug open indefinitely is that there are existing
programs (such as the Unicode test case I mentioned above) which will simply
not compile on any reasonably equipped machine anymore.

I wouldn't mind changing the source code to represent the constant arrays in a
different way, but I have not found a method yet (other than using platform
dependent methods like generating assembler source).

------- Comment #35 From Ian Lance Taylor 2008-01-18 06:37 -------
The bug should certainly be fixed.  But it's unfortunately a lot of work for a
small payoff--most people are not in your situation.  I think Joseph is correct
in lowering the priority.  It's pointless for us to describe this bug as
release-blocking, when it clearly is not.

The core problem is C99 designated initializers.  Those require us to read the
entire array into memory before we emit any of it.  Otherwise we could generate
the wrong code, and there is no way to recover.

So the only plausible fix is to optimize the memory representation used for
large array initializers.
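
A small example of why designated initializers rule out emitting elements as
they are read. This is an illustration only; `table` and `table_at` are
names made up for the example. In C99, a later initializer for the same
element overrides an earlier one.

```c
#include <assert.h>
#include <stddef.h>

/* The designator [1] appears after the first four elements and replaces the
   value already given for element 1.  A compiler that had already written
   "1, 2, 3, 4" to the assembler file could not take the 2 back.  */
static const int table[4] = { 1, 2, 3, 4, [1] = 9 };

/* Return element `i` of the table above.  */
int table_at(size_t i)
{
    assert(i < sizeof table / sizeof table[0]);
    return table[i];
}
```

The resulting array is { 1, 9, 3, 4 }. GCC can warn about the override with
-Woverride-init (enabled by -Wextra), but the code is valid C99.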

------- Comment #36 From Joseph S. Myers 2008-02-01 16:52 -------
4.2.3 is being released now, changing milestones of open bugs to 4.2.4.

------- Comment #37 From Joseph S. Myers 2008-05-19 20:22 -------
4.2.4 is being released, changing milestones to 4.2.5.

------- Comment #38 From Joseph S. Myers 2008-07-04 22:44 -------
Closing 4.1 branch.

------- Comment #39 From Richard Guenther 2009-02-10 10:12 -------
*** Bug 39142 has been marked as a duplicate of this bug. ***

------- Comment #40 From Jan Hubicka 2009-02-21 12:40 -------
I happen to have compiler with statistics around:
We still need about 400MB, mostly integer constants:
source location                                     Garbage              Freed               Leak           Overhead      Times
c-decl.c:473 (bind)                                  125040: 0.0%           0: 0.0%            0: 0.0%           0: 0.0%       2605
tree.c:5905 (build_function_type)                     13000: 0.0%           0: 0.0%       113400: 0.1%        5056: 0.0%        632
stringpool.c:73 (alloc_node)                           6032: 0.0%           0: 0.0%       174096: 0.1%       13856: 0.0%       1732
langhooks.c:543 (add_builtin_function_common)             0: 0.0%           0: 0.0%       442224: 0.2%       59760: 0.2%       1494
c-typeck.c:6472 (output_init_element)                     0: 0.0%    47910400:100.0%    45541112:23.7%    26342936:66.6%         19
convert.c:752 (convert_to_integer)                117415728:44.6%           0: 0.0%            0: 0.0%    13046192:33.0%    1630774
ggc-common.c:187 (ggc_calloc)                      67094608:25.5%           0: 0.0%     67162736:34.9%        1088: 0.0%         58
tree.c:1004 (build_int_cst_wide)                   78264768:29.8%           0: 0.0%     78266496:40.7%           0: 0.0%    3261068
Total                                             262986355           47910416          192171521           39527780        4905807


It seems that we produce an awful amount of garbage during the initializer
construction.  Perhaps by forcing ggc_collect there we can get down to the
200MB that we need to represent it at the end?

Honza

------- Comment #41 From rguenther@suse.de 2009-02-21 12:50 -------
Subject: Re:  [4.2/4.3/4.4 regression] Uses lots of memory when
 compiling large initialized arrays

On Sat, 21 Feb 2009, hubicka at gcc dot gnu dot org wrote:

> ------- Comment #40 from hubicka at gcc dot gnu dot org  2009-02-21 12:40 -------
> I happen to have compiler with statistics around:
> We still need about 400MB, mostly integer constants:
> 
> It seems that we produce an awful amount of garbage during the initializer
> construction.  Perhaps by forcing ggc_collect there we can get down to the
> 200MB that we need to represent it at the end?

We need the integer csts in the constructor lists.  I have a patch
somewhere (or is it even attached?) that tries to do index compression
and not use the integer csts for counting.  Didn't work out too much
though.

Richard.

------- Comment #42 From Jan Hubicka 2009-02-22 11:21 -------
The actual representation of the constructor doesn't seem to be the major
problem here.

We seem to build _a lot_ (117MB) of CONVERT exprs just to call fold on them and
convert the integer to the proper type, so counting in INTEGER_CSTs should be
just slightly less than half of the memory needed.  This seems quite silly.

The patch to use HOST_WIDE_INT or similar, rather than INTEGER_CSTs, for
counting should save another 70MB of garbage (and speed up compilation), so
perhaps you could dig it out? :))

Following patch:
Index: convert.c
===================================================================
--- convert.c   (revision 144352)
+++ convert.c   (working copy)
@@ -749,6 +749,11 @@ convert_to_integer (tree type, tree expr
          break;
        }

+      /* When parsing long initializers, we might end up with a lot of casts.
+         Shortcut this.  */
+      if (TREE_CODE (expr) == INTEGER_CST)
+       return fold_unary (CONVERT_EXPR, type, expr);
+
       return build1 (CONVERT_EXPR, type, expr);

     case REAL_TYPE:

Cuts garbage production in half:
source location                                     Garbage              Freed               Leak           Overhead      Times
c-typeck.c:6472 (output_init_element)                     0: 0.0%    47910400:100.0%    45541112:23.7%    26342936:99.5%         19
ggc-common.c:187 (ggc_calloc)                      67094608:46.1%           0: 0.0%     67162736:34.9%        1088: 0.0%         58
tree.c:1004 (build_int_cst_wide)                   78264768:53.8%           0: 0.0%     78266496:40.7%           0: 0.0%    3261068
Total                                             145570627           47910416          192171521           26481588        3275033

I will test the patch, but I am not too hopeful it will just work. ;)
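
The idea behind the shortcut can be shown with a toy model. This is a sketch
under stated assumptions, not GCC code: `struct node`, `make_const`,
`convert_to_width` and the `nodes_allocated` counter are all hypothetical
stand-ins for tree nodes, INTEGER_CST construction, convert_to_integer and
the memory statistics. A constant operand is folded straight into a new
constant, so no wrapper node is allocated only to become garbage after a
later fold.

```c
#include <assert.h>
#include <stdlib.h>

enum node_kind { NODE_CONST, NODE_CONVERT };

/* Toy expression node, standing in for a compiler's tree node.  */
struct node {
    enum node_kind kind;
    int bits;               /* width of the node's integer type */
    unsigned long value;    /* meaningful for NODE_CONST */
    struct node *operand;   /* meaningful for NODE_CONVERT */
};

long nodes_allocated;       /* crude stand-in for a memory report */

struct node *new_node(enum node_kind kind, int bits)
{
    struct node *n = calloc(1, sizeof *n);
    if (n == NULL)
        abort();
    n->kind = kind;
    n->bits = bits;
    nodes_allocated++;
    return n;
}

struct node *make_const(int bits, unsigned long value)
{
    struct node *n = new_node(NODE_CONST, bits);
    n->value = value;
    return n;
}

/* Convert `expr` to an unsigned type `bits` wide.  A constant operand is
   folded at once into a new constant; anything else gets a conversion
   node wrapped around it.  */
struct node *convert_to_width(struct node *expr, int bits)
{
    assert(expr != NULL && bits > 0 && bits < 32);

    if (expr->kind == NODE_CONST)
        return make_const(bits, expr->value & ((1UL << bits) - 1UL));

    struct node *n = new_node(NODE_CONVERT, bits);
    n->operand = expr;
    return n;
}
```

In this model, converting a constant costs one allocation instead of two
(wrapper plus folded result), which mirrors the halving of garbage reported
above.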

Honza

------- Comment #43 From rguenther@suse.de 2009-02-22 19:03 -------
Subject: Re:  [4.2/4.3/4.4 regression] Uses lots of memory when
 compiling large initialized arrays

On Sun, 22 Feb 2009, hubicka at gcc dot gnu dot org wrote:

> Actual representation of constructor don't seem to be major problem here.
> 
> We seem to build _a lot_ (117MB) of CONVERT exprs just to call fold on it and
> convert integer to proper type, so counting in INTEGER_CSTs should be just
> slightly less than half of memory needed.  This seems quite silly.
> 
> The patch to not use HOST_WIDE_INT or similar for counting should save another
> 70MB of garbage (and speed up compilation), so perhaps you could dig it out?
> :))
> 
> Following patch:
> Index: convert.c
> ===================================================================
> --- convert.c   (revision 144352)
> +++ convert.c   (working copy)
> @@ -749,6 +749,11 @@ convert_to_integer (tree type, tree expr
>           break;
>         }
> 
> +      /* When parsing long initializers, we might end up with a lot of casts.
> +         Shortcut this.  */
> +      if (TREE_CODE (expr) == INTEGER_CST)
> +       return fold_unary (CONVERT_EXPR, type, expr);

fold_convert ().  But maybe not valid to do here for C std reasons, who 
knows.

> +
>        return build1 (CONVERT_EXPR, type, expr);

And probably just generally using fold_convert () would be ok as well.
Maybe they are there to make sure to build rvalues.

>      case REAL_TYPE:
> 

------- Comment #44 From Jan Hubicka 2009-02-23 13:41 -------
Hi,
I believe that using fold_convert instead of fold_build1 means that we would
bypass folding done in fold_unary that handles stuff like two conversions in a
row while fold_convert is primarily about returning constant when result is
constant.

Since I want to avoid the wrapping fold calls that all front ends except C++
consistently put around convert_to_* calls, I want to do this kind of folding.

I believe the only reason to avoid folding is C++ template stuff.

------- Comment #45 From Jan Hubicka 2009-02-23 16:46 -------
Subject: Bug 12245

Author: hubicka
Date: Mon Feb 23 16:46:32 2009
New Revision: 144384

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144384
Log:
        PR c/12245
        * ggc.h (htab_create_ggc): Use ggc_free to free hashtable when resizing.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ggc.h

------- Comment #46 From Joseph S. Myers 2009-03-31 16:13 -------
Closing 4.2 branch.

------- Comment #47 From Richard Guenther 2009-08-04 12:25 -------
GCC 4.3.4 is being released, adjusting target milestone.
