Summary: | [4.2 Regression] memory hog in solve_graph | ||
---|---|---|---|
Product: | gcc | Reporter: | Pascal "Pixel" Rigaux <pixel> |
Component: | tree-optimization | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | dberlin, fang, gcc-bugs, rguenth |
Priority: | P2 | Keywords: | alias, memory-hog |
Version: | 4.2.1 | ||
Target Milestone: | 4.3.0 | ||
Host: | i586-mandriva-linux-gnu | Target: | i586-mandriva-linux-gnu |
Build: | i586-mandriva-linux-gnu | Known to work: | 4.1.2 4.3.0 |
Known to fail: | 4.2.1 4.2.3 4.2.4 | Last reconfirmed: | 2007-07-16 09:28:11 |
Attachments: | memory hog test case |
Description
Pascal "Pixel" Rigaux
2007-07-10 18:11:39 UTC
Created attachment 13882
memory hog test case
What exact version of 4.2.1 are you using?

tested with rc1 and svn

i forgot to say it doesn't occur without -O, and occurs with -O, -O2

/usr/lib/gcc/i586-mandriva-linux-gnu/4.2.1/cc1 -O fail.c
 _create
Analyzing compilation unitPerforming interprocedural optimizations
Assembling functions:
 _create
Execution times (seconds)
 callgraph construction:  0.14 ( 1%) usr   0.01 ( 0%) sys   0.14 ( 0%) wall       0 kB ( 0%) ggc
 callgraph optimization:  0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 ipa reference         :  0.07 ( 0%) usr   0.02 ( 0%) sys   0.10 ( 0%) wall     428 kB ( 1%) ggc
 preprocessing         :  0.27 ( 1%) usr   0.23 ( 2%) sys   0.68 ( 2%) wall    8293 kB (25%) ggc
 lexical analysis      :  0.13 ( 1%) usr   0.60 ( 4%) sys   0.84 ( 3%) wall       0 kB ( 0%) ggc
 parser                :  0.64 ( 3%) usr   0.43 ( 3%) sys   0.84 ( 3%) wall   18586 kB (57%) ggc
 tree find ref. vars   :  0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    1905 kB ( 6%) ggc
 tree PTA              : 15.92 (86%) usr  10.73 (72%) sys  26.67 (80%) wall       5 kB ( 0%) ggc
 tree alias analysis   :  0.91 ( 5%) usr   2.77 (19%) sys   3.65 (11%) wall       0 kB ( 0%) ggc
 tree SSA incremental  :  0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 tree SRA              :  0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 expand                :  0.10 ( 1%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall       7 kB ( 0%) ggc
 varconst              :  0.05 ( 0%) usr   0.02 ( 0%) sys   0.01 ( 0%) wall     643 kB ( 2%) ggc
 global alloc          :  0.00 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 symout                :  0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 : 18.57            14.84            33.41              32596 kB

ps: the verbose output is a little garbled, this trivial patch on branches/gcc-4_2-branch fixes it:

--- gcc/cgraphunit.c (revision 126511)
+++ gcc/cgraphunit.c (working copy)
@@ -1544,7 +1544,7 @@
   timevar_push (TV_CGRAPHOPT);
   if (!quiet_flag)
-    fprintf (stderr, "Performing interprocedural optimizations\n");
+    fprintf (stderr, "\nPerforming interprocedural optimizations\n");
   cgraph_function_and_variable_visibility ();
   if (cgraph_dump_file)

/tmp> ~/bin/maxmem2.sh gcc-4.1 -S -O2 -o /dev/null fail.c
total: 96228 kB
/tmp> ~/bin/maxmem2.sh gcc-4.2 -S -O2 -o /dev/null fail.c
total: 1579668 kB

trunk:
/tmp> ~/bin/maxmem2.sh /space/rguenther/tramp3d/install/bin/gcc -S -O2 -o /dev/null fail.c
total: 109731 kB

meh. Danny, do you remember which change on the trunk could have improved this? Maybe http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=122741?

Backporting the shared bitmap changes brings us back to 76MB max. memory usage for 4.2. I'll bootstrap & test this.

Didn't you commit the shared bitmap fix?

Subject: Bug 32723
Author: rguenth
Date: Tue Jul 24 07:30:47 2007
New Revision: 126867
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=126867
Log:
2007-07-24  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/32723
    Backport from mainline:
    2007-03-09  Daniel Berlin  <dberlin@dberlin.org>

    * tree-ssa-structalias.c (shared_bitmap_info_t): New structure.
    (shared_bitmap_table): New variable.
    (shared_bitmap_hash): New function.
    (shared_bitmap_eq): Ditto
    (shared_bitmap_lookup): Ditto.
    (shared_bitmap_add): Ditto.
    (find_what_p_points_to): Rewrite to use shared bitmap hashtable.
    (init_alias_vars): Init shared bitmap hashtable.
    (delete_points_to_sets): Delete shared bitmap hashtable.

Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/tree-ssa-structalias.c

Fixed.

are you sure it fixes it? it still takes 1G here...

Uh, it doesn't take 1 gig on either 4.2 or 4.3

i do know it works nicely with gcc 4.3 but i still get the "memory hog" behaviour using branches/gcc-4_2-branch, ie:

% /usr/lib/gcc/i586-mandriva-linux-gnu/4.2.1/cc1 -O2 fail.c

runs with memory RSS raising up to 1G many times.
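The ChangeLog entry above describes keeping one shared copy of each distinct points-to bitmap, found through a hash table, instead of one copy per pointer. A minimal sketch of that idea follows. It is written in Python rather than GCC's C, and `SharedSetTable` and `intern` are hypothetical names, not the functions added by the patch.

```python
class SharedSetTable:
    """Hash-conses points-to sets so that equal sets share one object."""

    def __init__(self):
        # Maps each distinct set to its single canonical instance.
        self._table = {}

    def intern(self, members):
        key = frozenset(members)
        # setdefault returns the stored instance if an equal set exists,
        # otherwise it stores and returns the new one.
        return self._table.setdefault(key, key)

    def __len__(self):
        return len(self._table)


table = SharedSetTable()
a = table.intern(["ANYTHING", "c_20089"])
b = table.intern(["c_20089", "ANYTHING"])   # same members, different order
c = table.intern(["ANYTHING"])
```

Here `a` and `b` end up as the same object, so a thousand pointers with identical points-to sets would cost one set plus a thousand references.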
i've also tried with gcc-4.2-4.2.1-4 from debian (which has a SVN snapshot from 20070812):

% ulimit -v 900000
% /usr/lib/gcc/i486-linux-gnu/4.2.1/cc1 fail.c -O2
 _create
Analyzing compilation unitPerforming interprocedural optimizations
Assembling functions:
 _create
cc1: out of memory allocating 4064 bytes after a total of 877277184 bytes

Do we have any way to work out whether this is still a problem? Richard seems to think the bug has been fixed, but Pascal is still seeing the problem, apparently.

Change target milestone to 4.2.3, as 4.2.2 has been released.

The memory is temporarily needed now by solve_graph(), because the graph has 48902 nodes. On the mainline we have only 3 constraints while for 4.2 we have thousands:

ANYTHING = &ANYTHING
READONLY = &ANYTHING
INTEGER = &ANYTHING
ESCAPED_VARS = *ESCAPED_VARS
NONLOCAL.6 = ESCAPED_VARS
ESCAPED_VARS = &NONLOCAL.6
ESCAPED_VARS = &NONLOCAL.6
infos = ESCAPED_VARS
c_20089 = ESCAPED_VARS
ESCAPED_VARS = &c_20089
c_20089 = &ANYTHING
c_20089 = &ANYTHING
ESCAPED_VARS = &c_20089.val
c_20089.val = ESCAPED_VARS
infos = &c_20089
infos = &c_20089.val
c_200A2 = ESCAPED_VARS
ESCAPED_VARS = &c_200A2
...

the mainline looks like:

ANYTHING = { ANYTHING }
READONLY = { ANYTHING }
INTEGER = { ANYTHING }
D.28988 = same as infos
D.28988.c = same as infos
D.28988.b = same as infos
infos = { ANYTHING }

The shared bitmap stuff was not dominant for this testcase. Still I doubt we can backport all of the solver changes. Also quite possibly 4.3 benefits from early optimizations simplifying the problem to solve.

Subject: Re: [4.2 Regression] memory hog in solve_graph

On 31 Oct 2007 13:07:57 -0000, rguenth at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote:
>
> ------- Comment #15 from rguenth at gcc dot gnu dot org 2007-10-31 13:07 -------
> The memory is temporarily needed now by solve_graph(), because the graph has
> 48902 nodes.

48902 nodes is not a lot for the solver, to be honest.
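For readers less familiar with the notation in comment #15: each line is an inclusion constraint over points-to sets (`p = &a`, `p = q`, `p = *q`), and a solver propagates sets along them until nothing changes. The following is a deliberately naive sketch of such a solver, not GCC's solve_graph(). The function name `solve` and the tuple encoding are assumptions made for illustration, and ANYTHING is treated as an ordinary variable here, which GCC does not do.

```python
from collections import defaultdict


def solve(constraints):
    """Naive inclusion-based points-to solver (fixed-point iteration).

    Each constraint is a (kind, lhs, rhs) tuple:
      ("addr",  p, a)  for  p = &a
      ("copy",  p, q)  for  p = q
      ("load",  p, q)  for  p = *q
      ("store", p, q)  for  *p = q
    """
    pts = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in constraints:
            if kind == "addr":
                targets, new = [lhs], {rhs}
            elif kind == "copy":
                targets, new = [lhs], pts[rhs]
            elif kind == "load":
                # Everything pointed to by anything rhs points to.
                targets = [lhs]
                new = set().union(*(pts[v] for v in list(pts[rhs])))
            else:  # "store": every location lhs may point to receives pts[rhs]
                targets, new = list(pts[lhs]), pts[rhs]
            for t in targets:
                if not new <= pts[t]:
                    pts[t] |= new
                    changed = True
    return dict(pts)


# A handful of the 4.2 constraints quoted above.
result = solve([
    ("addr", "ANYTHING", "ANYTHING"),
    ("load", "ESCAPED_VARS", "ESCAPED_VARS"),
    ("addr", "ESCAPED_VARS", "c_20089"),
    ("copy", "c_20089", "ESCAPED_VARS"),
    ("addr", "c_20089", "ANYTHING"),
    ("addr", "infos", "c_20089"),
])
```

With thousands of variables that each both flow into and out of ESCAPED_VARS, every one of them ends up carrying a copy of essentially the same large set, which is the pattern the 4.2 constraint dump shows.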
> On the mainline we have only 3 constraints while for 4.2 we have
> thousands:
>
> ANYTHING = &ANYTHING
> READONLY = &ANYTHING
> INTEGER = &ANYTHING
> ESCAPED_VARS = *ESCAPED_VARS
> NONLOCAL.6 = ESCAPED_VARS
> ESCAPED_VARS = &NONLOCAL.6
> ESCAPED_VARS = &NONLOCAL.6
> infos = ESCAPED_VARS
> c_20089 = ESCAPED_VARS
> ESCAPED_VARS = &c_20089
> c_20089 = &ANYTHING
> c_20089 = &ANYTHING
> ESCAPED_VARS = &c_20089.val
> c_20089.val = ESCAPED_VARS
> infos = &c_20089
> infos = &c_20089.val
> c_200A2 = ESCAPED_VARS
> ESCAPED_VARS = &c_200A2
> ...
>
> the mainline looks like:
>
> ANYTHING = { ANYTHING }
> READONLY = { ANYTHING }
> INTEGER = { ANYTHING }
> D.28988 = same as infos
> D.28988.c = same as infos
> D.28988.b = same as infos
> infos = { ANYTHING }

This is because we compute call clobbering differently for mainline now.

The thing you'd want to add to 4.2 would be location equivalence optimization, which i never finished for either 4.2 or 4.3 (4.3 has code to compute it, but we don't substitute the variables).

Location equivalence would turn the escaped_vars set into 1 variable during propagation, and then expand it back out at the end.

4.2.3 is being released now, changing milestones of open bugs to 4.2.4.

This will not be fixed on the 4.2 branch. Closing as fixed in 4.3.0.
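The "collapse during propagation, expand at the end" idea behind location equivalence can be illustrated on a small scale. The sketch below groups locations that occur in exactly the same points-to sets, substitutes one representative per group, and expands afterwards. It works on already-known sets purely for illustration, whereas a real implementation would determine the classes before solving. All function names are hypothetical and none of this is GCC code.

```python
from collections import defaultdict


def location_classes(pts):
    """Group locations that appear in exactly the same points-to sets."""
    holders = defaultdict(set)
    for pointer, targets in pts.items():
        for loc in targets:
            holders[loc].add(pointer)
    classes = defaultdict(list)
    for loc, ptrs in holders.items():
        classes[frozenset(ptrs)].append(loc)
    return [sorted(members) for members in classes.values()]


def collapse(pts, classes):
    """Replace every location by the representative of its class."""
    rep = {loc: members[0] for members in classes for loc in members}
    return {p: {rep[loc] for loc in targets} for p, targets in pts.items()}


def expand(collapsed, classes):
    """Undo collapse(): replace each representative by its whole class."""
    members_of = {members[0]: members for members in classes}
    return {p: {loc for r in targets for loc in members_of[r]}
            for p, targets in collapsed.items()}


pts = {
    "ESCAPED_VARS": {"c_20089", "c_20089.val", "c_200A2"},
    "infos": {"c_20089", "c_20089.val"},
}
classes = location_classes(pts)
small = collapse(pts, classes)
```

In this toy input `c_20089` and `c_20089.val` always travel together, so the collapsed sets carry one member for the pair, and `expand(small, classes)` restores the original sets.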