Bug 30089 - [4.3 Regression] Compiling FreeFem3d uses unreasonable amount of time and memory
Summary: [4.3 Regression] Compiling FreeFem3d uses unreasonable amount of time and memory
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.3.0
: P3 normal
Target Milestone: 4.3.0
Assignee: Daniel Berlin
URL:
Keywords: compile-time-hog, memory-hog
Depends on:
Blocks:
 
Reported: 2006-12-06 17:15 UTC by Richard Biener
Modified: 2007-03-31 14:43 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2006-12-13 17:32:28


Attachments
first testcase (49.64 KB, application/octet-stream)
2006-12-06 17:17 UTC, Richard Biener
Details
second testcase (56.00 KB, application/octet-stream)
2006-12-06 17:18 UTC, Richard Biener
Details
operandsbitmaps.diff (7.34 KB, text/x-diff)
2007-01-09 21:42 UTC, Daniel Berlin
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Richard Biener 2006-12-06 17:15:09 UTC
Two testcases will follow, build with -O2.
Comment 1 Richard Biener 2006-12-06 17:17:31 UTC
Created attachment 12758 [details]
first testcase
Comment 2 Richard Biener 2006-12-06 17:18:50 UTC
Created attachment 12759 [details]
second testcase
Comment 3 Richard Biener 2006-12-06 17:20:01 UTC
Before the last big regressions on the mainline the first one took 350MB and 52s to build with -O2 on x86_64, the second one 685MB and 147s.  That was g++ (GCC) 4.3.0 20061122 (experimental).
Comment 4 Daniel Berlin 2006-12-07 04:48:34 UTC
On my machine, with an unoptimized cc1plus (IE stage1), the first one, at -O2 takes 150meg of memory total, and 221 seconds, with most of the time being verifiers.
This is with local PTA changes to speed up PTA


 TOTAL                 : 221.58            17.76           271.63             

note:
 tree SSA verifier     :  51.90 (23%) usr   0.89 ( 5%) sys  56.19 (21%) wall      74 kB ( 0%) ggc
 tree STMT verifier    :  40.15 (18%) usr   0.88 ( 5%) sys  41.12 (15%) wall 

etc

The *other* case is spending all it's alias time checking for dups in add_may_alias, which diego's patch should fix. There is also a ton of time elsewhere:
tree alias analysis   :  71.28 (21%) usr   1.66 ( 9%) sys 114.75 (26%) wall   18776 kB ( 5%) ggc
 tree SSA verifier     :  33.40 (10%) usr   0.43 ( 2%) sys  34.55 ( 8%) wall     259 kB ( 0%) ggc
 tree STMT verifier    :  30.43 ( 9%) usr   0.64 ( 3%) sys  31.04 ( 7%) wall       0 kB ( 0%) ggc
PRE                   :  64.49 (19%) usr   0.64 ( 3%) sys  75.84 (17%) wall    1086 kB ( 0%) ggc
 scheduling 2          :  46.21 (14%) usr   0.35 ( 2%) sys  62.71 (14%) wall    2328 kB ( 1%) ggc
TOTAL                 : 339.09            19.05           444.66      

(it was in a debugger for about 100s to get some idea of what was going on).


But on my machine, it still only uses 350 meg of memory

On x86_64, i expect about double memory usage.

I also expect if i tested bootstrapped optimized compilers, i'd get the same times you are expecting, excluding checking time

This leaves a few possibilities:
1 My local PTA improvements are helping this
2 Something is very different on x86_64
3 My preprocessing using Apple G++ 4.0.1 is giving very different code to play with than mainline does
4 The regression is already fixed :)


I can either send you the patch to test for 1, or you can wait a few days for me to commit it
Comment 5 Richard Biener 2006-12-07 12:40:49 UTC
Numbers with mainline r119612 are

FiniteElementMethod.cpp: 346MB, 46.86s
parse.ff.cc: 1GB, 1236.53s

so actually the latter is the biggest offender here ;)  The compiler was built
with release checking (but not bootstrapped, built with gcc 4.1.2).  Flags
for building parse.ff.cc are -O2 -fno-strict-aliasing (in case this makes
a difference).
Comment 6 Richard Biener 2006-12-07 14:07:15 UTC
 tree alias analysis   :1125.41 (91%) usr   1.57 (31%) sys1127.32 (91%) wall  199468 kB (19%) ggc
 PRE                   :  61.16 ( 5%) usr   0.83 (16%) sys  62.01 ( 5%) wall    2073 kB ( 0%) ggc
 TOTAL                 :1239.40             5.05          1244.82            1038165 kB

So, easy whom to blame ;)
Comment 7 Richard Biener 2006-12-12 16:44:40 UTC
We're now ICEing in

internal compiler error: in ssa_operand_alloc, at tree-ssa-operands.c:365

for the second testcase.
Comment 8 Diego Novillo 2006-12-13 17:32:28 UTC
http://gcc.gnu.org/ml/gcc-patches/2006-12/msg00959.html fixes the ICE in the operand scanner.

The alias times should be back to saner values, but the memory consumption problem is still there.  Still looking into that.
Comment 9 Diego Novillo 2006-12-13 23:50:16 UTC
The memory problem is quite simple: We just have a *lot* of pointers and a *lot* of addressable symbols.  Here is a breakdown of what happens on the first call to compute_may_aliases:

During the first call to compute_may_aliases:

1- Size of cc1plus is 339Mb
2- Call to compute_points_to_sets grows to 355Mb (+4.72%)
3- Call to compute_flow_insensitive_aliasing grows to 364Mb (+2.54%)
4- Call to compute_flow_sensitive_aliasing grows to 667Mb (+83.2%)

The reason for this tremendous growth is quite simple.  There are 39,010 SSA name pointers and many of them have their own points-to set, which we store in that name's may-alias set.  We grow to 667Mb in the last loop of compute_flow_sensitive_aliasing.

One thing we could do is just not use flow-sensitive information in these cases.  If we don't set SSA_NAME_PTR_INFO, everything will default to flow-insensitive information and things will Just Work.

Perhaps using sparse bitmaps for the may-alias sets might help with memory consumption, but I found these bitmaps to slow down the operand scanner quite a bit in the past.  May be worth a try.

Danny, thoughts?
Comment 10 Diego Novillo 2006-12-13 23:54:45 UTC
(In reply to comment #9)
> The memory problem is quite simple: We just have a *lot* of pointers and a
> *lot* of addressable symbols.  Here is a breakdown of what happens on the first
> call to compute_may_aliases:
> 
> During the first call to compute_may_aliases:
> 
This is in the function ffparse, BTW.  Which has alias sets with 4.2 million elements.
Comment 11 Daniel Berlin 2006-12-14 01:11:09 UTC
Subject: Re:  Compiling FreeFem3d uses unreasonable amount of time and memory

On 13 Dec 2006 23:50:17 -0000, dnovillo at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #9 from dnovillo at gcc dot gnu dot org  2006-12-13 23:50 -------
>
> The memory problem is quite simple: We just have a *lot* of pointers and a
> *lot* of addressable symbols.  Here is a breakdown of what happens on the first
> call to compute_may_aliases:
>
> During the first call to compute_may_aliases:
>
> 1- Size of cc1plus is 339Mb
> 2- Call to compute_points_to_sets grows to 355Mb (+4.72%)
> 3- Call to compute_flow_insensitive_aliasing grows to 364Mb (+2.54%)
> 4- Call to compute_flow_sensitive_aliasing grows to 667Mb (+83.2%)
>
> The reason for this tremendous growth is quite simple.  There are 39,010 SSA
> name pointers and many of them have their own points-to set, which we store in
> that name's may-alias set.

If they had their own set, then compute_points_to_set would have
required more memory.

>We grow to 667Mb in the last loop of
> compute_flow_sensitive_aliasing.
>
> One thing we could do is just not use flow-sensitive information in these
> cases.  If we don't set SSA_NAME_PTR_INFO, everything will default to
> flow-insensitive information and things will Just Work.
>

> Perhaps using sparse bitmaps for the may-alias sets might help with memory
> consumption, but I found these bitmaps to slow down the operand scanner quite a
> bit in the past.  May be worth a try.
>
> Danny, thoughts?

If compute_points_to_sets only grows memory by 26 meg, then that's all
we need to represent the points-to sets of all these variables

So we shoudl be able to do this without using more memory than that.
Sadly, we create copies of  the result of find_what_p_points_to, then
transform it into an array.

I'll give the bitmaps a try for the operand scanner and see how it works.

I really don't think we should have to turn off information that is
only taking 26 meg to store originally :)
Comment 12 Diego Novillo 2006-12-14 02:50:04 UTC
(In reply to comment #11)

> I'll give the bitmaps a try for the operand scanner and see how it works.
> 
OK.  Hopefully that won't introduce a huge slowdown in the operand scanner.  Assigning back to you.
Comment 13 Jan Hubicka 2006-12-23 14:26:00 UTC
Note that we've got another noticeable jump in memory consumption today (well at least it would be very important jump if we used just 28MB of memory for aliasing :).  Is that also aliasing or shall be analyzed?

Honza
Comment 14 Jan Hubicka 2006-12-23 14:27:39 UTC
Well, actually the testcase now runs out of memory and ICE...
Comment 15 Daniel Berlin 2006-12-23 16:21:21 UTC
Subject: Re:  Compiling FreeFem3d uses unreasonable amount of time and memory

On 23 Dec 2006 14:26:00 -0000, hubicka at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #13 from hubicka at gcc dot gnu dot org  2006-12-23 14:26 -------
> Note that we've got another noticeable jump in memory consumption today (well
> at least it would be very important jump if we used just 28MB of memory for
> aliasing :).  Is that also aliasing or shall be analyzed?

It's possible it was my aliasing fix, but it's hard to say. I had only
seen cases where it increased mem usage ~3-5% (this was on large
testcases too).
It certainly shouldn't run out of memory.

In any case, i'm working on testing bitmaps for may-aliases right now.
Comment 16 Richard Biener 2007-01-09 21:17:04 UTC
Pling!
Comment 17 Daniel Berlin 2007-01-09 21:42:03 UTC
Subject: Re:  Compiling FreeFem3d uses unreasonable amount of time and memory

Try the attached, let me know how it goes.



On 9 Jan 2007 21:17:05 -0000, rguenth at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #16 from rguenth at gcc dot gnu dot org  2007-01-09 21:17 -------
> Pling!
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30089
>
>
Comment 18 Daniel Berlin 2007-01-09 21:42:03 UTC
Created attachment 12874 [details]
operandsbitmaps.diff
Comment 19 Richard Biener 2007-01-10 18:39:02 UTC
We'll see with tonights run of the tester.  Thanks.
Comment 20 Richard Biener 2007-01-11 18:12:03 UTC
Again tonight - Mark broke bootstrap.
Comment 21 Richard Biener 2007-01-13 23:02:11 UTC
The patch fixed the freefem memory regression.
Comment 22 Daniel Berlin 2007-01-14 01:22:42 UTC
Subject: Re:  Compiling FreeFem3d uses unreasonable amount of time and memory

okay, i'll update changelog, submit and commit.

On 13 Jan 2007 23:02:13 -0000, rguenth at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #21 from rguenth at gcc dot gnu dot org  2007-01-13 23:02 -------
> The patch fixed the freefem memory regression.
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30089
>
>
Comment 23 Richard Biener 2007-01-14 15:06:37 UTC
Can it be the patch changes result of alias analysis?  I get (big) runtime differences (but maybe due to unrelated changes) with the tester from Jan 13 (patched) vs. Jan 14 (unpatched).
Comment 24 Daniel Berlin 2007-03-30 16:58:59 UTC
This is fixed now, rihgt?
I forgot to add the PR number to the changelog.
Comment 25 Richard Biener 2007-03-31 14:42:39 UTC
Yes.