This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [tree-ssa] Speeding things up
- From: Daniel Berlin <dberlin at dberlin dot org>
- To: law at redhat dot com
- Cc: Diego Novillo <dnovillo at redhat dot com>, "" <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 29 Jan 2003 19:43:47 -0500 (EST)
- Subject: Re: [tree-ssa] Speeding things up
- References: <200301300014.h0U0ErYv022267@localhost.redhat.com>
>
> Right now I'm working on addressing some severe performance
> issues with the tree-ssa branch. The issue in my gunsight right
> now is the building of our may alias information. For the testcase
> I've got 20001226-1.c computing may alias takes about 30 minutes.
>
> Fixing some algorithmic issues gets that down to around 13 minutes.
>
> However, even after fixing the key algorithmic issues, we're still doing
> 2 billion silly calls that are totally unnecessary and account
> for the top two and 5 of the top 10 cpu hogs. You might think inlining
> them is the answer. Reality couldn't be further from the truth.
Here, you've got this many calls because you haven't actually fixed the
algorithmic issues. You just think you have, apparently.
Compute may aliases is *always* O(N^2) in the current form (and this is
the real cause of all the calls below) , because it's algorithmically bad.
It currently does:
For every referenced variable a
For every referenced variable b
see if a aliases b
where "see if a aliases b" is what calls the functions below.
For 27,000 variables, thats 27000^2 = ~780 million calls.
For the subset of about 24000 variables we couldn't directly prove, we
call get_alias_set 24000^2 = 580 million times.
etc.
Nothing you do will make it any less O(N^2) as long as this is the main
loop.
It needs to walk points to sets PTA computes for each variable, not every
referenced variable for each variable.
it should be
For every referenced variable
For each variable in the points-to set
See if we can prove they don't alias through TBAA or
another method.
otherwise, they alias
Since PTA computes a set of, on average, 2.6 variables a given variable
could point to, that would reduce the number of calls to about 80 or 90
thousand in each case.
Doing it this way will *never* be worse than the existing way, since
the largest set PTA can come up with is all variables.
However, it will be O(2.6 * N) in the average case, rather than O(N^2).
Also, this has nothing to do with inlining, since you *haven't* fixed
the algorithmic problem.
Once you've done that, if the overhead of calling is still a lot, then
inlining makes sense.
By the by, if you had mentioned this problem on the public mailing list,
you probably would have gotten a lot more help a lot quicker.
I've mentioned this problem to Diego before, as well.
>
> get_base_symbol 780 million calls
> Recursive function with a switch statement.
>
> get_alias_set 520 million calls
> Big and ugly. Has recursive calls and callbacks
> via function pointers into the front-ends
>
> find_base_decl 520 million calls
> Recursive function with a switch statement
>
> c_common_get_alias_set 520 million calls
> Called indirectly via get_alias_set
>
> alias_sets_conflict_p 260 million calls
> Not terribly bad, but more code than IMHO we should force inlined.
>
>
> In all we're talking about over 2.6 billion function calls (and this is
> *after* the algorithmic improvements). For reference we really
> need about 100k calls total.
>