This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch] lcm,gcse changes for nearly full flow graphs
- To: Michael Matz <matzmich at cs dot tu-berlin dot de>
- Subject: Re: [patch] lcm,gcse changes for nearly full flow graphs
- From: Jeffrey A Law <law at cygnus dot com>
- Date: Thu, 20 Jul 2000 18:05:37 -0600
- cc: gcc-patches at gcc dot gnu dot org
- Reply-To: law at cygnus dot com
In message <Pine.GSO.4.21.0006240130200.20838-100000@platon>you write:
> Hi,
>
> some weeks ago I (and Brad) profiled the compiler a little bit on some of
> Brads testcases, and one of the remaining (after elimination of the
> problem with dominators) biggest time sinks was gcse and its various data
> collecting passes. The patch below addresses some of them to reduce the
> time to a more normal value.
>
> First it changes gcse to insert newly found register setters in front of
> the already found. Thats no problem as later the whole list is anyway
> handled as set, and makes that insertion O(n) instead O(n^2).
>
> Second it changes the worklists of nodes from a stack to a queue, and
> thereby reduces the number of calls to sbitmap_intersection_of_preds() and
> friends.
>
> Now there is one problem left (pre_expr_reaches_here_p_work) which calls
> itself 89169411 times while called from outside only 48788 times (for
> Brads testcase all.i). The fix here would be a little bit more involved.
>
> All in all these changes make a compile on all.i went down from 878 to 396
> seconds (on Brads machine, with an unprofiled cc1 I believe). The compiler
> bootstraps with them on alpha and i386 (not slower as without the
> changes).
>
> Btw: I sent in my assignment some weeks ago. Is it still not there?
>
> Ciao,
> Michael.
> --
>
> * gcse.c (record_one_set): Prepend instead of append onto
> reg_set_table, making it O(n) instead O(n^2).
> * lcm.c (compute_antinout_edge,compute_laterin,compute_available):
> Use a queue instead of a stack as worklist.
Your assignment has finally been recorded by the FSF.
FWIW, the changes have just a slight positive affect on compile times for
gcc itself (less than one percent, possibly just noise).
Since they don't seem to adversely affect GCC's performance and they
help Brad's code, I went ahead and installed them after fixing some
minor formatting problems (remember, whitespace around operators like
"=", "+", "-", etc).
For the reachability issue, there's ways to handle that which may
(or may not) help. For example, for each block we could compute the
bitmap of reachable blocks -- that keeps us from doing the path
computation work over and over again in expr_reaches_here_p at the
expense of using more memory. Another might be to eliminate the tail
recursion using a goto.
Thanks,
jeff