This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: [patch] lcm,gcse changes for nearly full flow graphs



  In message <Pine.GSO.4.21.0006240130200.20838-100000@platon>you write:
  > Hi,
  > 
  > some weeks ago I (and Brad) profiled the compiler a little bit on some of
  > Brads testcases, and one of the remaining (after elimination of the 
  > problem with dominators) biggest time sinks was gcse and its various data
  > collecting passes. The patch below addresses some of them to reduce the
  > time to a more normal value.
  > 
  > First it changes gcse to insert newly found register setters in front of
  > the already found. Thats no problem as later the whole list is anyway
  > handled as set, and makes that insertion O(n) instead O(n^2).
  > 
  > Second it changes the worklists of nodes from a stack to a queue, and
  > thereby reduces the number of calls to sbitmap_intersection_of_preds() and
  > friends.
  > 
  > Now there is one problem left (pre_expr_reaches_here_p_work) which calls
  > itself 89169411 times while called from outside only 48788 times (for
  > Brads testcase all.i). The fix here would be a little bit more involved.
  > 
  > All in all these changes make a compile on all.i went down from 878 to 396
  > seconds (on Brads machine, with an unprofiled cc1 I believe). The compiler
  > bootstraps with them on alpha and i386 (not slower as without the
  > changes).
  > 
  > Btw: I sent in my assignment some weeks ago. Is it still not there?
  > 
  > Ciao,
  > Michael.
  > --
  > 
  > 	* gcse.c (record_one_set): Prepend instead of append onto
  > 	reg_set_table, making it O(n) instead O(n^2).
  > 	* lcm.c (compute_antinout_edge,compute_laterin,compute_available):
  > 	Use a queue instead of a stack as worklist.
Your assignment has finally been recorded by the FSF.

FWIW, the changes have just a slight positive affect on compile times for
gcc itself (less than one percent, possibly just noise).

Since they don't seem to adversely affect GCC's performance and they
help Brad's code, I went ahead and installed them after fixing some
minor formatting problems (remember, whitespace around operators like
"=", "+", "-", etc).

For the reachability issue, there's ways to handle that which may
(or may not) help.  For example, for each block we could compute the
bitmap of reachable blocks -- that keeps us from doing the path
computation work over and over again in expr_reaches_here_p at the
expense of using more memory.  Another might be to eliminate the tail
recursion using a goto.


Thanks,

jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]