[patch] tree-optimization/21430 operand cache slowness
Andrew MacLeod
amacleod@redhat.com
Thu Sep 22 20:29:00 GMT 2005
Here's my proposed patch for this bug. The context and problem
explanation can be found at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21430
There were two solutions I was looking at: (1) adding a flag to passes
which directly manipulate trees, thereby bypassing the operand cache, or
(2) adding a pointer in the use structure to its list owner, to quickly
check whether the tree has been manipulated.
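
To illustrate the idea behind (2), here is a minimal sketch. It is not
the attached patch: the types and function names (struct name,
struct use, link_use, unlink_use, use_is_stale, rescan_use) are
hypothetical stand-ins, not GCC's real data structures. It assumes each
use node sits on a doubly linked list owned by the name it refers to,
and that a pass may overwrite the operand slot directly without
relinking the node. With an owner pointer stored in the node, the
staleness check is a single comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an SSA name that owns a list of its uses. */
struct name {
    struct use *uses;        /* head of the immediate-use list */
};

/* Hypothetical use node.  'owner' is the extra pointer from option (2). */
struct use {
    struct use *prev, *next;
    struct name **slot;      /* operand slot inside the statement */
    struct name *owner;      /* name whose list this node is linked on */
};

/* Put U at the front of N's use list and record N as its owner. */
static void link_use(struct use *u, struct name *n)
{
    u->owner = n;
    u->prev = NULL;
    u->next = n->uses;
    if (n->uses)
        n->uses->prev = u;
    n->uses = u;
}

/* Remove U from its owner's list. */
static void unlink_use(struct use *u)
{
    assert(u->owner != NULL);
    if (u->prev)
        u->prev->next = u->next;
    else
        u->owner->uses = u->next;
    if (u->next)
        u->next->prev = u->prev;
    u->prev = u->next = NULL;
    u->owner = NULL;
}

/* O(1) check: was the operand slot changed behind the cache's back? */
static bool use_is_stale(const struct use *u)
{
    return *u->slot != u->owner;
}

/* On rescan, move a stale node onto the list of the name now in the slot. */
static void rescan_use(struct use *u)
{
    if (use_is_stale(u)) {
        struct name *now = *u->slot;
        unlink_use(u);
        link_use(u, now);
    }
}
```

The cost is one extra pointer per use node, which is the memory
trade-off discussed below.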
(1) looked promising, but ends up being a bit of a can of worms. The
propagate engine can manipulate the trees directly, and so can calls to
fold_stmt(). If you also take into account passes which are affected by
those two factors, you end up with most passes being flagged. On top of
that, there is the maintenance burden of identifying the passes which
manipulate the underlying trees and keeping the flag on them up to
date. This is also a source of potential lurking bugs.
That leaves (2) for right now, at the cost of an additional pointer in
the immediate use structure. Here are the run samples I did:
For the testcase in this bug report:

                          Current        Patched
Fast x86-64
  operand scan time       27.16 sec      0.07 sec
  total time              29.22 sec      2.09 sec
Older x86-64
  operand scan time       41.60 sec      0.20 sec
  total time              47.63 sec      5.77 sec
3.0 GHz P4
  operand scan time       19.35 sec      0.14 sec
  total time              44.22 sec      25.19 sec
(For what it's worth, we spend 22 seconds in combine on the P4.)

Compiling all the cc1-i files:

                          Current        Patched
Fast x86-64
  operand scan time       9.26 secs      6.69 secs
  tree optimizer time     80.9 secs      79.02 secs
  total time              287.85 secs    285.97 secs
Older x86-64
  operand scan time       24.33 secs     18.66 secs
  tree optimizer time     195.77 secs    193.33 secs
  total time              729.6 secs     735.81 secs
3.0 GHz P4
  operand scan time       6.78 secs      5.66 secs
  tree optimizer time     60.57 secs     60.59 secs
  total time              207.49 secs    209.68 secs

It certainly fixes up the pathological cases, and doesn't appear to have
a lot of negative effect. There is an increase in memory usage, but for
the most part that seems to be offset by the increase in speed of the
operand cache. The tree optimizers tend to run as fast, or faster,
although the total time is sometimes a hair slower. As you can see, the
patch itself is pretty simple.
So anyone who has issues with compilation speed and any of the
finalize_* routines, give this patch a try. I am currently running
testsuites and bootstraps on the patch. Let me know whether this patch
appears to solve everyone's problem, or whether the increase in memory
usage by the cache causes another problem somewhere.
Andrew
PS. I haven't completely given up on solution (1); I am still looking
into alternative ways of pulling this off. This patch can be backed out
if I manage to come up with something else that is workable.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ptr.diff
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20050922/471f0a36/attachment.ksh>