[patch] tree-optimization/21430 operand cache slowness

Thu Sep 22 20:29:00 GMT 2005

Here's my proposed patch for this bug.  The context and problem
explanation can be found at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21430

There were two solutions I was looking at: (1) adding a flag to passes
which directly manipulate trees, thereby bypassing the operand cache, or
(2) adding a pointer in the use structure to its list owner to quickly
check if the tree has been manipulated.
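
To make option (2) concrete, here is a minimal sketch of the general idea.
It is not the code in the attached patch. All type and function names
(use_node, use_list, owner, in_list_by_walk, in_list_by_owner) are
hypothetical, and the sketch assumes that "list owner" means the list a use
is currently linked into. Without the extra pointer, checking whether a use
sits in the expected list means walking the list; with it, the check is a
single pointer comparison.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not GCC's real types.  */
struct use_list;

struct use_node
{
  struct use_node *prev;
  struct use_node *next;
  struct use_list *owner;	/* The extra pointer: list this use is linked in.  */
};

struct use_list
{
  struct use_node head;		/* Sentinel of a circular doubly linked list.  */
};

static void
use_list_init (struct use_list *l)
{
  l->head.prev = l->head.next = &l->head;
  l->head.owner = l;
}

/* Link U at the front of L and record L as its owner.  */
static void
use_link (struct use_list *l, struct use_node *u)
{
  u->next = l->head.next;
  u->prev = &l->head;
  l->head.next->prev = u;
  l->head.next = u;
  u->owner = l;
}

/* Remove U from whatever list it is in and clear its owner.  */
static void
use_unlink (struct use_node *u)
{
  u->prev->next = u->next;
  u->next->prev = u->prev;
  u->prev = u->next = NULL;
  u->owner = NULL;
}

/* Slow check: walk the ring from U looking for L's sentinel.
   Cost is proportional to the length of the list U is in.  */
static bool
in_list_by_walk (const struct use_node *u, const struct use_list *l)
{
  const struct use_node *p;

  if (u->next == NULL)
    return false;
  for (p = u->next; p != u; p = p->next)
    if (p == &l->head)
      return true;
  return false;
}

/* Fast check: one pointer comparison, independent of list length.  */
static bool
in_list_by_owner (const struct use_node *u, const struct use_list *l)
{
  return u->owner == l;
}
```

The trade-off in this sketch is one extra pointer per use node in exchange
for a constant-time check, which matters most when a single list holds a
very large number of uses.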

(1) looked promising, but ends up being a bit of a can of worms. The
propagate engine can manipulate the trees directly, and so can calls to
fold_stmt(). If you also take into account passes which are affected by
those two factors, you end up with most passes being flagged.  On top of
that, there is the maintenance burden of identifying the passes which
manipulate the underlying trees and keeping their flags up to date. This
also represents potential lurking bugs.

That leaves (2) for right now, at the cost of an additional pointer in
the immediate use structure.  Here are the run samples I did:

For the testcase in this bug report:

                        Current         Patched
Fast x86-64
operand scan time       27.16 sec       0.07 sec
total time              29.22 sec       2.09 sec

Older x86-64
operand scan time       41.60 sec       0.20 sec
total time              47.63 sec       5.77 sec

3.0 GHz P4
operand scan time       19.35 sec       0.14 sec
total time              44.22 sec       25.19 sec
       (for what it's worth, we spend 22 seconds in combine)

Compiling all the cc1-i files:

                        Current         Patched
Fast x86-64
operand scan time       9.26 sec        6.69 sec
tree optimizer time     80.9 sec        79.02 sec
total time              287.85 sec      285.97 sec

Older x86-64
operand scan time       24.33 sec       18.66 sec
tree optimizer time     195.77 sec      193.33 sec
total time              729.6 sec       735.81 sec

3.0 GHz P4
operand scan time       6.78 sec        5.66 sec
tree optimizer time     60.57 sec       60.59 sec
total time              207.49 sec      209.68 sec

It certainly fixes up the pathological cases, and doesn't appear to have
much of a negative effect. There is an increase in memory usage, but for
the most part that seems to be offset by the increase in speed of the
operand cache.  The tree optimizers tend to run as fast, or faster,
although the total time is sometimes a hair slower. As you can see, the
patch itself is pretty simple.

So anyone who has issues with compilation speed and any of the
finalize_* routines, give this patch a try. I am currently running
testsuites and bootstraps on the patch. Let me know whether this patch
appears to solve everyone's problem, or whether the increase in memory
usage by the cache causes another problem somewhere.

Andrew

PS.  I haven't completely given up on solution (1); I am still looking
into alternative ways of pulling this off. This patch can be backed out
if I manage to come up with something else that is workable.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ptr.diff
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20050922/471f0a36/attachment.ksh>