This is the mail archive of the
mailing list for the GCC project.
gcse, store motion and loop optimizer
- To: gcc at gcc dot gnu dot org, dan at cgsoftware dot com, timothy dot c dot prince at intel dot com
- Subject: gcse, store motion and loop optimizer
- From: Jan Hubicka <jh at suse dot cz>
- Date: Fri, 3 Aug 2001 09:58:27 +0200
I am tracking down the failure in the twolf SPEC2000 benchmark. What
happens there is BASIC-style programming, where all variables are global,
and of course the loop induction variable is too.
Originally we handled this in the loop optimizer by hoisting (we should
probably kill that code, but at least in twolf it still hoists; I will
re-check once Dan's store motion work is in) and optimized the loop
in de facto the same way as if it were static.
Now we do have a problem. The induction variable is killed by gcse in two
steps: first it is PREed, then it is store-motioned to store the final
value, resulting in code that uses one register for iteration and
always stores the value to another register for storing the result.
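
To make the shape concrete, here is a rough hand-written C approximation
of what is described above. It is only a sketch, not actual GCC output:
the globals (i, total, cost), the function names and the r_* variables
standing in for pseudo registers are all made up for illustration.

```c
#include <assert.h>

/* Hypothetical globals standing in for twolf-style code. */
int i;                                   /* global induction variable */
int total;
int cost[8] = {3, 1, 4, 1, 5, 9, 2, 6};

/* Source form: the loop counter is the global i. */
void loop_source(int n)
{
    total = 0;
    for (i = 0; i < n; i++)
        total += cost[i];
}

/* Assumed shape after PRE plus store motion: r_iter drives the
   iteration, while r_store is a second pseudo that only carries the
   value for the store to i, which has been moved out of the loop. */
void loop_after_gcse(int n)
{
    int r_iter = 0;
    int r_store = 0;
    int r_total = 0;

    while (r_iter < n) {
        r_total += cost[r_iter];
        r_store = r_iter + 1;    /* x = induction + 1 */
        r_iter = r_store;        /* induction = x */
    }
    i = r_store;                 /* the single remaining store to i */
    total = r_total;
}
```

Both functions leave the same values in i and total; the second one just
spreads the induction variable across two registers, which is the form
the later passes then have to cope with.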
CSE is not able to unify these, as its local point of view suggests that
it is better to use the first value for iterating (it doesn't know it is
the induction variable) and then the second variable for storing (as its
lifetime is longer and CSE decides it is superior).
This is unfortunate, as strength reduction then goes crazy, creating a
redundant induction variable with value i+1 and doing other crazy things,
confusing itself in the next loop iteration (by inserting instructions
in the BIV computation where it doesn't expect them), failing to unroll
and producing ugly code. This is basically because it sees the computation
of x=induction+1 separately and doesn't know that it is just the
computation of the induction variable and that at the end there is
induction=x (after several uses of x and making several copies of it).
Strength reduction has logic to get around some damage, but IMO it should
be able to expect that:
1) there are no dead computations of general induction variables
2) there are no unneeded copies
as assuring those would require doing an analysis similar to what gcse does.
So my question is, would it be possible to fix the gcse pass to clean up
the damage? It can be done either by making store motion careful to reuse
the value in a register if it reaches, or by running copy propagation after
it together with killing dead copy instructions (as the subsequent CSE pass
currently undoes copy propagation partially). This would require CPROP
to do the transformations even locally at the beginnings and ends of basic
blocks. I've experimented with using something like a "REG_DEPRECATED"
note to control CSE decisions, but I think it would be cleaner to keep
gcse from being strictly global, as GCSE itself isn't, just CPROP is.
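
For illustration, the loop shape that either fix would be aiming for
might look like the following hand-written C. Again this is only a
sketch under assumptions: the globals, the function names and the
register-like locals are hypothetical, and this is not what any GCC
pass actually emits.

```c
#include <assert.h>

/* Hypothetical globals, as in twolf-style code. */
int i;                                   /* global induction variable */
int total;
int cost[8] = {3, 1, 4, 1, 5, 9, 2, 6};

/* Reference: the loop as written in the source. */
void loop_source(int n)
{
    total = 0;
    for (i = 0; i < n; i++)
        total += cost[i];
}

/* Assumed shape once the extra copy is propagated and the dead copy
   killed: one pseudo r both iterates and supplies the final store,
   so the increment is a plain BIV update with nothing in between. */
void loop_cleaned(int n)
{
    int r = 0;
    int r_total = 0;

    while (r < n) {
        r_total += cost[r];
        r = r + 1;               /* single induction variable update */
    }
    i = r;                       /* store of the final value, after the loop */
    total = r_total;
}
```

With only one register involved, strength reduction would see an
ordinary basic induction variable and no stray copies.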
OK now I will continue to figure out what exactly goes wrong in the second
Tim: Could this be the case in your programs?