On 06/23/2014 02:29 AM, Ramana Radhakrishnan wrote:
On 20/06/14 21:28, Richard Henderson wrote:
There aren't too many users of the cmpelim pass, and previously they were all
small embedded targets without an FPU.
I'm a bit surprised that Ramana decided to enable this pass for aarch64, as
that target is not so limited as the block comment for the pass describes.
Honestly, whatever is being deleted here ought to have been found earlier,
either via combine or cse. We ought to find out why any changes are made
during this pass for aarch64.
Agreed. Going back over my notes, I remember that what prompted me to turn this
on was a difference in code generation in a number of benchmarks, with a number
of compares eliminated. I don't remember double-checking at the time why CSE
hadn't removed them. This probably also explains why the equivalent patch for
ARM and Thumb2 hasn't shown demonstrable differences.
Investigating this pass for Thumb1 may be interesting.
Isn't it true that Thumb1 has only "adds r,r,#i", not "add r,r,#i"?
Lack of an addition that doesn't clobber the flags is one of the two
reasons why you'd want to enable the cmpelim pass.
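
To illustrate the point about flag-clobbering additions, here is a toy sketch in Python. It is not GCC's cmpelim implementation. It models a single straight-line block of instructions as tuples and drops a "cmp r, #0" when the flags already describe r because the previous write to r was a flag-setting "adds" or "subs". The instruction encoding, the FLAG_SETTING and BRANCHES sets, and the function name are all assumptions made up for this example. The sketch also ignores the carry/overflow differences between "adds" and "cmp", so it is only meaningful for eq/ne style tests.

```python
# Toy model only: NOT how GCC's cmpelim pass is written.
# An instruction is a tuple: (opcode, dest, operand, ...) or (branch, label).

FLAG_SETTING = {"adds", "subs"}   # assumed: set flags from their result
BRANCHES = {"beq", "bne"}         # assumed: read flags, write no register


def eliminate_redundant_compares(insns):
    """Remove 'cmp r, 0' when the flags already reflect r compared to zero.

    Handles one basic block only (no labels, no control-flow joins).
    """
    out = []
    flags_reflect = None  # register whose value the flags currently describe
    for insn in insns:
        op = insn[0]
        if op in BRANCHES:
            # Branches consume the flags but change nothing we track.
            out.append(insn)
            continue
        dest = insn[1]
        if op == "cmp":
            if insn[2] == 0 and flags_reflect == dest:
                continue  # redundant: a prior adds/subs already set the flags
            flags_reflect = dest if insn[2] == 0 else None
            out.append(insn)
        elif op in FLAG_SETTING:
            flags_reflect = dest
            out.append(insn)
        else:
            # Non-flag-setting write: the flags are untouched, but they no
            # longer describe 'dest' if it was the tracked register.
            if dest == flags_reflect:
                flags_reflect = None
            out.append(insn)
    return out


block = [("adds", "r0", "r0", 1), ("cmp", "r0", 0), ("beq", ".L1")]
print(eliminate_redundant_compares(block))
```

On a target where the only available addition is the flag-setting form, the compare in a block like this is removable by construction, which is the situation described above. With a non-flag-setting "add" in place of "adds", the sketch keeps the compare.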