Re: [PATCH] fast Copysign for PPC, -flag_unsafe_math_optimizations only

On Jan 30, 2005, at 8:32 PM, Richard Henderson wrote:

> On Sun, Jan 30, 2005 at 07:12:40PM -0500, Andrew Pinski wrote:
>> With this patch copysign is expanded to:
>>         fabs f0,f1
>>         fnabs f1,f1
>>         fsel f1,f2,f0,f1
>>
>> Which is faster and smaller than storing and doing
>> bitfield operations on the sign bit.
>
> I would be surprised if this ever gets used, but the proper test is HONOR_SIGNED_ZEROS, not flag_unsafe_math_optimizations.

I hear there are some uses of copysign in SPEC CPU 2004/5 which show up in the profile.

Since this does not support NaNs either, I added a check for them too.
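
To make the two restrictions concrete, here is a rough C model of the fsel
sequence. It is only a sketch: the helper names fsel() and fsel_copysign()
are made up for illustration and are not part of the patch. It assumes fsel
picks its first source when the selector compares >= 0.0 and its second
otherwise, with a NaN selector falling through to the second.

```c
#include <math.h>

/* Hypothetical model of PowerPC fsel: returns c when a >= 0.0,
   otherwise b.  A NaN in a compares false, so b is selected. */
static double fsel(double a, double c, double b)
{
    return (a >= 0.0) ? c : b;
}

/* Hypothetical model of the three-instruction expansion:
     fabs  f0,f1         f0 = |x|
     fnabs f1,f1         f1 = -|x|
     fsel  f1,f2,f0,f1   f1 = (y >= 0.0) ? f0 : f1  */
static double fsel_copysign(double x, double y)
{
    double pos = fabs(x);
    double neg = -fabs(x);
    return fsel(y, pos, neg);
}
```

For ordinary nonzero y this agrees with copysign(). It differs when y is
-0.0 (the compare treats it as >= 0.0, so the result is +|x|) and when y is
a NaN with a clear sign bit (the compare fails, so the result is -|x|),
which is why the expander should be guarded on both conditions.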
The code currently expanded with -ffast-math is:
        stfs f2,-16(r1)
        fabs f1,f1
        lwz r0,-16(r1)
        cmpwi cr7,r0,0
        bgelr- cr7
        fneg f1,f1

which is not as good as the above, since we have to move from a floating-point
register to an integer register through memory, which can be painful on the G5
because of an LSU (load/store unit) reject. (But that is better than before.)
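
For comparison, the store/load sequence above corresponds roughly to the
following C. Again this is just a sketch: memory_copysignf() is a made-up
name, and it assumes float is a 32-bit IEEE single whose sign bit lines up
with the sign bit of int32_t.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the current expansion, which reads the sign
   of y through memory as a 32-bit integer. */
static float memory_copysignf(float x, float y)
{
    int32_t bits;
    float result;

    memcpy(&bits, &y, sizeof bits);  /* stfs f2,-16(r1) ; lwz r0,-16(r1) */
    result = fabsf(x);               /* fabs f1,f1 */
    if (bits >= 0)                   /* cmpwi cr7,r0,0 ; bgelr- cr7 */
        return result;
    return -result;                  /* fneg f1,f1 */
}
```

Because it tests the raw sign bit, this version gets -0.0 right, but it pays
for the round trip through the stack slot and for the conditional branch.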

OK? Bootstrapped and tested on powerpc-darwin with no regressions.

Andrew Pinski

	* config/rs6000/ (copysignsf3): New expand.
	(copysigndf3): Likewise.

Attachment: fastcopysign.diff.txt
Description: Text document

