[PATCH] fast Copysign for PPC, -flag_unsafe_math_optimizations only
Andrew Pinski
pinskia@physics.uc.edu
Mon Jan 31 20:58:00 GMT 2005
On Jan 30, 2005, at 8:32 PM, Richard Henderson wrote:
> On Sun, Jan 30, 2005 at 07:12:40PM -0500, Andrew Pinski wrote:
>> With this patch copysign is expanded to:
>> fabs f0,f1
>> fnabs f1,f1
>> fsel f1,f2,f0,f1
>>
>> Which is faster and smaller than storing and doing
>> bitfield operations on the sign bit.
>
> I would be surprised if this ever gets used, but the proper
> test is HONOR_SIGNED_ZEROS, not flag_unsafe_math_optimizations.
I hear there are some uses of copysign in SPEC CPU 2004/5 which
show up in the profile.
Since this sequence does not handle NaNs correctly either, I added a check for that too.
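To make the signed-zero and NaN caveats concrete, here is a rough C model of the fabs/fnabs/fsel sequence. It is only a sketch and not part of the patch; the names fsel_model and fast_copysign are made up for illustration, and the model assumes fsel picks its second source operand when the test operand is >= 0.0 and its third otherwise (including when the test operand is a NaN).

```c
#include <math.h>

/* Hypothetical model of PowerPC fsel: yields if_ge when test >= 0.0,
   otherwise if_lt.  A NaN test value fails the comparison, so it also
   yields if_lt.  Note that -0.0 >= 0.0 is true. */
static double fsel_model(double test, double if_ge, double if_lt)
{
    return test >= 0.0 ? if_ge : if_lt;
}

/* Hypothetical model of the three-instruction expansion:
     fabs  f0,f1        f0 = |x|
     fnabs f1,f1        f1 = -|x|
     fsel  f1,f2,f0,f1  f1 = (y >= 0.0) ? f0 : f1  */
static double fast_copysign(double x, double y)
{
    double pos = fabs(x);    /* fabs  */
    double neg = -fabs(x);   /* fnabs */
    return fsel_model(y, pos, neg);
}
```

In this model fast_copysign(3.0, -0.0) yields 3.0 where copysign would yield -3.0, and a NaN second argument always selects the negative result regardless of its sign bit, which is why the expansion has to be restricted to cases where signed zeros and NaNs need not be honored.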
The code currently expanded with -ffast-math is:
stfs f2,-16(r1)
fabs f1,f1
lwz r0,-16(r1)
cmpwi cr7,r0,0
bgelr- cr7
fneg f1,f1
which is not as good as the sequence above, since we have to move the
value from a floating-point register to an integer register, which can
be painful on the G5 because of an LSU (load/store unit) reject. (But
that is still better than before.)
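For comparison, the existing expansion can be modelled in C roughly as follows. Again this is only a sketch under assumptions: the name copysignf_via_int is hypothetical, the single-precision case is shown, and float is assumed to be a 32-bit IEEE value whose sign bit is the top bit of the word.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the store/load based expansion.  The memcpy
   stands in for the round trip through memory (stfs followed by lwz),
   which is the step that moves data from the FP side to the integer
   side. */
static float copysignf_via_int(float x, float y)
{
    int32_t bits;
    float r = fabsf(x);               /* fabs  f1,f1               */

    memcpy(&bits, &y, sizeof bits);   /* stfs f2,-16(r1) ; lwz r0  */
    if (bits >= 0)                    /* cmpwi cr7,r0,0 ; bgelr-   */
        return r;
    return -r;                        /* fneg  f1,f1               */
}
```

Because this version tests the sign bit as an integer, it gives the expected answer for a -0.0 second argument; what it costs is the trip through memory between the two register files.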
OK? Bootstrapped and tested on powerpc-darwin with no regressions.
Thanks,
Andrew Pinski
ChangeLog:
* config/rs6000/rs6000.md (copysignsf3): New expand.
(copysigndf3): Likewise.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fastcopysign.diff.txt
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20050131/9c985564/attachment.txt>