This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] fast Copysign for PPC, -flag_unsafe_math_optimizations only
Andrew Pinski <pinskia@physics.uc.edu> writes:
> On Jan 30, 2005, at 8:32 PM, Richard Henderson wrote:
>
> > On Sun, Jan 30, 2005 at 07:12:40PM -0500, Andrew Pinski wrote:
> >> With this patch copysign is expanded to:
> >> fabs f0,f1
> >> fnabs f1,f1
> >> fsel f1,f2,f0,f1
> >>
> >> Which is faster and smaller than storing and doing
> >> bitfield operations on the sign bit.
> >
> > I would be surprised if this ever gets used, but the proper
> > test is HONOR_SIGNED_ZEROS, not flag_unsafe_math_optimizations.
>
> I hear there are some uses of copysign in SPEC CPU 2004/5 which
> show up in the profile.
>
> Since this sequence does not support NaNs either, I added a check for that too.
> The code which is expanded currently with -ffast-math is:
> stfs f2,-16(r1)
> fabs f1,f1
> lwz r0,-16(r1)
> cmpwi cr7,r0,0
> bgelr- cr7
> fneg f1,f1
>
> which is not as good as the above, since we have to move from a
> floating-point register to an integer register, which can be painful
> on the G5 because of an LSU (load/store unit) reject. (But that is
> better than before.)
>
> OK? Bootstrapped and tested on powerpc-darwin with no regressions.
>
> Thanks,
> Andrew Pinski
>
> ChangeLog:
> * config/rs6000/rs6000.md (copysignsf3): New expand.
> (copysigndf3): Likewise.
This is OK, for 4.0 or 4.1 depending on the current state of 4.0.
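
For readers following the HONOR_SIGNED_ZEROS and NaN points above, here is
a rough C model of the fabs/fnabs/fsel sequence. It is only a sketch, not
the code in the patch: the names fsel_model, fast_copysign and
copysign_demo are made up for illustration, and fsel is modelled as
"select the first candidate when the tested operand is >= 0.0, otherwise
the second", which is also what it does for a NaN operand.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical model of PowerPC fsel: returns if_ge when test >= 0.0,
   otherwise if_lt.  A NaN test operand compares false, so it also
   selects if_lt. */
static double fsel_model(double test, double if_ge, double if_lt)
{
    return (test >= 0.0) ? if_ge : if_lt;
}

/* Hypothetical model of the three-instruction expansion:
     fabs  f0,f1      -> pos = |x|
     fnabs f1,f1      -> neg = -|x|
     fsel  f1,f2,f0,f1 -> result = (y >= 0.0) ? pos : neg  */
static double fast_copysign(double x, double y)
{
    double pos = fabs(x);
    double neg = -fabs(x);
    return fsel_model(y, pos, neg);
}

/* Illustrative checks only. */
void copysign_demo(void)
{
    /* Ordinary nonzero signs behave like copysign. */
    assert(fast_copysign(3.0, -2.0) == -3.0);
    assert(fast_copysign(-3.0, 2.0) == 3.0);

    /* -0.0 >= 0.0 is true, so the sign of a negative zero is lost;
       ISO C copysign(3.0, -0.0) would give -3.0 here. */
    assert(!signbit(fast_copysign(3.0, -0.0)));

    /* A NaN sign source always takes the "less than" arm, whatever
       its sign bit is. */
    assert(fast_copysign(3.0, NAN) == -3.0);
}
```

The last two checks are the cases where the model differs from a
bit-exact copysign, which is why the expansion is only usable when
signed zeros and NaNs do not have to be honored.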