[PATCH, i386]: Expand round(a) = sgn(a) * floor(fabs(a) + 0.5) using SSE4 ROUND insn
Michael Matz
matz@suse.de
Mon Aug 15 17:12:00 GMT 2011
Hi,
On Mon, 15 Aug 2011, Michael Matz wrote:
> > > .LFB0:
> > >     .cfi_startproc
> > >     movsd  .LC0(%rip), %xmm2
> > >     movapd  %xmm0, %xmm1
> > >     andpd  %xmm2, %xmm1
> > >     andnpd  %xmm0, %xmm2
> > >     addsd  .LC1(%rip), %xmm1
> > >     roundsd $1, %xmm1, %xmm1
> > >     orpd   %xmm2, %xmm1
> > >     movapd  %xmm1, %xmm0
> > >     ret
> >
> > Hm, why do we need the sign-copy? If I read the docs correctly
> > we can simply use roundsd directly, no?
>
> round-half-away-from-zero breaks your neck. round[ps][sd] only supports
> the usual four IEEE rounding modes.
But you should be able to apply the sign to the 0.5, which avoids
having to build the absolute value of the input:
round(x) = trunc(x + (copysign (0.5, x)))
which should roughly be expanded to:
    movsd   signbits(%rip), %xmm1
    andpd   %xmm0, %xmm1
    movsd   nextof0.5(%rip), %xmm2
    orpd    %xmm1, %xmm2
    addpd   %xmm2, %xmm0
    roundsd $3, %xmm0, %xmm0    # imm8 = 3 truncates; $1 would round down (floor)
    ret
This has one fewer logical operation (and one fewer move, because I chose
a better register assignment).
Ciao,
Michael.