[PATCH, i386]: Expand round(a) = sgn(a) * floor(fabs(a) + 0.5) using SSE4 ROUND insn

Uros Bizjak ubizjak@gmail.com
Fri Aug 19 22:20:00 GMT 2011


On Mon, Aug 15, 2011 at 5:25 PM, Michael Matz <matz@suse.de> wrote:

> On Mon, 15 Aug 2011, Michael Matz wrote:
>
>> > > .LFB0:
>> > >        .cfi_startproc
>> > >        movsd   .LC0(%rip), %xmm2
>> > >        movapd  %xmm0, %xmm1
>> > >        andpd   %xmm2, %xmm1
>> > >        andnpd  %xmm0, %xmm2
>> > >        addsd   .LC1(%rip), %xmm1
>> > >        roundsd $1, %xmm1, %xmm1
>> > >        orpd    %xmm2, %xmm1
>> > >        movapd  %xmm1, %xmm0
>> > >        ret
>> >
>> > Hm, why do we need the sign-copy?  If I read the docs correctly
>> > we can simply use roundsd directly, no?
>>
>> round-half-away-from-zero breaks your neck.  round[ps][sd] only supports
>> the usual four IEEE rounding modes.
>
> But, you should be able to apply the sign to the 0.5, which wouldn't
> require building the absolute value of input:
>
> round(x) = trunc(x + (copysign (0.5, x)))
>
> which should roughly be expanded to:
>
>       movsd   signbits(%rip), %xmm1
>       andpd   %xmm0, %xmm1
>       movsd   nextof0.5(%rip), %xmm2
>       orpd    %xmm1, %xmm2
>       addsd   %xmm2, %xmm0
>       roundsd $3, %xmm0, %xmm0
>       ret
>
> Which has one fewer logical operation (and one fewer move, because I chose
> a better register assignment).

Thanks for the suggestion, I will implement and test it ASAP.

Uros.


