[PATCH, i386]: Expand round(a) = sgn(a) * floor(fabs(a) + 0.5) using SSE4 ROUND insn
Uros Bizjak
ubizjak@gmail.com
Fri Aug 19 22:20:00 GMT 2011
On Mon, Aug 15, 2011 at 5:25 PM, Michael Matz <matz@suse.de> wrote:
> On Mon, 15 Aug 2011, Michael Matz wrote:
>
>> > > .LFB0:
>> > > .cfi_startproc
>> > > movsd .LC0(%rip), %xmm2
>> > > movapd %xmm0, %xmm1
>> > > andpd %xmm2, %xmm1
>> > > andnpd %xmm0, %xmm2
>> > > addsd .LC1(%rip), %xmm1
>> > > roundsd $1, %xmm1, %xmm1
>> > > orpd %xmm2, %xmm1
>> > > movapd %xmm1, %xmm0
>> > > ret
>> >
>> > Hm, why do we need the sign-copy? If I read the docs correctly
>> > we can simply use roundsd directly, no?
>>
>> round-half-away-from-zero breaks your neck. round[ps][sd] only supports
>> the usual four IEEE rounding modes.
>
> But, you should be able to apply the sign to the 0.5, which wouldn't
> require building the absolute value of input:
>
> round(x) = trunc(x + (copysign (0.5, x)))
>
> which should roughly be expanded to:
>
> movsd signbits(%rip), %xmm1
> andpd %xmm0, %xmm1
> movsd nextof0.5(%rip), %xmm2
> orpd %xmm1, %xmm2
> addsd %xmm2, %xmm0
> roundsd $3, %xmm0, %xmm0
> ret
>
> Which has one logical operation less (and one move because I chose a more
> optimal register assignment).
Thanks for the suggestion, I will implement and test it ASAP.
Uros.