PATCH: Add x86 integer intrinsics
H.J. Lu
hjl.tools@gmail.com
Wed Jun 10 15:25:00 GMT 2009
On Wed, Jun 10, 2009 at 7:51 AM, Paolo Bonzini<paolo.bonzini@gmail.com> wrote:
> H.J. Lu wrote:
>>
>> On Wed, Jun 10, 2009 at 7:12 AM, Paolo Bonzini<paolo.bonzini@gmail.com>
>> wrote:
>>>>
>>>> * config/i386/i386.c (ix86_builtins): Add IX86_BUILTIN_BSRSI,
>>>> IX86_BUILTIN_BSRDI, IX86_BUILTIN_RDPMC, IX86_BUILTIN_RDTSC,
>>>> IX86_BUILTIN_RDTSCP, IX86_BUILTIN_ROLQI, IX86_BUILTIN_ROLHI,
>>>> IX86_BUILTIN_ROLSI, IX86_BUILTIN_ROLDI, IX86_BUILTIN_RORQI,
>>>> IX86_BUILTIN_RORHI, IX86_BUILTIN_RORSI and IX86_BUILTIN_RORDI.
>>>
>>> Do you really need intrinsics for BSR and ROL/ROR, since we have ctz/clz
>>> and rotates are synthesized at fold time (so before inlining)?
>>
>> GCC can generate them directly; that is independent of the intrinsics.
>>
>>> Also, is BSF missing maybe?
>>
>> BSF is implemented with ctz builtins directly.
>
> Ah, okay, so BSR is needed. I still don't understand the reason for ROL/ROR
> intrinsics, since a ROTATE_EXPR will do as well and
>
> static inline int rol(int x, int y)
> {
>   return (x << y) | (x >> (32 - y));
> }
>
> will be folded directly to LROTATE_EXPR <x, y>.
Well, that isn't the case. At -O2, I got:
#include <x86intrin.h>

int
rol (int x, int y)
{
  return (x << y) | (x >> (32 - y));
}

int
rold (int x, int y)
{
  return __rold (x, y);
}
[hjl@gnu-6 intrin-1]$ cat r.s
	.file	"r.c"
	.text
	.p2align 4,,15
.globl rol
	.type	rol, @function
rol:
	movl	$32, %ecx
	movl	%edi, %eax
	subl	%esi, %ecx
	sarl	%cl, %eax
	movl	%esi, %ecx
	sall	%cl, %edi
	orl	%edi, %eax
	ret
	.size	rol, .-rol
	.p2align 4,,15
.globl rold
	.type	rold, @function
rold:
	movl	%edi, %eax
	movl	%esi, %ecx
	roll	%cl, %eax
	ret
	.size	rold, .-rold
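
A likely reason the rol function above is not turned into a rotate: x is a
signed int, so x >> (32 - y) is an arithmetic shift (the sarl in the listing),
and the expression is not a rotate when x is negative. A minimal sketch of the
same idiom on an unsigned type follows. The name rol32 is hypothetical, and
masking the shift counts is an added assumption to keep y == 0 well defined.
Whether a given GCC version folds this into a single roll should be checked
against the generated assembly.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch: rotate left by y bits.
   x is unsigned, so the right shift is logical rather than arithmetic,
   and the expression is a true rotate for every input.  */
static inline uint32_t
rol32 (uint32_t x, unsigned int y)
{
  y &= 31;                                   /* keep the count in 0..31 */
  return (x << y) | (x >> ((32 - y) & 31));  /* y == 0 yields x | x == x */
}
```

For example, rol32 (0x80000001u, 1) moves the top bit around to bit 0 and
yields 0x00000003.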
--
H.J.