PATCH: Add x86 integer intrinsics

H.J. Lu hjl.tools@gmail.com
Wed Jun 10 15:25:00 GMT 2009


On Wed, Jun 10, 2009 at 7:51 AM, Paolo Bonzini<paolo.bonzini@gmail.com> wrote:
> H.J. Lu wrote:
>>
>> On Wed, Jun 10, 2009 at 7:12 AM, Paolo Bonzini<paolo.bonzini@gmail.com>
>> wrote:
>>>>
>>>>       * config/i386/i386.c (ix86_builtins): Add IX86_BUILTIN_BSRSI,
>>>>       IX86_BUILTIN_BSRDI.  IX86_BUILTIN_RDPMC, IX86_BUILTIN_RDTSC.
>>>>       IX86_BUILTIN_RDTSCP.  IX86_BUILTIN_ROLQI, IX86_BUILTIN_ROLHI,
>>>>       IX86_BUILTIN_ROLSI, IX86_BUILTIN_ROLDI, IX86_BUILTIN_RORQI,
>>>>       IX86_BUILTIN_RORHI, IX86_BUILTIN_RORSI and IX86_BUILTIN_RORDI.
>>>
>>> Do you really need intrinsics for BSR and ROL/ROR, since we have ctz/clz
>>> and
>>> rotates are synthesized at fold time (so before inlining)?
>>
>> GCC can generate them directly, which is independent of the intrinsics.
>>
>>> Also, is BSF missing maybe?
>>
>> BSF is implemented with ctz builtins directly.
>
> Ah, okay, so BSR is needed.  I still don't understand the reason for ROL/ROR
> intrinsics, since a ROTATE_EXPR will do as well and
>
> static inline rol(int x, int y)
> {
>  return (x << y) | (x >> (32 - y));
> }
>
> will be folded directly to LROTATE_EXPR <x, y>.

Well, that isn't the case. At -O2, this test case, followed by the
assembly I got for it, shows the difference:

#include <x86intrin.h>

int
rol (int x, int y)
{
  return (x << y) | (x >> (32 - y));
}

int
rold (int x, int y)
{
  return __rold (x, y);
}

[hjl@gnu-6 intrin-1]$ cat r.s
        .file   "r.c"
        .text
        .p2align 4,,15
.globl rol
        .type   rol, @function
rol:
        movl    $32, %ecx
        movl    %edi, %eax
        subl    %esi, %ecx
        sarl    %cl, %eax
        movl    %esi, %ecx
        sall    %cl, %edi
        orl     %edi, %eax
        ret
        .size   rol, .-rol
        .p2align 4,,15
.globl rold
        .type   rold, @function
rold:
        movl    %edi, %eax
        movl    %esi, %ecx
        roll    %cl, %eax
        ret
        .size   rold, .-rold
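
One thing visible in the listing above is that rol uses sarl, an
arithmetic right shift, because x is a signed int; a sign-extending shift
cannot be one half of a rotate. The sketch below is an assumption about
what a rotate-friendly version of the idiom might look like, not something
compiled in this message: rol32 is a hypothetical name, the operands are
unsigned, and the count is masked so that a count of 0 or 32 does not
produce an out-of-range shift. Whether a given GCC version turns it into a
single roll would need to be checked in the generated assembly.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch: rotate a 32-bit value left.
   Unsigned operands make the right shift logical rather than arithmetic.
   Masking both counts keeps every shift in the range 0..31.  */
static inline uint32_t
rol32 (uint32_t x, unsigned int y)
{
  y &= 31;
  return (x << y) | (x >> ((32 - y) & 31));
}
```

Compiling this next to rold at -O2 and comparing the two functions in the
.s file would show whether the idiom and the intrinsic end up equivalent.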



-- 
H.J.


