[patch] misc bswap fixes

Uros Bizjak ubizjak@gmail.com
Mon Nov 6 21:22:00 GMT 2006


Eric Christopher wrote:

>> +(define_insn "bswapsi2"
>> +  [(set (match_operand:SI 0 "register_operand" "=r")
>> +       (bswap:SI (match_operand:SI 1 "register_operand" "0")))
>> +   (clobber (reg:CC FLAGS_REG))]
>> +  "!TARGET_64BIT && TARGET_BSWAP"
>> +  "bswap\t%1"
>>
>> Does this patern need to be disabled on 64 bits? IMO it can still be
>> used with 32bit argument.
>>
>
> Not as far as I know, bswap operates on the entire register and  
> registers are 64-bit
> on x86_64.
>
Not really. The operating mode depends on the name of the register:
  4004b0:       0f cf                   bswap  %edi
  4004c0:       48 0f cf                bswap  %rdi

The later form is REX extended form, operating on REX register.

I suggest patterns to be  changed to [please note %k modifier in 
bswapsi2 pattern, it will append "d" suffix on REX registers]:

(define_insn "bswapsi2"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (bswap:SI (match_operand:SI 1 "register_operand" "0")))
   (clobber (reg:CC FLAGS_REG))]
  "TARGET_BSWAP"
  "bswap\t%k0"
  [(set_attr "prefix_0f" "1")])

(define_insn "bswapdi2"
  [(set (match_operand:DI 0 "register_operand" "=r")
        (bswap:DI (match_operand:DI 1 "register_operand" "0")))
   (clobber (reg:CC FLAGS_REG))]
  "TARGET_64BIT && TARGET_BSWAP"
  "bswap\t%0"
  [(set_attr "prefix_0f" "1")])

These patterns will produce correct insns, as can be checked by 
following testcase:
--cut here--
unsigned long test_long (unsigned long xx)
{
        return __builtin_bswap64 (xx);
}

unsigned int test_int (unsigned int xx)
{
        return __builtin_bswap32 (xx);
}

int main()
{
  unsigned long xx = 0x1122334455667788;
  unsigned int yy = 0x11223344;

  printf("%#lx -> %#lx\n", xx, test_long (xx));
  printf("%#lx -> %#lx\n", yy, test_int (yy));

  return 0;
}

./a.out
0x1122334455667788 -> 0x8877665544332211
0x11223344 -> 0x44332211

objdump -S
00000000004004b0 <test_int>:
  4004b0:       0f cf                   bswap  %edi
  4004b2:       89 f8                   mov    %edi,%eax
  4004b4:       c3                      retq
...
00000000004004c0 <test_long>:
  4004c0:       48 0f cf                bswap  %rdi
  4004c3:       48 89 f8                mov    %rdi,%rax
  4004c6:       c3                      retq
...

BTW: You forgot to add appropriate "type" attribute in order to 
calculate correct length. ATM, "type" defaults to unknown, resulting in:
        bswap   %edi    # 11    bswapsi2        [length = 16]
and
        bswap   %rdi    # 8     bswapdi2        [length = 16]

BTW2: Another problem in this area is "automatic" generation of 16bit 
data swap (it is claimed in linux's byteswap.h that gcc can generate 
__bswap16 equivalent by iself by converting open-coded C insn to rotate 
insn). This is only true in part:

--cut here--
unsigned short bad(unsigned short a)
{
        return ((a & 0xff00) >> 8 | (a & 0x00ff) << 8);
}

unsigned short good(unsigned short a)
{
        return (a >> 8 | a << 8);
}
--cut here--

gcc -O2 -S -m32 -fomit-frame-pointer
bad:
        movzwl  4(%esp), %edx
        movl    %edx, %eax
        sall    $8, %eax
        shrl    $8, %edx
        orl     %edx, %eax
        movzwl  %ax, %eax
        ret

good:
        movzwl  4(%esp), %eax
        rolw    $8, %ax
        movzwl  %ax, %eax
        ret

IMO both forms should produce equal asm.

>> BTW: I wonder, why there is no bswaphi2 pattern. Intel recommends xchg
>> instruction to swap bytes in HImode argument. The relevant pattern
>> could be defined like:
>>
>
> Mostly because I hadn't done it yet. There's no support for  
> builtin_bswap16 yet.
> If you feel like supporting writing the rest of the support...

No problem, I'll do it as a follow-up to your patch.

Uros.



More information about the Gcc-patches mailing list