[patch] misc bswap fixes
Uros Bizjak
ubizjak@gmail.com
Mon Nov 6 21:22:00 GMT 2006
Eric Christopher wrote:
>> +(define_insn "bswapsi2"
>> + [(set (match_operand:SI 0 "register_operand" "=r")
>> + (bswap:SI (match_operand:SI 1 "register_operand" "0")))
>> + (clobber (reg:CC FLAGS_REG))]
>> + "!TARGET_64BIT && TARGET_BSWAP"
>> + "bswap\t%1"
>>
>> Does this patern need to be disabled on 64 bits? IMO it can still be
>> used with 32bit argument.
>>
>
> Not as far as I know, bswap operates on the entire register and
> registers are 64-bit
> on x86_64.
>
Not really. The operating mode depends on the name of the register:
4004b0: 0f cf bswap %edi
4004c0: 48 0f cf bswap %rdi
The later form is REX extended form, operating on REX register.
I suggest patterns to be changed to [please note %k modifier in
bswapsi2 pattern, it will append "d" suffix on REX registers]:
(define_insn "bswapsi2"
[(set (match_operand:SI 0 "register_operand" "=r")
(bswap:SI (match_operand:SI 1 "register_operand" "0")))
(clobber (reg:CC FLAGS_REG))]
"TARGET_BSWAP"
"bswap\t%k0"
[(set_attr "prefix_0f" "1")])
(define_insn "bswapdi2"
[(set (match_operand:DI 0 "register_operand" "=r")
(bswap:DI (match_operand:DI 1 "register_operand" "0")))
(clobber (reg:CC FLAGS_REG))]
"TARGET_64BIT && TARGET_BSWAP"
"bswap\t%0"
[(set_attr "prefix_0f" "1")])
These patterns will produce correct insns, as can be checked by
following testcase:
--cut here--
unsigned long test_long (unsigned long xx)
{
return __builtin_bswap64 (xx);
}
unsigned int test_int (unsigned int xx)
{
return __builtin_bswap32 (xx);
}
int main()
{
unsigned long xx = 0x1122334455667788;
unsigned int yy = 0x11223344;
printf("%#lx -> %#lx\n", xx, test_long (xx));
printf("%#lx -> %#lx\n", yy, test_int (yy));
return 0;
}
./a.out
0x1122334455667788 -> 0x8877665544332211
0x11223344 -> 0x44332211
objdump -S
00000000004004b0 <test_int>:
4004b0: 0f cf bswap %edi
4004b2: 89 f8 mov %edi,%eax
4004b4: c3 retq
...
00000000004004c0 <test_long>:
4004c0: 48 0f cf bswap %rdi
4004c3: 48 89 f8 mov %rdi,%rax
4004c6: c3 retq
...
BTW: You forgot to add appropriate "type" attribute in order to
calculate correct length. ATM, "type" defaults to unknown, resulting in:
bswap %edi # 11 bswapsi2 [length = 16]
and
bswap %rdi # 8 bswapdi2 [length = 16]
BTW2: Another problem in this area is "automatic" generation of 16bit
data swap (it is claimed in linux's byteswap.h that gcc can generate
__bswap16 equivalent by iself by converting open-coded C insn to rotate
insn). This is only true in part:
--cut here--
unsigned short bad(unsigned short a)
{
return ((a & 0xff00) >> 8 | (a & 0x00ff) << 8);
}
unsigned short good(unsigned short a)
{
return (a >> 8 | a << 8);
}
--cut here--
gcc -O2 -S -m32 -fomit-frame-pointer
bad:
movzwl 4(%esp), %edx
movl %edx, %eax
sall $8, %eax
shrl $8, %edx
orl %edx, %eax
movzwl %ax, %eax
ret
good:
movzwl 4(%esp), %eax
rolw $8, %ax
movzwl %ax, %eax
ret
IMO both forms should produce equal asm.
>> BTW: I wonder, why there is no bswaphi2 pattern. Intel recommends xchg
>> instruction to swap bytes in HImode argument. The relevant pattern
>> could be defined like:
>>
>
> Mostly because I hadn't done it yet. There's no support for
> builtin_bswap16 yet.
> If you feel like supporting writing the rest of the support...
No problem, I'll do it as a follow-up to your patch.
Uros.
More information about the Gcc-patches
mailing list