This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch] misc bswap fixes


Eric Christopher wrote:

+(define_insn "bswapsi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+       (bswap:SI (match_operand:SI 1 "register_operand" "0")))
+   (clobber (reg:CC FLAGS_REG))]
+  "!TARGET_64BIT && TARGET_BSWAP"
+  "bswap\t%1"

Does this patern need to be disabled on 64 bits? IMO it can still be
used with 32bit argument.


Not as far as I know, bswap operates on the entire register and registers are 64-bit
on x86_64.


Not really. The operating mode depends on the name of the register:
 4004b0:       0f cf                   bswap  %edi
 4004c0:       48 0f cf                bswap  %rdi

The later form is REX extended form, operating on REX register.

I suggest patterns to be changed to [please note %k modifier in bswapsi2 pattern, it will append "d" suffix on REX registers]:

(define_insn "bswapsi2"
 [(set (match_operand:SI 0 "register_operand" "=r")
       (bswap:SI (match_operand:SI 1 "register_operand" "0")))
  (clobber (reg:CC FLAGS_REG))]
 "TARGET_BSWAP"
 "bswap\t%k0"
 [(set_attr "prefix_0f" "1")])

(define_insn "bswapdi2"
 [(set (match_operand:DI 0 "register_operand" "=r")
       (bswap:DI (match_operand:DI 1 "register_operand" "0")))
  (clobber (reg:CC FLAGS_REG))]
 "TARGET_64BIT && TARGET_BSWAP"
 "bswap\t%0"
 [(set_attr "prefix_0f" "1")])

These patterns will produce correct insns, as can be checked by following testcase:
--cut here--
unsigned long test_long (unsigned long xx)
{
return __builtin_bswap64 (xx);
}


unsigned int test_int (unsigned int xx)
{
       return __builtin_bswap32 (xx);
}

int main()
{
 unsigned long xx = 0x1122334455667788;
 unsigned int yy = 0x11223344;

 printf("%#lx -> %#lx\n", xx, test_long (xx));
 printf("%#lx -> %#lx\n", yy, test_int (yy));

 return 0;
}

./a.out
0x1122334455667788 -> 0x8877665544332211
0x11223344 -> 0x44332211

objdump -S
00000000004004b0 <test_int>:
 4004b0:       0f cf                   bswap  %edi
 4004b2:       89 f8                   mov    %edi,%eax
 4004b4:       c3                      retq
...
00000000004004c0 <test_long>:
 4004c0:       48 0f cf                bswap  %rdi
 4004c3:       48 89 f8                mov    %rdi,%rax
 4004c6:       c3                      retq
...

BTW: You forgot to add appropriate "type" attribute in order to calculate correct length. ATM, "type" defaults to unknown, resulting in:
bswap %edi # 11 bswapsi2 [length = 16]
and
bswap %rdi # 8 bswapdi2 [length = 16]


BTW2: Another problem in this area is "automatic" generation of 16bit data swap (it is claimed in linux's byteswap.h that gcc can generate __bswap16 equivalent by iself by converting open-coded C insn to rotate insn). This is only true in part:

--cut here--
unsigned short bad(unsigned short a)
{
       return ((a & 0xff00) >> 8 | (a & 0x00ff) << 8);
}

unsigned short good(unsigned short a)
{
       return (a >> 8 | a << 8);
}
--cut here--

gcc -O2 -S -m32 -fomit-frame-pointer
bad:
       movzwl  4(%esp), %edx
       movl    %edx, %eax
       sall    $8, %eax
       shrl    $8, %edx
       orl     %edx, %eax
       movzwl  %ax, %eax
       ret

good:
       movzwl  4(%esp), %eax
       rolw    $8, %ax
       movzwl  %ax, %eax
       ret

IMO both forms should produce equal asm.

BTW: I wonder, why there is no bswaphi2 pattern. Intel recommends xchg
instruction to swap bytes in HImode argument. The relevant pattern
could be defined like:


Mostly because I hadn't done it yet. There's no support for builtin_bswap16 yet.
If you feel like supporting writing the rest of the support...

No problem, I'll do it as a follow-up to your patch.


Uros.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]