Endian swapping with C code on ARM

The ARM architecture is capable of endian swapping a 32-bit value with 4
instructions. I've tried to coax gcc into generating them from the
following C code, but so far without luck.

unsigned int endian_swap (unsigned int x)
{
    unsigned int t;
    t = x ^ ((x << 16) | (x >> 16));    /* eor  r1, r0, r0, ror #16 */
    t &= ~0x00ff0000;                   /* bic  r1, r1, #0x00FF0000 */
    x = (x << 24) | (x >> 8);           /* mov  r0, r0, ror #8      */
    x ^= (t >> 8);                      /* eor  r0, r0, r1, lsr #8  */

    return x;
}

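For anyone who wants to convince themselves that the rotate/eor trick
really does reverse the bytes, here is a minimal sketch that compares it
against a plain byte-by-byte swap. The names endian_swap32,
swap_reference and swap_matches_reference are made up for this
illustration, and uint32_t is used instead of unsigned int purely to pin
the width to 32 bits.

```c
#include <stdint.h>

/* The routine from above, restated with a fixed 32-bit type. */
static uint32_t endian_swap32(uint32_t x)
{
    uint32_t t;
    t = x ^ ((x << 16) | (x >> 16));    /* x ^ ror(x, 16)          */
    t &= ~UINT32_C(0x00ff0000);         /* clear bits 16..23       */
    x = (x << 24) | (x >> 8);           /* ror(x, 8)               */
    x ^= (t >> 8);                      /* fix up the two bytes    */
    return x;
}

/* Hypothetical reference: move each byte to its mirrored position. */
static uint32_t swap_reference(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}

/* Returns 1 if both routines agree on a handful of sample inputs. */
static int swap_matches_reference(void)
{
    static const uint32_t samples[] = {
        0x00000000u, 0xffffffffu, 0x12345678u,
        0xdeadbeefu, 0x000000ffu, 0x80000001u
    };
    unsigned i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (endian_swap32(samples[i]) != swap_reference(samples[i]))
            return 0;
    }
    return 1;
}
```

Calling swap_matches_reference() from a small main() and checking that
it returns 1 is enough for a quick sanity test; it only spot-checks a few
values and is not an exhaustive proof.
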
gcc-3.3.3 with -Os -fomit-frame-pointer manages 5 instructions (not
counting the function return), i.e.

   0:   e1a03860        mov     r3, r0, ror #16
   4:   e0203003        eor     r3, r0, r3
   8:   e3c338ff        bic     r3, r3, #16711680       ; 0xff0000
   c:   e1a00460        mov     r0, r0, ror #8
  10:   e0200423        eor     r0, r0, r3, lsr #8
  14:   e1a0f00e        mov     pc, lr

gcc-3.4.0 with the same switches does almost the same thing but seems
to have regressed slightly when performing register allocation and so
requires 6 instructions, i.e.

   0:   e1a03000        mov     r3, r0
   4:   e1a00860        mov     r0, r0, ror #16
   8:   e0230000        eor     r0, r3, r0
   c:   e3c008ff        bic     r0, r0, #16711680       ; 0xff0000
  10:   e1a03463        mov     r3, r3, ror #8
  14:   e0230420        eor     r0, r3, r0, lsr #8
  18:   e1a0f00e        mov     pc, lr

It looks like the eor and the rotate (ror #16) are not being combined
into a single instruction when they could be (although eor and logical
shift right are...).
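
As a stopgap while the compiler does not combine the two, the
4-instruction sequence can be forced with GCC extended inline asm. This
is only a sketch, not a fix for gcc itself: the name endian_swap_asm is
made up, the asm path assumes GCC targeting ARM (not Thumb) state, and
any other target falls back to the plain C version.

```c
#if defined(__GNUC__) && defined(__arm__) && !defined(__thumb__)

/* Assumption: GCC, ARM state. Emits exactly the 4 instructions. */
static unsigned int endian_swap_asm(unsigned int x)
{
    unsigned int t;

    __asm__ ("eor %1, %0, %0, ror #16\n\t"
             "bic %1, %1, #0x00ff0000\n\t"
             "mov %0, %0, ror #8\n\t"
             "eor %0, %0, %1, lsr #8"
             : "+r" (x), "=&r" (t));    /* x is read and written, t is scratch */
    return x;
}

#else

/* Fallback for other targets: the same algorithm in portable C,
   assuming a 32-bit unsigned int. */
static unsigned int endian_swap_asm(unsigned int x)
{
    unsigned int t;

    t = x ^ ((x << 16) | (x >> 16));
    t &= ~0x00ff0000u;
    x = (x << 24) | (x >> 8);
    x ^= (t >> 8);
    return x;
}

#endif
```

The drawback is that the compiler can no longer schedule or constant-fold
across the asm block, which is why getting the combine done inside gcc
would still be preferable.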

Is it a big task to add a tweak for this to gcc?


