[PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2

Carrot Wei carrot@google.com
Tue Nov 16 05:04:00 GMT 2010


On Tue, Nov 9, 2010 at 3:16 AM, Paul Brook <paul@codesourcery.com> wrote:
>> 2. Why do we need it for Thumb mode, and not for ARM mode?
>>
>> In thumb mode, instruction and with constant is 32 bit, uxtb is 16
>> bit, with this enhancement we can save 2 bytes.
>> In ARM mode, all instructions are 32 bit, convert and to uxtb doesn't
>> help us, so we don't care which instruction is used.
>> So we need it for Thumb mode only.
>
> No. Look at the output.
> In ARM mode we generate uxtb without your patch.
> In thumb mode gcc 4.5 generated uxtb, so this is a recent regression. What
> caused this regression?
>
I've built a fresh gcc 4.5 compiler, and it actually can't generate uxtb in
thumb mode. I've also tried a compiler from each month from January 2009
until now; none of them generated uxtb in thumb mode, and all of
them generated uxtb in arm mode.

> The fact that this used to work (without andsi expander hacks), and still
> works in ARM mode, suggests there's something you're missing. Given there's
> clearly code somewhere capable of doing this transformation, it seems
> surprising that it's not already undoing your expander hack, and may do so in
> the future.
>

For arm mode, the rtl before combine is:

(insn 2 5 3 2 src/tp.c:2 (set (reg/v:SI 134 [ x ])
        (reg:SI 0 r0 [ x ])) 167 {*arm_movsi_insn} (expr_list:REG_DEAD (reg:SI 0 r0 [ x ])
        (nil)))

(insn 3 2 4 2 src/tp.c:2 (set (reg/v:SI 135 [ y ])
        (reg:SI 1 r1 [ y ])) 167 {*arm_movsi_insn} (expr_list:REG_DEAD (reg:SI 1 r1 [ y ])
        (nil)))

(note 4 3 7 2 NOTE_INSN_FUNCTION_BEG)

(insn 7 4 8 2 src/tp.c:2 (set (reg:SI 137)
        (and:SI (reg/v:SI 134 [ x ])
            (const_int 255 [0xff]))) 67 {*arm_andsi3_insn} (expr_list:REG_DEAD (reg/v:SI 134 [ x ])
        (nil)))

(insn 8 7 9 2 src/tp.c:2 (set (reg:SI 139)
        (const_int 65535 [0xffff])) 167 {*arm_movsi_insn} (nil))

(insn 9 8 10 2 src/tp.c:2 (set (reg:SI 138)
        (and:SI (reg/v:SI 135 [ y ])
            (reg:SI 139))) 67 {*arm_andsi3_insn} (expr_list:REG_DEAD (reg:SI 139)
        (expr_list:REG_DEAD (reg/v:SI 135 [ y ])
            (expr_list:REG_EQUAL (and:SI (reg/v:SI 135 [ y ])
                    (const_int 65535 [0xffff]))
                (nil)))))

...

In the combine pass, insn 2 and insn 7 are combined into a single unsigned
extend instruction, and similarly insns 3, 8 and 9 are combined into another
unsigned extend instruction. The result is:

(insn 7 4 8 2 src/tp.c:2 (set (reg:SI 137)
        (zero_extend:SI (reg:QI 0 r0 [ x ]))) 149 {*arm_zero_extendqisi2_v6} (expr_list:REG_DEAD (reg:SI 0 r0 [ x ])
        (nil)))

(note 8 7 9 2 NOTE_INSN_DELETED)

(insn 9 8 10 2 src/tp.c:2 (set (reg:SI 138)
        (zero_extend:SI (reg:HI 1 r1 [ y ]))) 144 {*arm_zero_extendhisi2_v6} (expr_list:REG_DEAD (reg:SI 1 r1 [ y ])
        (nil)))
...
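
As an aside, the combination is valid because masking with 0xff or 0xffff
gives the same value as zero-extending the low byte or halfword. A minimal
C sketch of that equivalence follows; the function names are made up for
illustration and are not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only.  */

/* What "and r0, r0, #255" computes.  */
uint32_t
mask_byte (uint32_t x)
{
  return x & 0xffu;
}

/* What uxtb computes: zero-extend the low 8 bits.  */
uint32_t
zero_extend_byte (uint32_t x)
{
  return (uint32_t) (uint8_t) x;
}

/* The halfword variants: "and" with 0xffff versus uxth.  */
uint32_t
mask_half (uint32_t x)
{
  return x & 0xffffu;
}

uint32_t
zero_extend_half (uint32_t x)
{
  return (uint32_t) (uint16_t) x;
}

/* Assert that both forms agree for one input value.  */
void
check_equivalence (uint32_t x)
{
  assert (mask_byte (x) == zero_extend_byte (x));
  assert (mask_half (x) == zero_extend_half (x));
}
```

Calling check_equivalence on any 32-bit value should pass, which is why
either instruction can be used and only the encoding size differs.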

In thumb mode, insns 2 and 7 fail to combine because the function
cant_combine_insn_p returns true for insn 2, which in turn happens because
CLASS_LIKELY_SPILLED_P returns true for register r0. The
implementation of CLASS_LIKELY_SPILLED_P is:

static bool
arm_class_likely_spilled_p (reg_class_t rclass)
{
  if ((TARGET_THUMB && rclass == LO_REGS)
      || rclass  == CC_REG)
    return true;

  return false;
}

The problem is the condition (TARGET_THUMB && rclass == LO_REGS). In
thumb2 we can use the same number of registers as in arm mode, so LO_REGS
should not be considered likely spilled. The condition should be changed to

    (TARGET_THUMB1 && rclass == LO_REGS)

After this change, gcc can generate the uxtb instruction.
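
To make the effect of the change concrete, here is a small self-contained
model of the old and new predicates. It is only a sketch: the enums and
function names below are stand-ins invented for this example, and the
"can't combine" check is reduced to the one condition discussed here,
not the full logic in combine.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for GCC internals; simplified and hypothetical.  */
enum reg_class_model { LO_REGS_M, GENERAL_REGS_M, CC_REG_M };
enum arm_mode_model { MODE_ARM, MODE_THUMB1, MODE_THUMB2 };

/* Old predicate: TARGET_THUMB holds for both Thumb-1 and Thumb-2.  */
bool
likely_spilled_old (enum arm_mode_model mode, enum reg_class_model rclass)
{
  bool target_thumb = (mode == MODE_THUMB1 || mode == MODE_THUMB2);
  return (target_thumb && rclass == LO_REGS_M) || rclass == CC_REG_M;
}

/* Proposed predicate: only Thumb-1 treats LO_REGS as likely spilled.  */
bool
likely_spilled_new (enum arm_mode_model mode, enum reg_class_model rclass)
{
  bool target_thumb1 = (mode == MODE_THUMB1);
  return (target_thumb1 && rclass == LO_REGS_M) || rclass == CC_REG_M;
}

/* Reduced model of the relevant check: a copy from a hard register is
   not combined when the register's class is likely spilled.  */
bool
cant_combine_copy_model (bool source_class_likely_spilled)
{
  return source_class_likely_spilled;
}

/* Thumb-2 with r0 in LO_REGS: blocked before the change, allowed after.  */
void
check_thumb2_change (void)
{
  assert (cant_combine_copy_model (likely_spilled_old (MODE_THUMB2, LO_REGS_M)));
  assert (!cant_combine_copy_model (likely_spilled_new (MODE_THUMB2, LO_REGS_M)));
}
```

In this model Thumb-1 behaviour and the CC_REG case are unchanged; only
the Thumb-2 answer for LO_REGS flips.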

But this is still not general enough, because combine needs two or
more instructions to work on, so the result actually depends on the
existence of the parameter copy:

(insn 2 5 3 2 src/tp.c:2 (set (reg/v:SI 134 [ x ])
        (reg:SI 0 r0 [ x ])) 167 {*arm_movsi_insn} (expr_list:REG_DEAD (reg:SI 0 r0 [ x ])
        (nil)))

If I modify the test case a little:

int tp(int x, int y)
{
  return ((x + 3) & 0xff) - (y & 0xffff);
}

The compiler always generates an "and" instruction in both arm and thumb
modes. But the enhanced expander can still handle this case.
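
The expander approach does not depend on a neighbouring insn because the
decision is made from the operands of the "and" alone. The following is a
rough model of that decision, not the code from the patch: the enum, the
function name and the exact condition (thumb2 and a constant mask of 255)
are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an expand-time choice; not the actual patch.  */
enum and_expansion_model { EMIT_AND, EMIT_ZERO_EXTEND_QI };

/* Pick zero-extension when the mask is the constant 255 in thumb2,
   otherwise fall back to a plain "and".  */
enum and_expansion_model
choose_and_expansion (bool thumb2, bool op2_is_const, long op2_value)
{
  if (thumb2 && op2_is_const && op2_value == 255)
    return EMIT_ZERO_EXTEND_QI;
  return EMIT_AND;
}

/* The (x + 3) & 0xff case: no parameter copy is involved, but the mask
   is still the constant 255, so the model picks the zero-extension.  */
void
check_modified_test_case (void)
{
  assert (choose_and_expansion (true, true, 0xff) == EMIT_ZERO_EXTEND_QI);
}
```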

thanks
Carrot


