PR 54168: Unnecessary narrowing in tree-ssa-forwprop pass?

Richard Biener richard.guenther@gmail.com
Tue Jan 14 14:25:00 GMT 2014


On Tue, Jan 14, 2014 at 3:21 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
> Moving to gcc, I accidentally sent it to gcc-patches previously...
>
>
> On 14/01/14 14:09, Richard Biener wrote:
>>
>> On Tue, Jan 14, 2014 at 3:03 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com>
>> wrote:
>>>
>>> Hi all,
>>>
>>> I'm looking into PR 54168 where we end up generating an unnecessary
>>> extend
>>> operation on arm.
>>>
>>> Given code:
>>>
>>> struct b2Body {
>>>      unsigned short flags;
>>>      int type;
>>> };
>>>
>>> static  _Bool IsAwake(struct b2Body *b)
>>> {
>>>      return (b->flags & 2) == 2;
>>> }
>>>
>>>
>>> int foo(struct b2Body *bA, struct b2Body *bB)
>>> {
>>>      int typeA = bA->type;
>>>      int typeB = bB->type;
>>>      _Bool activeA = IsAwake(bA) && typeA != 0;
>>>      _Bool activeB = IsAwake(bB) && typeB != 0;
>>>
>>>      if (!activeA && !activeB)
>>>      {
>>>          return 1;
>>>      }
>>>
>>>      return 0;
>>> }
>>>
>>> Compiled for arm-none-eabi with -O3 -march=armv7-a
>>>
>>> The inlined generated code for IsAwake contains the fragment:
>>>
>>>          ldrh    r0, [r1]
>>>          and     r0, r0, #2
>>>          uxth    r0, r0
>>>          cmp     r0, #0
>>>
>>> which contains a redundant extend, which also confuses combine and
>>> prevents
>>> the whole thing from being optimised into an ldrh ; ands sequence.
>>>
>>> Looking at the tree dumps I notice that after the forwprop pass the types
>>> of
>>> the operands in the _7 = _3 & 2; statement turn into short unsigned where
>>> before that pass they were just ints:
>>>
>>> IsAwake (struct b2Body * b)
>>> {
>>>    short unsigned int _3;
>>>    int _4;
>>>    _Bool _6;
>>>    short unsigned int _7;
>>>
>>>    <bb 2>:
>>>    _3 = b_2(D)->flags;
>>>    _4 = (int) _3;
>>>    _7 = _3 & 2;
>>>    _6 = _7 != 0;
>>>    return _6;
>>>
>>> }
>>>
>>>
>>> I believe the C standard expects the operation to be performed in int
>>> mode.
>>> Now, since this is a bitwise and operation with a known constant 2, the
>>> operation can be safely performed in unsigned short mode. However on
>>> word-based machines like arm this would introduce unnecessary extend
>>> operations down the line, as I believe is the case here.
>>
>> Though the variant using shorts is clearly shorter (in number of stmts)
>> and thus easier to optimize.  Am I correct that the issue in the end
>> is that _7 != 0 requires to extend _7?  & 2 is trivially performed without
>> any extension, no?
>
>
> If I look at the dump before forwprop, the number of statements is exactly
> the same, so it's not any shorter in that sense.

Well, it is - _4 = (int) _3; is dead, thus a zero-extension instruction
is removed.

> <Dump from before forwprop>
>
> IsAwake (struct b2Body * b)
> {
>   short unsigned int _3;
>   int _4;
>   int _5;
>   _Bool _6;
>
>
>   <bb 2>:
>   _3 = b_2(D)->flags;
>   _4 = (int) _3;
>   _5 = _4 & 2;
>   _6 = _5 != 0;
>   return _6;
>
> }
>
> Using shorts is not cheaper on an architecture like arm which is word-based.
> Just the fact that we're introducing shorts already implies we're going to
> have to extend somewhere.

But given the bit-and with '2' the extension can be removed, no?

Richard.

> Kyrill
>
>
>>
>> Richard.
>>
>>> Anyone have any insight on how to resolve this one?
>>>
>>> Thanks,
>>> Kyrill
>>>
>
>



More information about the Gcc-patches mailing list