PR 54168: Unnecessary narrowing in tree-ssa-forwprop pass?

Richard Earnshaw rearnsha@arm.com
Tue Jan 14 14:33:00 GMT 2014


On 14/01/14 14:25, Richard Biener wrote:
> On Tue, Jan 14, 2014 at 3:21 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>> Moving to gcc, I accidentally sent it to gcc-patches previously...
>>
>>
>> On 14/01/14 14:09, Richard Biener wrote:
>>>
>>> On Tue, Jan 14, 2014 at 3:03 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com>
>>> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I'm looking into PR 54168 where we end up generating an unnecessary
>>>> extend
>>>> operation on arm.
>>>>
>>>> Given code:
>>>>
>>>> struct b2Body {
>>>>      unsigned short flags;
>>>>      int type;
>>>> };
>>>>
>>>> static  _Bool IsAwake(struct b2Body *b)
>>>> {
>>>>      return (b->flags & 2) == 2;
>>>> }
>>>>
>>>>
>>>> int foo(struct b2Body *bA, struct b2Body *bB)
>>>> {
>>>>      int typeA = bA->type;
>>>>      int typeB = bB->type;
>>>>      _Bool activeA = IsAwake(bA) && typeA != 0;
>>>>      _Bool activeB = IsAwake(bB) && typeB != 0;
>>>>
>>>>      if (!activeA && !activeB)
>>>>      {
>>>>          return 1;
>>>>      }
>>>>
>>>>      return 0;
>>>> }
>>>>
>>>> Compiled for arm-none-eabi with -O3 -march=armv7-a
>>>>
>>>> The inlined generated code for IsAwake contains the fragment:
>>>>
>>>>          ldrh    r0, [r1]
>>>>          and     r0, r0, #2
>>>>          uxth    r0, r0
>>>>          cmp     r0, #0
>>>>
>>>> which contains a redundant extend, which also confuses combine and
>>>> prevents
>>>> the whole thing from being optimised into an ldrh ; ands sequence.
>>>>
>>>> Looking at the tree dumps I notice that after the forwprop pass the types
>>>> of
>>>> the operands in the _7 = _3 & 2; statement turn into short unsigned where
>>>> before that pass they were just ints:
>>>>
>>>> IsAwake (struct b2Body * b)
>>>> {
>>>>    short unsigned int _3;
>>>>    int _4;
>>>>    _Bool _6;
>>>>    short unsigned int _7;
>>>>
>>>>    <bb 2>:
>>>>    _3 = b_2(D)->flags;
>>>>    _4 = (int) _3;
>>>>    _7 = _3 & 2;
>>>>    _6 = _7 != 0;
>>>>    return _6;
>>>>
>>>> }
>>>>
>>>>
>>>> I believe the C standard expects the operation to be performed in int
>>>> mode.
>>>> Now, since this is a bitwise and operation with a known constant 2, the
>>>> operation can be safely performed in unsigned short mode. However on
>>>> word-based machines like arm this would introduce unnecessary extend
>>>> operations down the line, as I believe is the case here.
>>>
>>> Though the variant using shorts is clearly shorter (in number of stmts)
>>> and thus easier to optimize.  Am I correct that the issue in the end
>>> is that _7 != 0 requires to extend _7?  & 2 is trivially performed without
>>> any extension, no?
>>
>>
>> If I look at the dump before forwprop, the number of statements is exactly
>> the same, so it's not any shorter in that sense.
> 
> Well, it is - _4 = (int) _3; is dead, thus a zero-extension instruction
> is removed.
> 

That's a rather short-sighted definition of removed.  You've removed an
extension that:

a) can be merged with the preceding load
b) will have to be put back again anyway, when the _4 & 2 is expanded.

And finally, you end up with a second one when _5 != 0 is later expanded.

R.

>> <Dump from before forwprop>
>>
>> IsAwake (struct b2Body * b)
>> {
>>   short unsigned int _3;
>>   int _4;
>>   int _5;
>>   _Bool _6;
>>
>>
>>   <bb 2>:
>>   _3 = b_2(D)->flags;
>>   _4 = (int) _3;
>>   _5 = _4 & 2;
>>   _6 = _5 != 0;
>>   return _6;
>>
>> }
>>
>> Using shorts is not cheaper on an architecture like arm which is word-based.
>> Just the fact that we're introducing shorts already implies we're going to
>> have to extend somewhere.
> 
> But given the bit-and with '2' the extension can be removed, no?
> 
> Richard.
> 
>> Kyrill
>>
>>
>>>
>>> Richard.
>>>
>>>> Anyone have any insight on how to resolve this one?
>>>>
>>>> Thanks,
>>>> Kyrill
>>>>
>>
>>
> 




More information about the Gcc-patches mailing list