This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PR 54168: Unnecessary narrowing in tree-ssa-forwprop pass?


Moving to gcc, I accidentally sent it to gcc-patches previously...

On 14/01/14 14:09, Richard Biener wrote:
On Tue, Jan 14, 2014 at 3:03 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
Hi all,

I'm looking into PR 54168 where we end up generating an unnecessary extend
operation on arm.

Given code:

struct b2Body {
     unsigned short flags;
     int type;
};

static  _Bool IsAwake(struct b2Body *b)
{
     return (b->flags & 2) == 2;
}


int foo(struct b2Body *bA, struct b2Body *bB)
{
     int typeA = bA->type;
     int typeB = bB->type;
     _Bool activeA = IsAwake(bA) && typeA != 0;
     _Bool activeB = IsAwake(bB) && typeB != 0;

     if (!activeA && !activeB)
     {
         return 1;
     }

     return 0;
}

Compiled for arm-none-eabi with -O3 -march=armv7-a

The inlined generated code for IsAwake contains the fragment:

         ldrh    r0, [r1]
         and     r0, r0, #2
         uxth    r0, r0
         cmp     r0, #0

which contains a redundant extend, which also confuses combine and prevents
the whole thing from being optimised into an ldrh ; ands sequence.

Looking at the tree dumps I notice that after the forwprop pass the types of
the operands in the _7 = _3 & 2; statement turn into short unsigned where
before that pass they were just ints:

IsAwake (struct b2Body * b)
{
   short unsigned int _3;
   int _4;
   _Bool _6;
   short unsigned int _7;

   <bb 2>:
   _3 = b_2(D)->flags;
   _4 = (int) _3;
   _7 = _3 & 2;
   _6 = _7 != 0;
   return _6;

}


I believe the C standard expects the operation to be performed in int mode.
Now, since this is a bitwise and operation with a known constant 2, the
operation can be safely performed in unsigned short mode. However on
word-based machines like arm this would introduce unnecessary extend
operations down the line, as I believe is the case here.
Though the variant using shorts is clearly shorter (in number of stmts)
and thus easier to optimize.  Am I correct that the issue in the end
is that _7 != 0 requires to extend _7?  & 2 is trivially performed without
any extension, no?

If I look at the dump before forwprop, the number of statements is exactly the same, so it's not any shorter in that sense.

<Dump from before forwprop>
IsAwake (struct b2Body * b)
{
  short unsigned int _3;
  int _4;
  int _5;
  _Bool _6;

  <bb 2>:
  _3 = b_2(D)->flags;
  _4 = (int) _3;
  _5 = _4 & 2;
  _6 = _5 != 0;
  return _6;

}

Using shorts is not cheaper on an architecture like arm which is word-based. Just the fact that we're introducing shorts already implies we're going to have to extend somewhere.

Kyrill


Richard.

Anyone have any insight on how to resolve this one?

Thanks,
Kyrill




Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]