This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PR 54168: Unnecessary narrowing in tree-ssa-forwprop pass?
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Kyrill Tkachov <kyrylo dot tkachov at arm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 14 Jan 2014 15:09:25 +0100
- Subject: Re: PR 54168: Unnecessary narrowing in tree-ssa-forwprop pass?
- Authentication-results: sourceware.org; auth=none
- References: <52D543C9 dot 4060105 at arm dot com>
On Tue, Jan 14, 2014 at 3:03 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
> Hi all,
>
> I'm looking into PR 54168 where we end up generating an unnecessary extend
> operation on arm.
>
> Given code:
>
> struct b2Body {
> unsigned short flags;
> int type;
> };
>
> static _Bool IsAwake(struct b2Body *b)
> {
> return (b->flags & 2) == 2;
> }
>
>
> int foo(struct b2Body *bA, struct b2Body *bB)
> {
> int typeA = bA->type;
> int typeB = bB->type;
> _Bool activeA = IsAwake(bA) && typeA != 0;
> _Bool activeB = IsAwake(bB) && typeB != 0;
>
> if (!activeA && !activeB)
> {
> return 1;
> }
>
> return 0;
> }
>
> Compiled for arm-none-eabi with -O3 -march=armv7-a
>
> The inlined generated code for IsAwake contains the fragment:
>
> ldrh r0, [r1]
> and r0, r0, #2
> uxth r0, r0
> cmp r0, #0
>
> which contains a redundant extend, which also confuses combine and prevents
> the whole thing from being optimised into an ldrh ; ands sequence.
>
> Looking at the tree dumps I notice that after the forwprop pass the types of
> the operands in the _7 = _3 & 2; statement turn into short unsigned where
> before that pass they were just ints:
>
> IsAwake (struct b2Body * b)
> {
> short unsigned int _3;
> int _4;
> _Bool _6;
> short unsigned int _7;
>
> <bb 2>:
> _3 = b_2(D)->flags;
> _4 = (int) _3;
> _7 = _3 & 2;
> _6 = _7 != 0;
> return _6;
>
> }
>
>
> I believe the C standard expects the operation to be performed in int mode.
> Now, since this is a bitwise and operation with a known constant 2, the
> operation can be safely performed in unsigned short mode. However on
> word-based machines like arm this would introduce unnecessary extend
> operations down the line, as I believe is the case here.
Though the variant using shorts is clearly shorter (in number of stmts)
and thus easier to optimize. Am I correct that the issue in the end
is that _7 != 0 requires to extend _7? & 2 is trivially performed without
any extension, no?
Richard.
> Anyone have any insight on how to resolve this one?
>
> Thanks,
> Kyrill
>