This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: ARM conditional instruction optimisation bug (feature?)
On 7/30/09, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On 7/30/09, Zoltán Kócsi <zoltan@bendor.com.au> wrote:
> > On the ARM every instruction can be executed conditionally. GCC very
> > cleverly uses this feature:
> >
> > int bar ( int x, int a, int b )
> > {
> >     if ( x )
> >         return a;
> >     else
> >         return b;
> > }
> >
> > compiles to:
> >
> > bar:
> > cmp r0, #0 // test x
> > movne r0, r1 // retval = 'a' if !0 ('ne')
> > moveq r0, r2 // retval = 'b' if 0 ('eq')
> > bx lr
> >
> > However, the following function:
> >
> > extern unsigned array[ 128 ];
> >
> > int foo( int x )
> > {
> >     int y;
> >
> >     y = array[ x & 127 ];
> >
> >     if ( x & 128 )
> >         y = 123456789 & ( y >> 2 );
> >     else
> >         y = 123456789 & y;
> >
> >     return y;
> > }
> >
> > compiled with gcc 4.4.0 using -Os, generates this:
> >
> > foo:
> > ldr r3, .L8
> > tst r0, #128
> > and r0, r0, #127
> > ldr r3, [r3, r0, asl #2]
> > ldrne r0, .L8+4 ***
> > ldreq r0, .L8+4 ***
> > movne r3, r3, asr #2
> > andne r0, r3, r0 ***
> > andeq r0, r3, r0 ***
> > bx lr
> > .L8:
> > .word array
> > .word 123456789
> >
> > Each pair of lines marked with *** does the same thing, one executing
> > if the condition holds, the other if the condition is the opposite.
> > That is, together they perform one unconditional instruction, except
> > that they use two instructions (and clocks) instead of one.
> >
> > Compiling with -O2 makes things even worse, because another issue hits:
> > gcc sometimes changes a "load constant" to a "generate the constant on
> > the fly" even when the latter is both slower and larger; other times it
> > chooses to load a constant even when it can easily (and more cheaply)
> > generate it from already available values. In this particular case it
> > decides to build the constant from pieces and combines that with
> > the "generate an unconditional instruction using two complementary
> > conditional instructions" method, resulting in this:
> >
> > foo:
> > ldr r3, .L8
> > tst r0, #128
> > and r0, r0, #127
> > ldr r0, [r3, r0, asl #2]
> > movne r0, r0, asr #2
> > bicne r0, r0, #-134217728
> > biceq r0, r0, #-134217728
> > bicne r0, r0, #10747904
> > biceq r0, r0, #10747904
> > bicne r0, r0, #12992
> > biceq r0, r0, #12992
> > bicne r0, r0, #42
> > biceq r0, r0, #42
> > bx lr
> > .L8:
> > .word array
> >
> > Should I report a bug?
>
> This looks like my bug PR21803 (gcc.gnu.org/PR21803). Can you check if
> the ce3 pass creates this code? (Compile with -fdump-rtl-all and look
> at the .ce3 dump and one dump before to see if the .ce3 pass created
> your funny sequence.)
>
> If your problem is indeed caused by the ce3 pass, you should add your
> problem to PR21803, change the "Component" field to "middle-end", and
> adjust the bug summary to make it clear that this is not ia64
> specific.
Oh, and you may also want to try my patch "crossjump_abstract.diff" in
PR20070, it solves problems like yours sometimes (if the sequence is
just right) by crossjumping earlier.
Ciao!
Steven
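
For readers following along, here is a rough C sketch of the two
observations in the quoted report. It is only an illustration, not
anything GCC emits or uses internally. foo() is copied from the report;
foo_factored(), and_via_bic() and check_all() are hypothetical names
introduced here, and the array is defined locally (with arbitrary fill
values) instead of being extern. foo_factored() is the hand-written
form in which only the shift is conditional and the AND is done once,
which is what merging each movne/moveq-style pair would amount to.
and_via_bic() mirrors the four BIC immediates from the -O2 listing and
shows that clearing those bits is the same as ANDing with 123456789.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the report's `extern unsigned array[128]`. */
unsigned array[128];

/* foo() as written in the report: the AND appears in both branches. */
int foo(int x)
{
    int y;

    y = array[x & 127];

    if (x & 128)
        y = 123456789 & (y >> 2);
    else
        y = 123456789 & y;

    return y;
}

/* Hypothetical hand-factored form: only the shift stays conditional. */
int foo_factored(int x)
{
    int y = array[x & 127];

    if (x & 128)
        y >>= 2;

    return y & 123456789;
}

/* Mirrors the four BIC immediates of the -O2 listing, applied once each. */
uint32_t and_via_bic(uint32_t v)
{
    v &= ~((uint32_t)-134217728);  /* bic #-134217728, clears 0xF8000000 */
    v &= ~((uint32_t)10747904);    /* bic #10747904,   clears 0x00A40000 */
    v &= ~((uint32_t)12992);       /* bic #12992,      clears 0x000032C0 */
    v &= ~((uint32_t)42);          /* bic #42,         clears 0x0000002A */
    return v;
}

/* Returns 1 if both claims hold for the sample data, 0 otherwise. */
int check_all(void)
{
    /* Arbitrary fill pattern (an assumption) that also sets high bits. */
    for (unsigned i = 0; i < 128; i++)
        array[i] = i * 2654435761u + 0x9E3779B9u;

    /* x in 0..255 covers every index and both states of bit 7. */
    for (int x = 0; x < 256; x++)
        if (foo(x) != foo_factored(x))
            return 0;

    for (unsigned i = 0; i < 128; i++)
        if (and_via_bic(array[i]) != (array[i] & 123456789u))
            return 0;

    return 1;
}
```

Calling check_all() from a small test program should return 1. Since
both versions of foo() perform the same conversions and the same shift,
they agree regardless of how the compiler treats right shifts of
negative values.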