Bug 9663 - [arm] gcc-20030127 misses an optimization opportunity
: [arm] gcc-20030127 misses an optimization opportunity
Status: RESOLVED FIXED
Product: gcc
Classification: Unclassified
Component: target
: 3.3
: P3 enhancement
: ---
Assigned To: Not yet assigned to anyone
: http://gcc.gnu.org/ml/gcc-patches/200...
: missed-optimization, patch
:
:
  Show dependency treegraph
 
Reported: 2003-02-11 17:06 UTC by gertom
Modified: 2009-03-13 09:38 UTC (History)
3 users (show)

See Also:
Host:
Target: arm-elf
Build:
Known to work:
Known to fail:
Last reconfirmed: 2004-11-06 17:43:44


Attachments
patch for extending move and compare parallelization (619 bytes, application/x-compressed)
2003-06-03 13:38 UTC, Gábor Lóki
Details

Note You need to log in before you can comment on or make changes to this bug.
Description gertom 2003-02-11 17:06:00 UTC
The following two lines:

cmp r0, #0
mov lr, r0

should be replaced with:

subs lr, r0, #0

in the asm output.

20020503-1.i:
---
# 1 "20020503-1.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "20020503-1.c"



 
void abort (void);
static char *
inttostr (long i, char buf[128])
{
  unsigned long ui = i;
  char *p = buf + 127;
  *p = '\0';
  if (i < 0)
    ui = -ui;
  do
    *--p = '0' + ui % 10;
  while ((ui /= 10) != 0);
  if (i < 0)
    *--p = '-';
  return p;
}

int
main ()
{
  char buf[128], *p;

  p = inttostr (-1, buf);
  if (*p != '-')
    abort ();
  return 0;
}

Release:
gcc version 3.3 20030127 (prerelease)

Environment:
Built on: Linux 2.4.20 i686 unknown
Configured with: /home/gertom/gcc/src/gcc-20030127/configure --target=arm-elf --prefix=/home/gertom/gcc/build/install-20030127-arm-elf-orgn --enable-target-optspace --with-newlib --with-headers --disable-nls --disable-threads --disable-shared --disable-libgcj --disable-multilib --with-gnu-as --with-gnu-ld --enable-languages=c,c++
Thread model: single

How-To-Repeat:
just compile with -O2
Comment 1 gertom 2003-02-11 17:06:00 UTC
Fix:
See "PATCH: new peephole2 in arm.md" at http://gcc.gnu.org/ml/gcc-patches/2003-02/msg00204.html
Comment 2 Dara Hazeghi 2003-05-26 20:55:46 UTC
Hello,

this problem is still present on gcc 3.3 branch and mainline (20030509). Could you resend the 
patch to gcc-patches, and cc: one of the arm-maintainers? Also, can you attach the patch to this 
PR, so it doesn't get lost? Thanks,

Dara
Comment 3 Andrew Pinski 2003-05-26 20:57:05 UTC
See Dara's question.
Comment 4 Dara Hazeghi 2003-06-02 05:30:23 UTC
Shouldn't be in waiting. Confirmed.
Comment 5 Gábor Lóki 2003-06-03 13:38:24 UTC
Created attachment 4155 [details]
patch for extending move and compare parallelization

>Also, can you attach the patch to this PR, so it doesn't get lost?
I'am done with it.

Regards,
  Gabor Loki
Comment 6 Dara Hazeghi 2003-12-10 00:12:19 UTC
Confirmed still not present on mainline. Richard, would you mind looking at the
patch included 
here? Thanks.
Comment 7 Andrew Pinski 2004-05-21 03:54:16 UTC
Note you have to do with -fno-inline now on the mainline as the function is
inlined at -O2.
Comment 8 Ramana Radhakrishnan 2009-02-08 05:17:30 UTC
(In reply to comment #7)
> Note you have to do with -fno-inline now on the mainline as the function is
> inlined at -O2.
> 

It looks as though this is fixed in 4.3 and mainline today. I checked with 4.1
and saw that the problem existed in 4.1 


Looking at the assembly generated for the function, I no longer see a cmp and
mov as reported in the bug report. I see similar code generated in 4.1 but no
longer in 4.3 or 4.4. I see subs generated for 4.3 and 4.4 in the loop kernel
as of version r143940 for 4.3 and 144002 for mainline. 

Here is the snippet of code from 4.1, 4.3 and 4.4 as given below.


4.1 

.L8:
        ldr     r3, .L12
        umull   r1, r2, r3, ip
        mov     r2, r2, lsr #3
        mov     r3, r2, asl #1
        mov     r1, r2, asl #3
        add     r3, r3, r1
        rsb     r3, r3, ip
        add     r3, r3, #48
        cmp     r2, #0            ---- Insns from original bug report.
        mov     ip, r2            ----
        strb    r3, [r0, #-1]!
        bne     .L8



4.3
.L5:
    umull    r2, r3, r5, ip
    mov    r3, r3, lsr #3
    mov    r2, r3, asl #1
    mov    r1, r3, asl #3
    add    r2, r2, r1
    rsb    r2, r2, ip
    add    r2, r2, #48
    subs    ip, r3, #0
    strb    r2, [r0, #-1]!
    bne    .L5





4.4 

  .L5:
    umull    r1, r2, r4, ip
    mov    r2, r2, lsr #3
    add    r1, r2, r2, asl #2
    sub    ip, ip, r1, asl #1
    add    r1, ip, #48
    subs    ip, r2, #0
    strb    r1, [r0, #-1]!
    bne    .L5
Comment 9 Richard Earnshaw 2009-03-13 09:38:15 UTC
Now believed fixed.