[Bug c/34737] New: missed optimization, foo(p); p++ is better than foo(p++)

wvangulik at xs4all dot nl gcc-bugzilla@gcc.gnu.org
Fri Jan 11 08:58:00 GMT 2008


Consider the following:

char *x;
volatile int y;

void foo(char *p)
{
    y += *p;
}

void main(void)
{
    char *p1 = x;
    foo(p1++);
    foo(p1++);
    foo(p1++);
    foo(p1++);
    foo(p1++);
    foo(p1++);
    foo(p1++);
    foo(p1++);
    foo(p1++);
    foo(p1++);
}

For the AVR target this generates poor code: the pointer is kept in two
call-saved register pairs (r14/r15 and r16/r17) that alternate between calls.

/* prologue: frame size=0 */
    push r14
    push r15
    push r16
    push r17
/* prologue end (size=4) */
    lds r24,x
    lds r25,(x)+1
    movw r16,r24
    subi r16,lo8(-(1))
    sbci r17,hi8(-(1))
    call foo
    movw r14,r16
    sec
    adc r14,__zero_reg__
    adc r15,__zero_reg__
    movw r24,r16
    call foo
    movw r16,r14
    subi r16,lo8(-(1))
    sbci r17,hi8(-(1))
    movw r24,r14
    call foo
etc.

The result gets much better when the calls are written as "foo(p); p++;":

/* prologue: frame size=0 */
        push r16
        push r17
/* prologue end (size=2) */
        movw r16,r24
        call foo
        subi r16,lo8(-(1))
        sbci r17,hi8(-(1))
        movw r24,r16
        call foo
        subi r16,lo8(-(1))
        sbci r17,hi8(-(1))
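
The source for this second variant is not shown above. A minimal sketch of
what it presumably looked like follows, shortened to four calls. The caller
name walk_separate and the choice to pass the pointer as an argument are
assumptions made for illustration (the listing suggests the pointer arrives in
r24/r25); foo and y are as in the original example.

```c
#include <assert.h>
#include <stddef.h>

volatile int y;

/* Same callee as in the report: add the pointed-to byte to y. */
void foo(char *p)
{
    y += *p;
}

/* Hypothetical caller: the increment is a separate statement after each
   call, so only one copy of the pointer has to stay live across the call. */
void walk_separate(char *p)
{
    assert(p != NULL);
    foo(p); p++;
    foo(p); p++;
    foo(p); p++;
    foo(p); p++;
}
```

Semantically this is the same as foo(p++) repeated four times; only the point
at which the increment is written differs.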

The results get near optimal when the increment is larger than the target can
add as an immediate (> 64). The compiler then adds the cumulative offset to the
original pointer, which would be the optimal approach if it were also used for
smaller increments.

        movw r16,r24
        call foo
        movw r24,r16
        subi r24,lo8(-(65))
        sbci r25,hi8(-(65))
        call foo
        movw r24,r16
        subi r24,lo8(-(130))
        sbci r25,hi8(-(130))
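
Expressed at the source level, the cumulative-offset strategy for an increment
of 1 would look roughly like the sketch below. The function name walk_offsets
is an assumption for illustration, and whether a given GCC version actually
emits the better sequence for this form has not been checked here.

```c
#include <assert.h>
#include <stddef.h>

volatile int y;

/* Same callee as in the report: add the pointed-to byte to y. */
void foo(char *p)
{
    y += *p;
}

/* Hypothetical caller: the base pointer is never modified; each call
   receives the base plus a constant, cumulative offset. */
void walk_offsets(char *p)
{
    assert(p != NULL);
    foo(p);
    foo(p + 1);
    foo(p + 2);
    foo(p + 3);
}
```

Only the unmodified base pointer needs to survive across the calls, which
matches the single saved register pair in the listing above.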

The worst behaviour is seen with 4.1.2, 4.2.2 and 4.3.0.
Better (but still non-optimal) results are seen with 3.4.6 and 3.3.6,
while 4.0.4 produces the best code for the original foo(p++).

Poor code is also seen for arm/thumb and pdp-11, but arm/arm produces good
code.

So this is a multi-target problem, not just an AVR one.


-- 
           Summary: missed optimization, foo(p); p++ is better than foo(p++)
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: wvangulik at xs4all dot nl
GCC target triplet: multiple-none-none


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34737


