[Bug c/46779] New: wrong code generation for array access

Fri Dec 3 11:31:00 GMT 2010

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46779

           Summary: wrong code generation for array access
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: critical
          Priority: P3
         Component: c
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: mschulze@ivs.cs.ovgu.de
                CC: mschulze@ivs.cs.ovgu.de
            Target: avr-*-*

Created attachment 22611
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22611
example program for reproducing the wrong code generation

GCC versions 4.4.0 through 4.4.5 generate wrong code for an array access when
several conditions coincide; producing a nearly minimal test case was very
difficult. Code generation appears to go wrong when size optimization, inline
assembler and nested loops are combined. Maybe the optimizer runs out of usable
registers, because some registers are clobbered by the inline assembler. The
inline assembler is not my own: I used a macro from avr-libc (version 1.6.8)
that fills a boot page for later writing to flash. The relevant code looks as
follows (the macro is expanded directly in the code):

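#include <stdint.h>
#include <avr/io.h>    /* RAMPZ, _SFR_MEM_ADDR */
#include <avr/boot.h>  /* __SPM_REG, __BOOT_PAGE_FILL */

uint8_t array[256]={'A','B'};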

int main(void) {
    uint8_t *buf=array;
    uint32_t page=0;
    uint16_t w;
    uint8_t y;
    uint16_t i;
    for (y=0;y<100;++y) {
        page=((uint16_t)y)<<8;
        for (i=0; i<10; i+=2) {
            w = (buf[i+1]);
            w<<=8;
            w|= buf[i];
            __asm__ __volatile__
            (
                "movw  r0, %4\n\t"
                "movw r30, %A3\n\t"
                "sts %1, %C3\n\t"
                "sts %0, %2\n\t"
                "spm\n\t"
                "clr  r1\n\t"
              :
              :
                  "i" (_SFR_MEM_ADDR(__SPM_REG)),
                  "i" (_SFR_MEM_ADDR(RAMPZ)),
                  "r" ((uint8_t)__BOOT_PAGE_FILL),
                  "r" ((uint32_t)(page+i)),
                  "r" ((uint16_t)w)
                : "r0", "r30", "r31"
                );
        }
    }
    return 0;
}
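
For readers unfamiliar with the operand modifiers in the assembler template:
%A3 names the lowest byte register of the 32-bit operand 3 and %C3 its third
byte, so "movw r30, %A3" copies the low 16 bits of page+i into r31:r30 and
"sts %1, %C3" stores bits 16 to 23 into RAMPZ. The following host-side sketch
shows that split in plain C. The struct and function names are made up for
illustration and are not part of avr-libc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only; not part of avr-libc. */
struct spm_address {
    uint16_t z;      /* low 16 bits, loaded into r31:r30 by "movw r30, %A3" */
    uint8_t  rampz;  /* bits 16..23, stored to RAMPZ by "sts %1, %C3"       */
};

static struct spm_address split_spm_address(uint32_t address)
{
    struct spm_address a;
    a.z     = (uint16_t)(address & 0xFFFFu);
    a.rampz = (uint8_t)((address >> 16) & 0xFFu);
    return a;
}

/* Worked example: y = 99, i = 2 gives page + i = 0x6302. */
static void check_split_example(void)
{
    uint32_t page = (uint32_t)99 << 8;
    struct spm_address a = split_spm_address(page + 2);
    assert(a.z == 0x6302);
    assert(a.rampz == 0x00);
}
```

In this test case y stays below 100, so page+i never exceeds 16 bits and the
byte written to RAMPZ is always zero.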

To reproduce the bug, compile the provided attachment with:

avr-gcc -Os main.cc -mmcu=atmega128

This generates the following code (only the inner loop is shown):

  ea:   60 e0           ldi     r22, 0x00       ; 0
  ec:   eb 01           movw    r28, r22
  ee:   6c 91           ld      r22, X
  f0:   70 e0           ldi     r23, 0x00       ; 0
  f2:   6c 2b           or      r22, r28
  f4:   7d 2b           or      r23, r29
  f6:   0b 01           movw    r0, r22
  f8:   f9 01           movw    r30, r18
  fa:   40 93 5b 00     sts     0x005B, r20
  fe:   10 93 68 00     sts     0x0068, r17
 102:   e8 95           spm
 104:   11 24           eor     r1, r1
 106:   12 96           adiw    r26, 0x02       ; 2
 108:   2e 5f           subi    r18, 0xFE       ; 254
 10a:   3f 4f           sbci    r19, 0xFF       ; 255
 10c:   4f 4f           sbci    r20, 0xFF       ; 255
 10e:   5f 4f           sbci    r21, 0xFF       ; 255
 110:   71 e0           ldi     r23, 0x01       ; 1
 112:   aa 30           cpi     r26, 0x0A       ; 10
 114:   b7 07           cpc     r27, r23
 116:   49 f7           brne    .-46            ; 0xea <main+0x1c>

As you can see, RAM is read only once, at 0xee, whereas the C source performs
two reads per iteration (buf[i] and buf[i+1]).
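
To make the expected behaviour concrete: each inner-loop iteration has to load
both buf[i] and buf[i+1] and combine them into one 16-bit word. Below is a
sketch that can be built with a host compiler, using the same array initializer
as the test case. The names compose_word and check_first_word are introduced
here for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Same initializer as in the test case; the remaining elements are zero. */
static const uint8_t array[256] = {'A', 'B'};

/* Illustrative helper: mirrors the three C statements that compute w. */
static uint16_t compose_word(const uint8_t *buf, uint16_t i)
{
    uint16_t w = buf[i + 1];  /* first read: high byte */
    w <<= 8;
    w |= buf[i];              /* second read: low byte */
    return w;
}

/* Assuming an ASCII character set: 'A' is 0x41 and 'B' is 0x42. */
static void check_first_word(void)
{
    assert(compose_word(array, 0) == 0x4241);
}
```

For i = 0 the word passed to the assembler should therefore be 0x4241, and for
i = 2, 4, 6, 8 it should be 0. Both bytes have to come from memory, which is
why a single ld in the loop body cannot be correct.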

Compiled with gcc 4.4.x, this example yields wrong code, whereas gcc 4.5.x
compiles it correctly. However, I am not sure whether the bug is actually fixed
in 4.5.x or is still latently present there. It may be a bug in the optimizer
that only needs a different example to show up there too.

Information about the compiler used:
avr-gcc  -v 
Using built-in specs.        
Target: avr
Configured with: /tmp/cross-build/gcc-4.4.0/configure --target=avr
--prefix=/localapp/cross-gcc/builds/2.20.1-4.4.0-7.1/avr --program-prefix=avr-
--with-gnu-ld --with-gnu-as --enable-languages=c,c++
Thread model: single
gcc version 4.4.0 (GCC)

The other compiler versions were built with the same configure flags.