[Bug c/46779] New: wrong code generation for array access
mschulze at ivs dot cs.ovgu.de
gcc-bugzilla@gcc.gnu.org
Fri Dec 3 11:31:00 GMT 2010
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46779
Summary: wrong code generation for array access
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: critical
Priority: P3
Component: c
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: mschulze@ivs.cs.ovgu.de
CC: mschulze@ivs.cs.ovgu.de
Target: avr-*-*
Created attachment 22611
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22611
example program for reproducing the wrong code generation
GCC versions 4.4.0 through 4.4.5 generate wrong code for an array access when
several conditions coincide, and it was very difficult to produce a nearly
minimal test case. The code generation seems to go wrong when size
optimization, inline assembler, and nested loops are combined. Maybe the
optimizer runs out of usable registers, because some registers are clobbered
by the inline assembler. The inline assembler is not my own: I used a macro
from avr-libc (version 1.6.8) that fills a boot page for later writing to
flash. The relevant code looks as follows (the macro is expanded directly in
the code):
uint8_t array[256]={'A','B'};

int main(void) {
    uint8_t *buf=array;
    uint32_t page=0;
    uint16_t w;
    uint8_t y;
    uint16_t i;

    for (y=0;y<100;++y) {
        page=((uint16_t)y)<<8;
        for (i=0; i<10; i+=2) {
            w = (buf[i+1]);
            w<<=8;
            w|= buf[i];
            __asm__ __volatile__
            (
                "movw r0, %4\n\t"
                "movw r30, %A3\n\t"
                "sts %1, %C3\n\t"
                "sts %0, %2\n\t"
                "spm\n\t"
                "clr r1\n\t"
                :
                :
                "i" (_SFR_MEM_ADDR(__SPM_REG)),
                "i" (_SFR_MEM_ADDR(RAMPZ)),
                "r" ((uint8_t)__BOOT_PAGE_FILL),
                "r" ((uint32_t)(page+i)),
                "r" ((uint16_t)w)
                : "r0", "r30", "r31"
            );
        }
    }
    return 0;
}
To reproduce the bug, compile the provided attachment with:
avr-gcc -Os main.cc -mmcu=atmega128
This generates the following code (only the inner loop is shown):
  ea:   60 e0          ldi   r22, 0x00    ; 0
  ec:   eb 01          movw  r28, r22
  ee:   6c 91          ld    r22, X
  f0:   70 e0          ldi   r23, 0x00    ; 0
  f2:   6c 2b          or    r22, r28
  f4:   7d 2b          or    r23, r29
  f6:   0b 01          movw  r0, r22
  f8:   f9 01          movw  r30, r18
  fa:   40 93 5b 00    sts   0x005B, r20
  fe:   10 93 68 00    sts   0x0068, r17
 102:   e8 95          spm
 104:   11 24          eor   r1, r1
 106:   12 96          adiw  r26, 0x02    ; 2
 108:   2e 5f          subi  r18, 0xFE    ; 254
 10a:   3f 4f          sbci  r19, 0xFF    ; 255
 10c:   4f 4f          sbci  r20, 0xFF    ; 255
 10e:   5f 4f          sbci  r21, 0xFF    ; 255
 110:   71 e0          ldi   r23, 0x01    ; 1
 112:   aa 30          cpi   r26, 0x0A    ; 10
 114:   b7 07          cpc   r27, r23
 116:   49 f7          brne  .-46         ; 0xea <main+0x1c>
As you can see, RAM is read only once, at 0xee, whereas the C source contains
two reads (buf[i+1] and buf[i]).
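For comparison, here is a minimal host-side sketch of what the three C
statements in the inner loop should compute. It is not part of the attachment:
the helper names expected_word and check_expected_words are made up for
illustration, and the array is only a copy of the initialiser from the test
case. It assumes nothing about the AVR target and merely makes the two
required reads explicit.

```c
#include <assert.h>
#include <stdint.h>

/* Copy of the test-case initialiser: 'A', 'B', all remaining bytes zero. */
static const uint8_t array[256] = {'A', 'B'};

/* Hypothetical helper: combine two consecutive bytes into one 16-bit word,
   mirroring the three statements in the inner loop of the test case. */
static uint16_t expected_word(const uint8_t *buf, uint16_t i)
{
    uint16_t w = buf[i + 1];   /* first read: becomes the high byte */
    w <<= 8;
    w |= buf[i];               /* second read: becomes the low byte */
    return w;
}

/* Hypothetical self-check: the values operand %4 should carry for i = 0..8. */
static void check_expected_words(void)
{
    assert(expected_word(array, 0) == 0x4241);  /* 'B' << 8 | 'A' */
    for (uint16_t i = 2; i < 10; i += 2)
        assert(expected_word(array, i) == 0x0000);
}
```

For i = 0 the word should be 0x4241, so correct code needs a load of buf[1]
for the high byte in addition to the load of buf[0]. The listing contains only
the single ld at 0xee.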
Compiled with gcc version 4.4.x, this example produces wrong code, whereas
with gcc version 4.5.x it works as it should. However, I am not sure whether
the bug is actually fixed there or is still latently present. It may be a bug
in the optimizer that only needs a different example to show up in 4.5.x too.
Some information about the compiler used:
avr-gcc -v
Using built-in specs.
Target: avr
Configured with: /tmp/cross-build/gcc-4.4.0/configure --target=avr
--prefix=/localapp/cross-gcc/builds/2.20.1-4.4.0-7.1/avr --program-prefix=avr-
--with-gnu-ld --with-gnu-as --enable-languages=c,c++
Thread model: single
gcc version 4.4.0 (GCC)
The other compiler versions were built with the same configure flags.