Created attachment 22611 [details] example program for reproducing the wrong code generation The gcc versions 4.4.0-4.4.5 generates wrong code for an array access if some thing come together and it was very difficult to produce a nearly minimal test case. It seems to be that the generation of the code goes wrong if using size optimization, inline assembler and nested loops. Maybe the optimizer runs out of usable registers, because some registers are globbered by the inline assembler. The inline assembler is not from my self, because I used a macro from the avr-libc (version 1.6.8) for filling up a boot page for later writing this into the flash. The relevant code look as follows (in the code I expanded the macro directly): uint8_t array[256]={'A','B'}; int main(void) { uint8_t *buf=array; uint32_t page=0; uint16_t w; uint8_t y; uint16_t i; for (y=0;y<100;++y) { page=((uint16_t)y)<<8; for (i=0; i<10; i+=2) { w = (buf[i+1]); w<<=8; w|= buf[i]; __asm__ __volatile__ ( "movw r0, %4\n\t" "movw r30, %A3\n\t" "sts %1, %C3\n\t" "sts %0, %2\n\t" "spm\n\t" "clr r1\n\t" : : "i" (_SFR_MEM_ADDR(__SPM_REG)), "i" (_SFR_MEM_ADDR(RAMPZ)), "r" ((uint8_t)__BOOT_PAGE_FILL), "r" ((uint32_t)(page+i)), "r" ((uint16_t)w) : "r0", "r30", "r31" ); } } return 0; } To reproduce the bug, compile the provided attachment with: avr-gcc -Os main.cc -mmcu=atmega128 This generates, showing only the inner loop: ea: 60 e0 ldi r22, 0x00 ; 0 ec: eb 01 movw r28, r22 ee: 6c 91 ld r22, X f0: 70 e0 ldi r23, 0x00 ; 0 f2: 6c 2b or r22, r28 f4: 7d 2b or r23, r29 f6: 0b 01 movw r0, r22 f8: f9 01 movw r30, r18 fa: 40 93 5b 00 sts 0x005B, r20 fe: 10 93 68 00 sts 0x0068, r17 102: e8 95 spm 104: 11 24 eor r1, r1 106: 12 96 adiw r26, 0x02 ; 2 108: 2e 5f subi r18, 0xFE ; 254 10a: 3f 4f sbci r19, 0xFF ; 255 10c: 4f 4f sbci r20, 0xFF ; 255 10e: 5f 4f sbci r21, 0xFF ; 255 110: 71 e0 ldi r23, 0x01 ; 1 112: aa 30 cpi r26, 0x0A ; 10 114: b7 07 cpc r27, r23 116: 49 f7 brne .-46 ; 0xea <main+0x1c> and you see at 0xee the RAM is read, but only at this position, however, in the C-source we have two reads. This example compiled with gcc version 4.4.x generates wrong code, instead using gcc version 4.5.x it works as it should. However, I am not sure if this is fixed there or is this bug there also latently contained. Maybe, it is bug in the optimizer, which only needs another example to show up there too. Some information to the used compiler: avr-gcc -v Using built-in specs. Target: avr Configured with: /tmp/cross-build/gcc-4.4.0/configure --target=avr --prefix=/localapp/cross-gcc/builds/2.20.1-4.4.0-7.1/avr --program-prefix=avr- --with-gnu-ld --with-gnu-as --enable-languages=c,c++ Thread model: single gcc version 4.4.0 (GCC) The other compiler version are compiled with same configure flags.
at least I see r1 use in the asm but no clobber for it, instead it clobbers the seemingly unused r31.
(In reply to comment #1) > at least I see r1 use in the asm but no clobber for it, instead it clobbers > the seemingly unused r31. Yes correct, r1 is not clobbered, however, I tested it with clobbering this one too. But this does not change the code generation, leading to the same erroneous code. In case of r31, I think you are wrong because it is used in "movw r30, %A3\n\t". Are you agree this is a bug or not?
(In reply to comment #2) > (In reply to comment #1) > > at least I see r1 use in the asm but no clobber for it, instead it clobbers > > the seemingly unused r31. > Yes correct, r1 is not clobbered, however, I tested it with clobbering this one > too. But this does not change the code generation, leading to the same > erroneous code. In case of r31, I think you are wrong because it is used in > "movw r30, %A3\n\t". > > Are you agree this is a bug or not? I don't know anything about AVR, so I can't say that (I just noticed the r1 issue).
> I don't know anything about AVR, so I can't say that (I just noticed the r1 > issue). Ok due to the compiler abi for the avr, r0,r1 are fixed registers that are never allocated by gcc for local data, but often used for fixed purposes. r0 is a temporary register that may be globbered by any C-code. r1 is assumed to be always zero in any C-code. This is the reason, why it is cleared at the end of the inline assembler.
Probably we see PR45291 here introduced by the byte-shift w <<= 8. Some versions of avr backend emit bad subreg rtx which is still present in 4.5.2 and 4.6.0 http://gcc.gnu.org/PR45291 FYI, not clobbering R1 resp. R0 will not trigger a bug because there regs are fixed.
I can confirm the bug for gcc version 4.4.6 20110222 (prerelease) (GCC) In pass .168r.asmcons we have (insn 92 57 93 4 pr46779-1.c:17 (set (subreg:QI (reg/v:HI 85 [ w.30 ]) 1) (mem:QI (plus:HI (reg:HI 88 [ ivtmp.21 ]) (const_int 1 [0x1])) [0 S1 A8])) 4 {*movqi} (nil)) (insn 93 92 61 4 pr46779-1.c:17 (set (subreg:QI (reg/v:HI 85 [ w.30 ]) 0) (const_int 0 [0x0])) 4 {*movqi} (nil)) insn 93 invalidates insn 92. It accesses reg 85 as QI but doing so it invalidated the high part. This is due to subreg semantic. After IRA we have in .c.172r.ira: (insn 105 57 92 4 pr46779-1.c:17 (set (reg:HI 14 r14) (reg/v:HI 28 r28 [orig:85 w.30 ] [85])) 10 {*movhi} (nil)) (insn 92 105 107 4 pr46779-1.c:17 (set (reg:QI 22 r22) (mem:QI (plus:HI (reg:HI 26 r26 [orig:88 ivtmp.21 ] [88]) (const_int 1 [0x1])) [0 S1 A8])) 4 {*movqi} (nil)) (insn 107 92 106 4 pr46779-1.c:17 (set (reg:QI 15 r15 [+1 ]) (reg:QI 22 r22)) 4 {*movqi} (nil)) (insn 106 107 93 4 pr46779-1.c:17 (set (reg/v:HI 28 r28 [orig:85 w.30 ] [85]) (reg:HI 14 r14)) 10 {*movhi} (nil)) (insn 93 106 108 4 pr46779-1.c:17 (set (reg:QI 22 r22) (const_int 0 [0x0])) 4 {*movqi} (nil)) (insn 108 93 61 4 pr46779-1.c:17 (set (reg/v:HI 28 r28 [orig:85 w.30 ] [85]) (reg:HI 22 r22)) 10 {*movhi} (nil)) Now we see that insn 108 (from IRA/reload) overwrites the contents of reg 28. So in fact we see a bug similar to PR45291 which also originates in bad multiple subreg usage as Lvalue. The difference is that in 4.4 the bad subreg moves come from (define_insn_and_split "*rotlhi3_8" (define_insn_and_split "*rotlsi3_16" (define_insn_and_split "*rotlsi3_8" (define_insn_and_split "*rotlsi3_24" whereas in 4.5 and 4.6 they come from avr.c:avr_rotate_bytes()
Created attachment 23453 [details] Simpler example without #include Slightly simpler test case, compile with avr-gcc -S -Os -mmcu=atmega128 pr46779-1.c Pass .subreg1 emits subregs as Lvalues with the same pseudo as SUBREG_REG so that the second insn ovewrwrites the first. Target: avr Configured with: /gnu/source/gcc.gnu.org/branches/gcc-4_4-branch/configure --target=avr --prefix=/home/georg/gcc/install/gcc-4.4 --enable-languages=c,c++ --disable-libssp --disable-libada --disable-nls --disable-shared Thread model: single gcc version 4.4.6 20110222 (prerelease) (GCC) [...] GNU C (GCC) version 4.4.6 20110222 (prerelease) (avr) compiled by GNU C version 4.3.2 [gcc-4_3-branch revision 141291], GMP version 4.3.1, MPFR version 2.4.2.
Author: gjl Date: Thu Apr 14 18:50:02 2011 New Revision: 172442 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172442 Log: PR target/46779 PR target/45291 PR target/41894 * gcc.target/avr/pr46779-1.c: New test case * gcc.target/avr/pr46779-2.c: New test case Added: trunk/gcc/testsuite/gcc.target/avr/pr46779-1.c trunk/gcc/testsuite/gcc.target/avr/pr46779-2.c Modified: trunk/gcc/testsuite/ChangeLog
Reconfirmed with current 4.7 trunk against gcc.target/avr/pr46779-1.c
GCC 4.6.1 is being released.
Author: gjl Date: Fri Jul 8 17:38:39 2011 New Revision: 176053 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176053 Log: PR target/46779 * config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite. In particular, allow 8-bit values in r28 and r29. (avr_hard_regno_scratch_ok): Disallow any register that might be part of the frame pointer. (avr_hard_regno_rename_ok): Same. (avr_legitimate_address_p): Don't allow SUBREGs. Modified: trunk/gcc/ChangeLog trunk/gcc/config/avr/avr.c
Author: gjl Date: Fri Jul 8 17:46:38 2011 New Revision: 176055 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176055 Log: PR target/46779 Backport from mainline SVN 176053. 2011-07-08 Georg-Johann Lay * config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite. In particular, allow 8-bit values in r28 and r29. (avr_hard_regno_scratch_ok): Disallow any register that might be part of the frame pointer. (avr_hard_regno_rename_ok): Same. (avr_legitimate_address_p): Don't allow SUBREGs. Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/config/avr/avr.c
*** Bug 45291 has been marked as a duplicate of this bug. ***
Closed as FIXED for 4.6.2
*** Bug 41894 has been marked as a duplicate of this bug. ***
*** Bug 51445 has been marked as a duplicate of this bug. ***
Author: gjl Date: Wed May 2 17:05:04 2012 New Revision: 187055 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187055 Log: Backport from 2011-07-08 4.6-branch r176055. PR target/46779 * config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite. In particular, allow 8-bit values in r28 and r29. (avr_hard_regno_scratch_ok): Disallow any register that might be part of the frame pointer. (avr_hard_regno_rename_ok): Same. (avr_legitimate_address_p): Don't allow SUBREGs. Modified: branches/gcc-4_5-branch/gcc/ChangeLog branches/gcc-4_5-branch/gcc/config/avr/avr.c
Backported to for 4.5.4