Bug 46779 - [4.4/4.5/4.6 Regression][avr] wrong code generation for values held in R28/R29
Summary: [4.4/4.5/4.6 Regression][avr] wrong code generation for values held in R28/R29
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.4.5
: P3 normal
Target Milestone: 4.6.2
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
: 41894 45291 51445 (view as bug list)
Depends on:
Blocks:
 
Reported: 2010-12-03 11:31 UTC by Michael Schulze
Modified: 2012-05-02 17:08 UTC (History)
6 users (show)

See Also:
Host:
Target: avr-*-*
Build:
Known to work: 4.5.4, 4.6.2, 4.7.0
Known to fail: 4.4.0, 4.4.1, 4.4.2, 4.4.3, 4.4.4, 4.4.5, 4.5.3, 4.6.1
Last reconfirmed: 2011-06-16 10:01:03


Attachments
example program for reproducing the wrong code generation (382 bytes, text/x-c++src)
2010-12-03 11:31 UTC, Michael Schulze
Details
Simpler example without #include (337 bytes, text/plain)
2011-02-24 13:58 UTC, Georg-Johann Lay
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Schulze 2010-12-03 11:31:11 UTC
Created attachment 22611 [details]
example program for reproducing the wrong code generation

The gcc versions 4.4.0-4.4.5 generates wrong code for an array access if some thing come together and it was very difficult to produce a nearly minimal test case. It seems to be that the generation of the code goes wrong if using size optimization, inline assembler and nested loops. Maybe the optimizer runs out of usable registers, because some registers are globbered by the inline assembler. The inline assembler is not from my self, because I used a macro from the avr-libc (version 1.6.8) for filling up a boot page for later writing this into the flash. The relevant code look as follows (in the code I expanded the macro directly):

uint8_t array[256]={'A','B'};

int main(void) {
    uint8_t *buf=array;
    uint32_t page=0;
    uint16_t w;
    uint8_t y;
    uint16_t i;
    for (y=0;y<100;++y) {
        page=((uint16_t)y)<<8;
        for (i=0; i<10; i+=2) {
            w = (buf[i+1]);
            w<<=8;
            w|= buf[i];
            __asm__ __volatile__
            (
                "movw  r0, %4\n\t"
                "movw r30, %A3\n\t"
                "sts %1, %C3\n\t"
                "sts %0, %2\n\t"
                "spm\n\t"
                "clr  r1\n\t"
              :
              :
                  "i" (_SFR_MEM_ADDR(__SPM_REG)),
                  "i" (_SFR_MEM_ADDR(RAMPZ)),
                  "r" ((uint8_t)__BOOT_PAGE_FILL),
                  "r" ((uint32_t)(page+i)),
                  "r" ((uint16_t)w)
                : "r0", "r30", "r31"
                );
        }
    }
    return 0;
}

To reproduce the bug, compile the provided attachment with:

avr-gcc -Os main.cc -mmcu=atmega128

This generates, showing only the inner loop:

  ea:   60 e0           ldi     r22, 0x00       ; 0
  ec:   eb 01           movw    r28, r22
  ee:   6c 91           ld      r22, X
  f0:   70 e0           ldi     r23, 0x00       ; 0
  f2:   6c 2b           or      r22, r28
  f4:   7d 2b           or      r23, r29
  f6:   0b 01           movw    r0, r22
  f8:   f9 01           movw    r30, r18
  fa:   40 93 5b 00     sts     0x005B, r20
  fe:   10 93 68 00     sts     0x0068, r17
 102:   e8 95           spm
 104:   11 24           eor     r1, r1
 106:   12 96           adiw    r26, 0x02       ; 2
 108:   2e 5f           subi    r18, 0xFE       ; 254
 10a:   3f 4f           sbci    r19, 0xFF       ; 255
 10c:   4f 4f           sbci    r20, 0xFF       ; 255
 10e:   5f 4f           sbci    r21, 0xFF       ; 255
 110:   71 e0           ldi     r23, 0x01       ; 1
 112:   aa 30           cpi     r26, 0x0A       ; 10
 114:   b7 07           cpc     r27, r23
 116:   49 f7           brne    .-46            ; 0xea <main+0x1c>

and you see at 0xee the RAM is read, but only at this position, however, in the C-source we have two reads.

This example compiled with gcc version 4.4.x generates wrong code, instead using gcc version 4.5.x it works as it should. However, I am not sure if this is fixed there or is this bug there also latently contained. Maybe, it is bug in the optimizer, which only needs another example to show up there too.

Some information to the used compiler:
avr-gcc  -v 
Using built-in specs.        
Target: avr
Configured with: /tmp/cross-build/gcc-4.4.0/configure --target=avr --prefix=/localapp/cross-gcc/builds/2.20.1-4.4.0-7.1/avr --program-prefix=avr- --with-gnu-ld --with-gnu-as --enable-languages=c,c++
Thread model: single
gcc version 4.4.0 (GCC)

The other compiler version are compiled with same configure flags.
Comment 1 Richard Biener 2010-12-03 13:37:46 UTC
at least I see r1 use in the asm but no clobber for it, instead it clobbers
the seemingly unused r31.
Comment 2 Michael Schulze 2010-12-03 13:58:40 UTC
(In reply to comment #1)
> at least I see r1 use in the asm but no clobber for it, instead it clobbers
> the seemingly unused r31.
Yes correct, r1 is not clobbered, however, I tested it with clobbering this one too. But this does not change the code generation, leading to the same erroneous code. In case of r31, I think you are wrong because it is used in "movw r30, %A3\n\t".

Are you agree this is a bug or not?
Comment 3 Richard Biener 2010-12-03 14:05:42 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > at least I see r1 use in the asm but no clobber for it, instead it clobbers
> > the seemingly unused r31.
> Yes correct, r1 is not clobbered, however, I tested it with clobbering this one
> too. But this does not change the code generation, leading to the same
> erroneous code. In case of r31, I think you are wrong because it is used in
> "movw r30, %A3\n\t".
> 
> Are you agree this is a bug or not?

I don't know anything about AVR, so I can't say that (I just noticed the r1
issue).
Comment 4 Michael Schulze 2010-12-03 14:16:30 UTC
> I don't know anything about AVR, so I can't say that (I just noticed the r1
> issue).
Ok due to the compiler abi for the avr, r0,r1 are fixed registers that are never allocated by gcc for local data, but often used for fixed purposes. r0 is a temporary register that may be globbered by any C-code. r1 is assumed to be always zero in any C-code. This is the reason, why it is cleared at the end of the inline assembler.
Comment 5 Georg-Johann Lay 2011-02-05 16:01:14 UTC
Probably we see PR45291 here introduced by the byte-shift w <<= 8. Some versions of avr backend emit bad subreg rtx which is still present in 4.5.2 and 4.6.0

http://gcc.gnu.org/PR45291

FYI, not clobbering R1 resp. R0 will not trigger a bug because there regs are fixed.
Comment 6 Georg-Johann Lay 2011-02-23 15:51:11 UTC
I can confirm the bug for gcc version 4.4.6 20110222 (prerelease) (GCC)

In pass .168r.asmcons we have

(insn 92 57 93 4 pr46779-1.c:17 (set (subreg:QI (reg/v:HI 85 [ w.30 ]) 1)
        (mem:QI (plus:HI (reg:HI 88 [ ivtmp.21 ])
                (const_int 1 [0x1])) [0 S1 A8])) 4 {*movqi} (nil))

(insn 93 92 61 4 pr46779-1.c:17 (set (subreg:QI (reg/v:HI 85 [ w.30 ]) 0)
        (const_int 0 [0x0])) 4 {*movqi} (nil))

insn 93 invalidates insn 92. It accesses reg 85 as QI but doing so it invalidated the high part. This is due to subreg semantic. After IRA we have in .c.172r.ira:
(insn 105 57 92 4 pr46779-1.c:17 (set (reg:HI 14 r14)
        (reg/v:HI 28 r28 [orig:85 w.30 ] [85])) 10 {*movhi} (nil))

(insn 92 105 107 4 pr46779-1.c:17 (set (reg:QI 22 r22)
        (mem:QI (plus:HI (reg:HI 26 r26 [orig:88 ivtmp.21 ] [88])
                (const_int 1 [0x1])) [0 S1 A8])) 4 {*movqi} (nil))

(insn 107 92 106 4 pr46779-1.c:17 (set (reg:QI 15 r15 [+1 ])
        (reg:QI 22 r22)) 4 {*movqi} (nil))

(insn 106 107 93 4 pr46779-1.c:17 (set (reg/v:HI 28 r28 [orig:85 w.30 ] [85])
        (reg:HI 14 r14)) 10 {*movhi} (nil))

(insn 93 106 108 4 pr46779-1.c:17 (set (reg:QI 22 r22)
        (const_int 0 [0x0])) 4 {*movqi} (nil))

(insn 108 93 61 4 pr46779-1.c:17 (set (reg/v:HI 28 r28 [orig:85 w.30 ] [85])
        (reg:HI 22 r22)) 10 {*movhi} (nil))

Now we see that insn 108 (from IRA/reload) overwrites the contents of reg 28.

So in fact we see a bug similar to PR45291 which also originates in bad multiple subreg usage as Lvalue.

The difference is that in 4.4 the bad subreg moves come from 

(define_insn_and_split "*rotlhi3_8"
(define_insn_and_split "*rotlsi3_16"
(define_insn_and_split "*rotlsi3_8"
(define_insn_and_split "*rotlsi3_24"

whereas in 4.5 and 4.6 they come from avr.c:avr_rotate_bytes()
Comment 7 Georg-Johann Lay 2011-02-24 13:58:58 UTC
Created attachment 23453 [details]
Simpler example without #include

Slightly simpler test case, compile with

avr-gcc -S -Os -mmcu=atmega128 pr46779-1.c

Pass .subreg1 emits subregs as Lvalues with the same pseudo as SUBREG_REG so that the second insn ovewrwrites the first.

Target: avr
Configured with: /gnu/source/gcc.gnu.org/branches/gcc-4_4-branch/configure --target=avr --prefix=/home/georg/gcc/install/gcc-4.4 --enable-languages=c,c++ --disable-libssp --disable-libada --disable-nls --disable-shared
Thread model: single
gcc version 4.4.6 20110222 (prerelease) (GCC)
[...]
GNU C (GCC) version 4.4.6 20110222 (prerelease) (avr)
        compiled by GNU C version 4.3.2 [gcc-4_3-branch revision 141291], GMP version 4.3.1, MPFR version 2.4.2.
Comment 8 Georg-Johann Lay 2011-04-14 18:50:16 UTC
Author: gjl
Date: Thu Apr 14 18:50:02 2011
New Revision: 172442

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172442
Log:
	PR target/46779
	PR target/45291
	PR target/41894
	* gcc.target/avr/pr46779-1.c: New test case
	* gcc.target/avr/pr46779-2.c: New test case


Added:
    trunk/gcc/testsuite/gcc.target/avr/pr46779-1.c
    trunk/gcc/testsuite/gcc.target/avr/pr46779-2.c
Modified:
    trunk/gcc/testsuite/ChangeLog
Comment 9 Georg-Johann Lay 2011-06-16 10:01:03 UTC
Reconfirmed with current 4.7 trunk against 

gcc.target/avr/pr46779-1.c
Comment 10 Jakub Jelinek 2011-06-27 12:32:54 UTC
GCC 4.6.1 is being released.
Comment 11 Georg-Johann Lay 2011-07-08 17:38:43 UTC
Author: gjl
Date: Fri Jul  8 17:38:39 2011
New Revision: 176053

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176053
Log:
	
	PR target/46779
	* config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite.
	In particular, allow 8-bit values in r28 and r29.
	(avr_hard_regno_scratch_ok): Disallow any register that might be
	part of the frame pointer.
	(avr_hard_regno_rename_ok): Same.
	(avr_legitimate_address_p): Don't allow SUBREGs.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/avr/avr.c
Comment 12 Georg-Johann Lay 2011-07-08 17:46:42 UTC
Author: gjl
Date: Fri Jul  8 17:46:38 2011
New Revision: 176055

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176055
Log:
	
	PR target/46779
	Backport from mainline SVN 176053.
	2011-07-08  Georg-Johann Lay
	* config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite.
	In particular, allow 8-bit values in r28 and r29.
	(avr_hard_regno_scratch_ok): Disallow any register that might be
	part of the frame pointer.
	(avr_hard_regno_rename_ok): Same.
	(avr_legitimate_address_p): Don't allow SUBREGs.


Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/avr/avr.c
Comment 13 Georg-Johann Lay 2011-07-08 17:52:08 UTC
*** Bug 45291 has been marked as a duplicate of this bug. ***
Comment 14 Georg-Johann Lay 2011-07-10 20:57:18 UTC
Closed as FIXED for 4.6.2
Comment 15 Georg-Johann Lay 2011-07-23 15:09:49 UTC
*** Bug 41894 has been marked as a duplicate of this bug. ***
Comment 16 Georg-Johann Lay 2012-02-11 10:54:54 UTC
*** Bug 51445 has been marked as a duplicate of this bug. ***
Comment 17 Georg-Johann Lay 2012-05-02 17:05:10 UTC
Author: gjl
Date: Wed May  2 17:05:04 2012
New Revision: 187055

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187055
Log:
	Backport from 2011-07-08 4.6-branch r176055.
	PR target/46779
	* config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite.
	In particular, allow 8-bit values in r28 and r29.
	(avr_hard_regno_scratch_ok): Disallow any register that might be
	part of the frame pointer.
	(avr_hard_regno_rename_ok): Same.
	(avr_legitimate_address_p): Don't allow SUBREGs.


Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/config/avr/avr.c
Comment 18 Georg-Johann Lay 2012-05-02 17:08:06 UTC
Backported to for 4.5.4