Bug 52278 - [4.8/4.9/5 Regression] [avr] inefficient register allocation for SUBREGs
Status: RESOLVED WORKSFORME
Alias: None
Product: gcc
Classification: Unclassified
Component: other
Version: 4.7.0
Importance: P4 normal
Target Milestone: 4.8.4
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization, ra
Duplicates: 47644
Depends on:
Blocks: 56183
Reported: 2012-02-16 13:56 UTC by Georg-Johann Lay
Modified: 2014-10-07 17:13 UTC
CC List: 4 users

See Also:
Host:
Target: avr
Build:
Known to work: 3.4.6, 4.6.2, 4.8.0, 4.9.2
Known to fail: 4.3.3, 4.5.2, 4.7.2
Last reconfirmed: 2012-02-28 00:00:00


Attachments
add.c (105 bytes, text/plain)
2012-02-16 14:00 UTC, Georg-Johann Lay
add.s (451 bytes, text/plain)
2012-02-16 14:03 UTC, Georg-Johann Lay
add.c.197r.ira (2.28 KB, text/plain)
2012-02-16 14:04 UTC, Georg-Johann Lay
add.c.198r.reload (1.59 KB, text/plain)
2012-02-16 14:06 UTC, Georg-Johann Lay

Description Georg-Johann Lay 2012-02-16 13:56:55 UTC
Consider the following small function compiled for AVR.
Remember that AVR is an 8-bit machine with int = HImode and UNITS_PER_WORD = 1.

int add (int val)
{
    return val + 1;
}

The addition can be performed in one insn; val and the return value are both
passed in HI:24, as you can see in the .ira dump:


(insn 6 3 19 2 (parallel [
            (set (reg:HI 45)
                (plus:HI (reg:HI 24 r24 [ val ])
                    (const_int 1 [0x1])))
            (clobber (scratch:QI))
        ]) add.c:3 42 {addhi3_clobber}
     (expr_list:REG_DEAD (reg:HI 24 r24 [ val ])
        (nil)))

(insn 19 6 20 2 (set (reg:QI 24 r24)
        (subreg:QI (reg:HI 45) 0)) add.c:4 18 {movqi_insn}
     (nil))

(insn 20 19 14 2 (set (reg:QI 25 r25 [+1 ])
        (subreg:QI (reg:HI 45) 1)) add.c:4 18 {movqi_insn}
     (expr_list:REG_DEAD (reg:HI 45)
        (nil)))

(insn 14 20 0 2 (use (reg/i:HI 24 r24)) add.c:4 -1
     (nil))

IRA writes:

      Pushing a0(r45,l0)(cost 0)
      Popping a0(r45,l0)  -- assign reg 18
Disposition:
    0:r45  l0    18

i.e. it assigns pseudo HI:45 to hard register HI:18, which leads to inefficient
code because the value is moved from HI:24 to HI:18 and back without need.
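For readers less familiar with the subreg notation in insns 19 and 20 of the .ira dump above, the following C sketch models what the two QImode subregs of the HImode pseudo denote. It is only an illustration under the assumption of AVR's little-endian byte order (byte 0 is the low byte); `subreg_qi` is a hypothetical helper name and not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: models (subreg:QI (reg:HI x) n),
   i.e. byte n of a 16-bit (HImode) value.  Assumes little-endian byte
   order as on AVR: offset 0 is the low byte, offset 1 the high byte. */
static uint8_t subreg_qi(uint16_t hi_value, unsigned byte_offset)
{
    assert(byte_offset < 2);
    return (uint8_t)(hi_value >> (8u * byte_offset));
}

/* Models insns 6, 19 and 20: compute val + 1 in a 16-bit pseudo, then
   copy its two bytes to the return registers r24 (low) and r25 (high). */
static void add_one_split(uint16_t val, uint8_t *r24, uint8_t *r25)
{
    uint16_t pseudo45 = (uint16_t)(val + 1u);  /* insn 6  */
    *r24 = subreg_qi(pseudo45, 0);             /* insn 19 */
    *r25 = subreg_qi(pseudo45, 1);             /* insn 20 */
}
```

For example, `add_one_split(0x12FF, &lo, &hi)` leaves `lo == 0x00` and `hi == 0x13`. The two byte copies are exactly the moves that would be unnecessary if the pseudo were allocated to HI:24.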

.reload generates additional move insns to satisfy the constraints of addhi3,
which are basically "=r, %0, rn", i.e. addition is a 2-operand insn where op0
and op1 must be in the same hard register:

(insn 23 3 6 2 (set (reg:HI 18 r18 [45])
        (reg:HI 24 r24 [ val ])) add.c:3 22 {*movhi}
     (nil))

(insn 6 23 19 2 (parallel [
            (set (reg:HI 18 r18 [45])
                (plus:HI (reg:HI 18 r18 [45])
                    (const_int 1 [0x1])))
            (clobber (scratch:QI))
        ]) add.c:3 42 {addhi3_clobber}
     (nil))

(insn 19 6 20 2 (set (reg:QI 24 r24)
        (reg:QI 18 r18 [45])) add.c:4 18 {movqi_insn}
     (nil))

(insn 20 19 14 2 (set (reg:QI 25 r25 [+1 ])
        (reg:QI 19 r19 [+1 ])) add.c:4 18 {movqi_insn}
     (nil))


However, the machine could just as well do the addition in HI:24 directly like so:

(parallel [(set (reg:HI 24 r24)
                (plus:HI (reg:HI 24)
                         (const_int 1)))
           (clobber (scratch:QI))])  {addhi3_clobber}

The code above is just a small example to show the problem, but the issue also
occurs with more complex code and not only for return and parameter registers.
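To make the cost difference concrete, here is a small C sketch that simulates both sequences on a toy register file and counts RTL insns. It is an illustration only: the `avr_regs` type and the helper names are hypothetical, and the count reflects the RTL insns shown in the dumps above, not necessarily the number of machine instructions emitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the AVR register file: 32 byte-wide registers. */
typedef struct {
    uint8_t r[32];
    unsigned insns;  /* number of RTL insns "executed" so far */
} avr_regs;

/* Read a 16-bit (HImode) value held in registers n (low) and n+1 (high). */
static uint16_t get_hi(const avr_regs *m, unsigned n)
{
    assert(n < 31);
    return (uint16_t)(m->r[n] | ((unsigned)m->r[n + 1] << 8));
}

/* Write a 16-bit (HImode) value to registers n (low) and n+1 (high). */
static void set_hi(avr_regs *m, unsigned n, uint16_t v)
{
    assert(n < 31);
    m->r[n] = (uint8_t)(v & 0xFFu);
    m->r[n + 1] = (uint8_t)(v >> 8);
}

/* Sequence after reload: copy to HI:18, add there, copy back bytewise. */
static void add_via_r18(avr_regs *m)
{
    set_hi(m, 18, get_hi(m, 24));                  m->insns++; /* insn 23 */
    set_hi(m, 18, (uint16_t)(get_hi(m, 18) + 1u)); m->insns++; /* insn 6  */
    m->r[24] = m->r[18];                           m->insns++; /* insn 19 */
    m->r[25] = m->r[19];                           m->insns++; /* insn 20 */
}

/* Desired sequence: perform the addition directly in HI:24. */
static void add_in_place(avr_regs *m)
{
    set_hi(m, 24, (uint16_t)(get_hi(m, 24) + 1u)); m->insns++;
}
```

Starting from the same value in HI:24, both functions leave the same result in HI:24, but `add_via_r18` takes four insns in this model where `add_in_place` takes one.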

== Command line ==

avr-gcc add.c -c -mmcu=avr4 -Os -save-temps -dp -da

== configure ==

../../gcc.gnu.org/trunk/configure --target=avr --prefix=/local/gnu/install/gcc-4.7 --disable-nls --enable-languages=c,c++ --with-dwarf2 --enable-checking=yes,rtl

Thread model: single
gcc version 4.7.0 20120206 (experimental) (GCC)
Comment 1 Georg-Johann Lay 2012-02-16 14:00:59 UTC
Created attachment 26677
add.c
Comment 2 Georg-Johann Lay 2012-02-16 14:03:05 UTC
Created attachment 26678
add.s

Assembler output with -c -mmcu=avr4 -Os -save-temps -dp -da

To see reasonable code, add -fno-split-wide-types.
Comment 3 Georg-Johann Lay 2012-02-16 14:04:02 UTC
Created attachment 26679
add.c.197r.ira
Comment 4 Georg-Johann Lay 2012-02-16 14:06:43 UTC
Created attachment 26680
add.c.198r.reload
Comment 5 Wouter van Gulik 2012-07-27 17:23:59 UTC
Note that the same behavior is seen for pointers and long.

long add(long l)
{
    return l + 1;
}

and

char* add(char* p)
{
    return p + 1;
}
Comment 6 Georg-Johann Lay 2012-10-21 20:54:12 UTC
*** Bug 47644 has been marked as a duplicate of this bug. ***
Comment 7 Richard Biener 2013-04-11 07:58:44 UTC
GCC 4.7.3 is being released, adjusting target milestone.
Comment 8 Richard Biener 2014-06-12 13:41:01 UTC
The 4.7 branch is being closed, moving target milestone to 4.8.4.
Comment 9 Georg-Johann Lay 2014-10-07 17:13:35 UTC
Works for me in 4.8 and 4.9.