This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



weird impact of lower-subreg on IRA/reload


This is a question about the SUBREGs generated by lower-subreg.c and whether the
register allocator is supposed to handle them efficiently.

Consider the following small function compiled for AVR.
Remember that AVR is an 8-bit machine with int = HImode and UNITS_PER_WORD = 1.

int add (int val)
{
    return val + 1;
}

The addition can be performed in one insn; val and the return value are both
passed in HI:24, as you can see in the .ira dump:


(insn 6 3 19 2 (parallel [
            (set (reg:HI 45)
                (plus:HI (reg:HI 24 r24 [ val ])
                    (const_int 1 [0x1])))
            (clobber (scratch:QI))
        ]) add.c:3 42 {addhi3_clobber}
     (expr_list:REG_DEAD (reg:HI 24 r24 [ val ])
        (nil)))

(insn 19 6 20 2 (set (reg:QI 24 r24)
        (subreg:QI (reg:HI 45) 0)) add.c:4 18 {movqi_insn}
     (nil))

(insn 20 19 14 2 (set (reg:QI 25 r25 [+1 ])
        (subreg:QI (reg:HI 45) 1)) add.c:4 18 {movqi_insn}
     (expr_list:REG_DEAD (reg:HI 45)
        (nil)))

(insn 14 20 0 2 (use (reg/i:HI 24 r24)) add.c:4 -1
     (nil))

IRA writes:

      Pushing a0(r45,l0)(cost 0)
      Popping a0(r45,l0)  -- assign reg 18
Disposition:
    0:r45  l0    18

i.e. it assigns pseudo HI:45 to hard register HI:18, which leads to inefficient
code because values get moved around needlessly.

.reload generates additional move insns to satisfy the constraints of addhi3,
which are basically "=r, %0, rn", i.e. addition is a 2-operand insn where op0
and op1 must be in the same hard register:

(insn 23 3 6 2 (set (reg:HI 18 r18 [45])
        (reg:HI 24 r24 [ val ])) add.c:3 22 {*movhi}
     (nil))

(insn 6 23 19 2 (parallel [
            (set (reg:HI 18 r18 [45])
                (plus:HI (reg:HI 18 r18 [45])
                    (const_int 1 [0x1])))
            (clobber (scratch:QI))
        ]) add.c:3 42 {addhi3_clobber}
     (nil))

(insn 19 6 20 2 (set (reg:QI 24 r24)
        (reg:QI 18 r18 [45])) add.c:4 18 {movqi_insn}
     (nil))

(insn 20 19 14 2 (set (reg:QI 25 r25 [+1 ])
        (reg:QI 19 r19 [+1 ])) add.c:4 18 {movqi_insn}
     (nil))


However, the machine could just as well do the addition in HI:24 directly like so:


(parallel [(set (reg:HI 24 r24)
                (plus:HI (reg:HI 24)
                         (const_int 1)))
           (clobber (scratch:QI))])  {addhi3_clobber}
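
To make the cost difference concrete, here is a minimal C sketch that models
the two sequences on a plain byte array standing in for the AVR register file.
The helper names (get_hi, set_hi, add_via_r18, add_in_place) are made up for
this illustration and are not part of GCC; the model assumes, as the dumps
suggest, that SUBREG byte 0 of an HImode value lives in the lower-numbered
register. Both functions return the number of insns they stand for.

```c
#include <assert.h>
#include <stdint.h>

enum { NREGS = 32 };  /* r0..r31, one byte each */

/* Read an HImode value from the pair r[n] (low byte), r[n+1] (high byte). */
static uint16_t get_hi(const uint8_t *r, int n)
{
    return (uint16_t)(r[n] | (r[n + 1] << 8));
}

/* Write an HImode value to the pair r[n], r[n+1]. */
static void set_hi(uint8_t *r, int n, uint16_t v)
{
    r[n] = (uint8_t)(v & 0xff);
    r[n + 1] = (uint8_t)(v >> 8);
}

/* Models the sequence from the .reload dump: pseudo 45 lives in r18:r19. */
static int add_via_r18(uint8_t *r)
{
    set_hi(r, 18, get_hi(r, 24));                  /* insn 23: *movhi         */
    set_hi(r, 18, (uint16_t)(get_hi(r, 18) + 1));  /* insn 6:  addhi3_clobber */
    r[24] = r[18];                                 /* insn 19: movqi_insn     */
    r[25] = r[19];                                 /* insn 20: movqi_insn     */
    return 4;
}

/* Models the desired code: the addition is done directly in r24:r25. */
static int add_in_place(uint8_t *r)
{
    set_hi(r, 24, (uint16_t)(get_hi(r, 24) + 1));  /* addhi3_clobber */
    return 1;
}
```

Starting from the same value in r24:r25, both functions leave the same result
in r24:r25; the first one needs three extra moves to get there.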


Question: Is IRA supposed to detect SUBREGs like the ones above and avoid the
code bloat? Sequences like


(insn 19 6 20 2 (set (reg:QI 24 r24)
        (subreg:QI (reg:HI 45) 0)) add.c:4 18 {movqi_insn}
     (nil))

(insn 20 19 14 2 (set (reg:QI 25 r25 [+1 ])
        (subreg:QI (reg:HI 45) 1)) add.c:4 18 {movqi_insn}
     (expr_list:REG_DEAD (reg:HI 45)
        (nil)))

apparently create some kind of early-clobber situation for IRA that prevents
HI:45 from being allocated to HI:24.
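
The suspected effect can be illustrated with a toy model in C. This is only a
sketch under stated assumptions and is NOT how IRA actually computes conflicts:
the struct and function names are hypothetical, and the model covers nothing
but a list of byte copies "hard register = byte k of the pseudo", as in insns
19 and 20. It contrasts a whole-register view of liveness with a per-byte view.

```c
#include <assert.h>
#include <stdbool.h>

enum { NBYTES = 2 };  /* an HImode pseudo occupies two QImode hard registers */

/* One insn of the form (set (reg:QI hard_reg) (subreg:QI (reg:HI pseudo) byte)). */
struct byte_copy {
    int hard_reg;
    int byte;
};

/* Whole-register view (assumed model): the pseudo counts as live as long as
   any later insn reads it, so a hard register written before the last read
   conflicts with the entire pseudo. */
static bool base_ok_whole(const struct byte_copy *insns, int n, int base)
{
    for (int i = 0; i + 1 < n; i++) {   /* pseudo is still live after insn i */
        int h = insns[i].hard_reg;
        if (h >= base && h < base + NBYTES)
            return false;
    }
    return true;
}

/* Per-byte view (assumed model): writing hard_reg in insn i only matters if
   a byte that is read later would be stored in exactly that hard register. */
static bool base_ok_bytes(const struct byte_copy *insns, int n, int base)
{
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (base + insns[j].byte == insns[i].hard_reg)
                return false;
    return true;
}
```

For the two copies above, { {24, 0}, {25, 1} }, the whole-register view rejects
base register 24 because r24 is written while pseudo 45 still has a pending
read, whereas the per-byte view accepts it: the byte that is still needed would
sit in r25, not r24. Base register 18 is acceptable in both views, which
matches the allocation IRA reports.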

Is IRA a textbook implementation that does not know anything about SUBREGs?
Or should IRA be smart enough to detect and allocate SUBREGs efficiently by
some "subreg fusion" mechanism?

The code above is just a small example to show the problem, but the issue also
occurs with more complex code and not only for return and parameter registers.

Johann

