This is the mail archive of the mailing list for the GCC project.


Re: weird impact of lower-subreg on IRA/reload

On 02/15/2012 09:21 AM, Georg-Johann Lay wrote:
This is a question about the SUBREGs generated by lower-subreg.c and whether the
register allocator is supposed to handle them efficiently.

Consider the following small function compiled for AVR.
Remember that AVR is an 8-bit machine with int = HImode and UNITS_PER_WORD = 1.

int add (int val)
{
     return val + 1;
}

The addition can be performed in one insn; val and the return value are both
passed in HI:24, as you can see in the .ira dump:

(insn 6 3 19 2 (parallel [
             (set (reg:HI 45)
                 (plus:HI (reg:HI 24 r24 [ val ])
                     (const_int 1 [0x1])))
             (clobber (scratch:QI))
         ]) add.c:3 42 {addhi3_clobber}
      (expr_list:REG_DEAD (reg:HI 24 r24 [ val ])
         (nil)))

(insn 19 6 20 2 (set (reg:QI 24 r24)
         (subreg:QI (reg:HI 45) 0)) add.c:4 18 {movqi_insn}

(insn 20 19 14 2 (set (reg:QI 25 r25 [+1 ])
         (subreg:QI (reg:HI 45) 1)) add.c:4 18 {movqi_insn}
      (expr_list:REG_DEAD (reg:HI 45)

(insn 14 20 0 2 (use (reg/i:HI 24 r24)) add.c:4 -1
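
To make the two movqi insns above concrete: the number in (subreg:QI (reg:HI 45) N) is a byte offset into the 16-bit pseudo, so insn 19 copies the low byte into r24 and insn 20 copies the high byte into r25. A minimal C sketch of that split follows, assuming little-endian byte order as used by the AVR port. The helper name subreg_qi is made up for illustration and is not a GCC function.

```c
#include <stdint.h>

/* Hypothetical helper (not part of GCC): return the byte at byte offset
   `offset` of a 16-bit value, assuming little-endian layout, i.e.
   offset 0 is the low byte and offset 1 is the high byte.  This mirrors
   what (subreg:QI (reg:HI 45) offset) selects in the dump above. */
static uint8_t subreg_qi(uint16_t hi_value, unsigned offset)
{
    return (uint8_t)(hi_value >> (8u * offset));
}

/* Model of insns 19 and 20: split the HImode result of val + 1 into the
   two QImode halves that end up in r24 (low) and r25 (high). */
static void split_result(uint16_t val, uint8_t *r24, uint8_t *r25)
{
    uint16_t pseudo45 = (uint16_t)(val + 1u);  /* insn 6: HI:45 = val + 1 */
    *r24 = subreg_qi(pseudo45, 0);             /* insn 19 */
    *r25 = subreg_qi(pseudo45, 1);             /* insn 20 */
}
```

For example, calling split_result with val = 0x12ff leaves 0x00 in r24 and 0x13 in r25, because the sum is 0x1300.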

IRA writes:

       Pushing a0(r45,l0)(cost 0)
       Popping a0(r45,l0)  -- assign reg 18
     0:r45  l0    18

i.e. it assigns pseudo HI:45 to hard register HI:18, which leads to inefficient
code because values are moved around needlessly.

.reload generates additional move insns to satisfy the constraints of addhi3
which are basically "=r, %0, rn" i.e. addition is a 2-operand insn where op0
and op1 must be in the same hard register:

(insn 23 3 6 2 (set (reg:HI 18 r18 [45])
         (reg:HI 24 r24 [ val ])) add.c:3 22 {*movhi}

(insn 6 23 19 2 (parallel [
             (set (reg:HI 18 r18 [45])
                 (plus:HI (reg:HI 18 r18 [45])
                     (const_int 1 [0x1])))
             (clobber (scratch:QI))
         ]) add.c:3 42 {addhi3_clobber}

(insn 19 6 20 2 (set (reg:QI 24 r24)
         (reg:QI 18 r18 [45])) add.c:4 18 {movqi_insn}

(insn 20 19 14 2 (set (reg:QI 25 r25 [+1 ])
         (reg:QI 19 r19 [+1 ])) add.c:4 18 {movqi_insn}

However, the machine could just as well do the addition in HI:24 directly like so:

(parallel [(set (reg:HI 24 r24) (plus:HI (reg:HI 24) (const_int 1))) (clobber (scratch:QI))]) {addhi3_clobber}
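
The cost of the two allocations can be compared with a toy model. The sketch below is only an illustration: it treats the register file as 32 bytes, counts each RTL move insn from the .reload dump as one move, and models addhi3 as a two-operand insn that updates its register pair in place. The type and function names (regfile, add_via_r18, add_in_r24 and so on) are invented for this example and do not correspond to anything in GCC.

```c
#include <stdint.h>

/* Toy model of the AVR register file: 32 8-bit registers plus a counter
   for move insns.  All names here are hypothetical. */
typedef struct {
    uint8_t  r[32];
    unsigned moves;
} regfile;

/* Read/write a 16-bit value held in the register pair n (low), n+1 (high). */
static uint16_t get_hi(const regfile *f, unsigned n)
{
    return (uint16_t)(f->r[n] | (f->r[n + 1] << 8));
}

static void set_hi(regfile *f, unsigned n, uint16_t v)
{
    f->r[n]     = (uint8_t)(v & 0xff);
    f->r[n + 1] = (uint8_t)(v >> 8);
}

/* One movqi insn: copy a single 8-bit register. */
static void mov_qi(regfile *f, unsigned dst, unsigned src)
{
    f->r[dst] = f->r[src];
    f->moves++;
}

/* One movhi insn: copy a 16-bit register pair (counted as one RTL insn). */
static void mov_hi(regfile *f, unsigned dst, unsigned src)
{
    set_hi(f, dst, get_hi(f, src));
    f->moves++;
}

/* Two-operand addhi3: the destination pair is also the first source. */
static void add_hi(regfile *f, unsigned reg, uint16_t imm)
{
    set_hi(f, reg, (uint16_t)(get_hi(f, reg) + imm));
}

/* Allocation from the dump: pseudo 45 lives in r18, so insns 23, 19 and 20
   are needed around the addition.  Returns the number of move insns. */
static unsigned add_via_r18(regfile *f)
{
    f->moves = 0;
    mov_hi(f, 18, 24);   /* insn 23 */
    add_hi(f, 18, 1);    /* insn 6  */
    mov_qi(f, 24, 18);   /* insn 19 */
    mov_qi(f, 25, 19);   /* insn 20 */
    return f->moves;
}

/* Desired allocation: pseudo 45 lives in r24, no moves needed. */
static unsigned add_in_r24(regfile *f)
{
    f->moves = 0;
    add_hi(f, 24, 1);
    return f->moves;
}
```

In this model both variants leave the same value in r24/r25, but add_via_r18 reports three move insns where add_in_r24 reports none.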

Question: Is IRA supposed to detect SUBREGs like above and avoid code bloat? Sequences like

(insn 19 6 20 2 (set (reg:QI 24 r24) (subreg:QI (reg:HI 45) 0)) add.c:4 18 {movqi_insn} (nil))

(insn 20 19 14 2 (set (reg:QI 25 r25 [+1 ])
         (subreg:QI (reg:HI 45) 1)) add.c:4 18 {movqi_insn}
      (expr_list:REG_DEAD (reg:HI 45)

evidently create some kind of early-clobber situation for IRA that prevents
HI:45 from being allocated to HI:24.

Is IRA a school book implementation that does not know anything about SUBREGs?

No, it is not a school book implementation.

Or should IRA be smart enough to detect and allocate SUBREGs efficiently by
some "subreg fusion" mechanism?

No, it is not smart enough.

IRA deals well with subregs of multi-register pseudos but not with subregs of one-register pseudos.

By the way, the old register allocator did not deal with subregs at all.

The code above is just a small example to show the problem, but the issue also
occurs with more complex code and not only for return and parameter registers.

Thanks for reporting this. I might work on this, but I don't know when I can start. This platform is not high on my priority list.
