reload bugfix

Jeffrey A Law law@cygnus.com
Sat Aug 19 11:38:00 GMT 2000


Aldy -- it's probably worth your time to walk through this -- while parts of
it may be over your head right now, it may give you some insights into
reload's operation.

execute/950704-1.c is failing on an ARM variant that Red Hat is adding to
GCC.


In the .lreg dump we have the following insn:

(insn 36 35 37 (parallel[
            (set (reg:CC_NOOV 24 cc)
                (compare:CC_NOOV (ior:SI (lshiftrt:SI (subreg:SI (reg/v:DI 49) 
1)
                            (const_int 31 [0x1f]))
                        (reg:SI 55))
                    (const_int 0 [0x0])))
            (clobber (scratch:SI))
        ] ) 308 {*arith_shiftsi_compare0_scratch} (insn_list 33 (nil))
    (expr_list:REG_UNUSED (scratch:SI)
        (expr_list:REG_DEAD (reg:SI 55)
            (expr_list:REG_DEAD (reg/v:DI 49)
                (nil)))))

local-alloc & global-alloc finish, then reload starts...  Fun Fun...

We get the following reloads for insn 36:
Reloads for insn # 36
Reload 0: GENERAL_REGS, RELOAD_FOR_OUTPUT (opnum = 0)
        reload_out_reg: (scratch:SI)
Reload 1: reload_in (DI) = (reg/v:DI 30 mv3)
        GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 4)
        reload_in_reg: (reg/v:DI 30 mv3)
        reload_reg_rtx: (reg:DI 0 r0)
Reload 2: reload_in (SI) = (subreg:SI (reg/v:DI 30 mv3) 1)
        reload_out (SI) = (scratch:SI)
        GENERAL_REGS, RELOAD_OTHER (opnum = 4)
        reload_in_reg: (subreg:SI (reg/v:DI 30 mv3) 1)
        reload_out_reg: (scratch:SI)
        reload_reg_rtx: (reg:SI 2 r2)

Reload #1 is for our scratch register.

Reload #2 is reloading the inner part of the subreg expression because
(reg:DI mv3) is larger than a target word but mv3 holds the entire DImode
value in a single register.

Reload #3 is for the entire subreg expression.  Presumably because the
insn's constraints do not allow mv3.

The insn chain looks like this immediately after insertion of the reload
insns (before replacements):

(insn 109 35 106 (set (reg:SI 2 r2)
        (subreg:SI (reg/v:DI 30 mv3) 1)) -1 (nil)
    (nil))

(insn 106 109 36 (set (reg:DI 0 r0)
        (reg/v:DI 30 mv3)) -1 (nil)
    (nil))

(insn 36 106 37 (parallel[ 
            (set (reg:CC_NOOV 24 cc)
                (compare:CC_NOOV (ior:SI (lshiftrt:SI (subreg:SI (reg/v:DI 30 mv3) 1)
                            (const_int 31 [0x1f]))
                        (reg:SI 3 r3))
                    (const_int 0 [0x0])))
            (clobber (scratch:SI))
        ] ) 308 {*arith_shiftsi_compare0_scratch} (insn_list 33 (nil))
    (expr_list:REG_UNUSED (reg:SI 2 r2)
        (expr_list:REG_DEAD (reg:SI 3 r3)
            (expr_list:REG_DEAD (reg/v:DI 30 mv3)
                (nil)))))

Seems reasoanble, huh?

subst_reloads turns that into:

(insn 109 35 106 (set (reg:SI 2 r2)
        (subreg:SI (reg:DI 0 r0) 1)) -1 (nil)
    (nil))

(insn 106 109 36 (set (reg:DI 0 r0)
        (reg/v:DI 30 mv3)) -1 (nil)
    (nil))

(insn 36 106 37 (parallel[ 
            (set (reg:CC_NOOV 24 cc)
                (compare:CC_NOOV (ior:SI (lshiftrt:SI (reg:SI 2 r2)
                            (const_int 31 [0x1f]))
                        (reg:SI 3 r3))
                    (const_int 0 [0x0])))
            (clobber (reg:SI 2 r2))
        ] ) 308 {*arith_shiftsi_compare0_scratch} (insn_list 33 (nil))
    (expr_list:REG_UNUSED (reg:SI 2 r2)
        (expr_list:REG_DEAD (reg:SI 3 r3)
            (expr_list:REG_DEAD (reg/v:DI 30 mv3)
                (nil)))))

Eeek.  This is bad -- the value in r0 is not what we want as the source for
insn 109 since r0 hasn't been assigned yet!

The reload which created insn 109 was created inside push_reload by this code:

  if (in != 0 && GET_CODE (in) == SUBREG
      && (CONSTANT_P (SUBREG_REG (in))
          || (GET_CODE (SUBREG_REG (in)) == REG
              && REGNO (SUBREG_REG (in)) < FIRST_PSEUDO_REGISTER
              && (! HARD_REGNO_MODE_OK (REGNO (SUBREG_REG (in))
                                        + SUBREG_WORD (in),
                                        inmode)
                  || (GET_MODE_SIZE (inmode) <= UNITS_PER_WORD
                      && (GET_MODE_SIZE (GET_MODE (SUBREG_REG (in)))
                          > UNITS_PER_WORD)
                      && ((GET_MODE_SIZE (GET_MODE (SUBREG_REG (in)))
                           / UNITS_PER_WORD)
                          != HARD_REGNO_NREGS (REGNO (SUBREG_REG (in)),
                                               GET_MODE (SUBREG_REG (in)))))))))
    {
      /* This relies on the fact that emit_reload_insns outputs the
         instructions for input reloads of type RELOAD_OTHER in the same
         order as the reloads.  Thus if the outer reload is also of type
         RELOAD_OTHER, we are guaranteed that this inner reload will be
         output before the outer reload.  */
      push_reload (SUBREG_REG (in), NULL_RTX, &SUBREG_REG (in), NULL_PTR,
                   find_valid_class (inmode, SUBREG_WORD (in)),
                   VOIDmode, VOIDmode, 0, 0, opnum, type);
      dont_remove_subreg = 1;    }

At this point TYPE is RELOAD_FOR_INPUT, which means that the inner and
outer parts of the subreg will have type RELOAD_FOR_INPUT.  Seems reasonable
since that would result in the inner part of the subreg being handled first,
then the outer part.   But obviously that's not happening -- someone
modifies the type of the reload for the outer part of the subreg.

combine_reloads combines reload #1 and #3, which results in the type
for reload #3 turning into RELOAD_OTHER. 

Now we've got the inner part as a RELOAD_INPUT (insn 106) and the outer part
of the SUBREG as RELOAD_OTHER (insn 109).

RELOAD_OTHER reloads are output before RELOAD_INPUT reloads, resulting in
incorrect code -- we needed to have the inner part handled first.


I see two obvious ways to fix this problem.

First, we could have combine_reloads not combine reloads if the input
reload is a subreg in which the inner part will need reloading.

Another would be to simply widen the scope of the inner reload to
RELOAD_OTHER in push_reload.

I've done both.  Both solve the problem.  I'm just unsure which is the
better solution from a code generation standpoint.

jeff





More information about the Gcc-patches mailing list