PR 35232: Recent regression due to reload inheritance bug

Thu Mar 13 18:03:00 GMT 2008

Hi Ulrich,

Thanks for the review, and for wading through this horrible code with me.

"Ulrich Weigand" <uweigand@de.ibm.com> writes:
> emit_input_reload_insns has an initialization of reloadreg:
>    rtx reloadreg = rl->reg_rtx;
> which is now dead and should be removed.

Oops!  Good catch, thanks.

>>@@ -7194,7 +7162,7 @@ emit_output_reload_insns (struct insn_ch
>> 		     made; leave new_spill_reg_store alone.  */
>> 		  ;
>> 		else if (s >= 0
>>-			 && SET_SRC (set) == rl->reg_rtx
>>+			 && SET_SRC (set) == rl_reg_rtx
>> 			 && SET_DEST (set) == rld[s].reg_rtx)
>
> What about cases where the secondary reload register also had
> its mode adjusted?  Can this actually happen?

Well, the following adjustment to the testcase creates an in/out
reload with miosmatched modes and secondary reloads in each
direction:

-------------------------------------------------------------------------
/* { dg-do run } */
/* { dg-mips-options "-O" } */

unsigned int
f1 (void)
{
  register unsigned long long x asm ("$f4");
  register unsigned int r asm ("$f2");
  asm ("# %0" : "=f" (x));
  asm ("# %0" : "=a" (r) : "0" (x));
  asm ("# %0" : "=h" (r) : "0" (r));
  return r;
}

int
main (void)
{
  return f1 () != 4;
}
-------------------------------------------------------------------------

(This is a 64-bit testcase, whereas the original miscompilation was
32-bit only.  I can't think of a way to get the same kind of secondary
reloads in 32-bit mode.)

But (as one would hope) the secondary reloads have the natural modes
for their respective directions:

Reloads for insn # 6
Reload 0: GR_REGS, RELOAD_FOR_OTHER_ADDRESS (opnum = 0), can't combine, secondar
y_reload_p
        reload_reg_rtx: (reg:DI 2 $2)
Reload 1: GR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, seconda
ry_reload_p
        reload_reg_rtx: (reg:SI 2 $2)
Reload 2: reload_in (DI) = (reg/v:DI 36 $f4 [ x ])
        reload_out (SI) = (reg/v:SI 34 $f2 [ r ])
        ACC_REGS, RELOAD_OTHER (opnum = 0)
        reload_in_reg: (reg/v:DI 36 $f4 [ x ])
        reload_out_reg: (reg/v:SI 34 $f2 [ r ])
        reload_reg_rtx: (reg:DI 64 hi)
        secondary_in_reload = 0, secondary_out_reload = 1

AFAICT, push_secondary_reload always chooses the inmode of the parent
reload for inputs and the outmode of the parent reload for outputs,
so I'm not sure under what conditions:

	      if (GET_MODE (reloadreg) != mode)
		reloadreg = reload_adjust_reg_for_mode (reloadreg, mode);

would trigger.  (This call is only used for secondary reload moves,
not target-specific insns, and has been there in one form or another
since the beginning of the CVS history.)

As written, the final "else" of that statement is already required to
cope correctly with "s >= 0".  There is no guarantee that a secondary
reload will be either:

  (a) (set rld[s].reg_rtx rl_reg_rtx) or
  (b) not a single set at all

E.g., for MIPS, the secondary reload is actually a (set ... (unspec ...)).
reload_out* define_expands could also emit a SET other than (a).  So on
that basis:

  - The "else if" branch above should just be an optimisation

  - If the code is wrong, it shouldn't be any more wrong after my patch.
    The patch doesn't change the situations in which we adjust the
    secondary reload register (which, AFAICT, we probably never need to).

I agree -- if this is what you're thinking -- that it's far from obvious
that the original code is correct.  I assume the "not a single SET" part
exists because a reload_out* pattern might use UNSPEC_VOLATILEs, or some
other such thing whose semantics cannot be predicted by reload, and we
do not want to risk deleting it even if the result of the output reload
is found to be dead.  This should not happen for a normal move, whose
semantics ought to be just that: a move.  But if that's true, what about
reload_out* expanders that create a (set ... (unspec_volatile ...))?
And why do we not store to new_spill_reg_store[REGNO (rl_reg_rtx)] in
the "else if" you quote above, if the "else" really is correct for
"s >= 0" single sets?

However, going back to the "this isn't worse or better after my patch",
I think it would be best to deal with this as a separate issue.
Especially given that I'm still hoping to get this patch into 4.3. ;)

>
>>   old = rl->out_reg;
>>   if (old == 0
>>-      || rl->reg_rtx == old
>>-      || rl->reg_rtx == 0)
>>+      || reg_rtx == old
>>+      || reg_rtx == 0)
>>     return;
>
> I'm wondering whether == compares actually work here (and elsewhere),
> as reload_adjust_reg_for_mode may allocate fresh rtx's ...

I wondered about that too, but I don't think "rl->reg_rtx != old" ever
implied "!rtx_equal_p (rl->reg_rtx, old)" before the patch.  I think
this is just an "early out" for speed reasons, in which case using
rtx_equal_p might defeat the object.

>>@@ -7414,9 +7449,13 @@ emit_reload_insns (struct insn_chain *ch
>> 
>>   for (j = 0; j < n_reloads; j++)
>>     {
>>-      if (rld[j].reg_rtx
>>-	  && REGNO (rld[j].reg_rtx) < FIRST_PSEUDO_REGISTER)
>>-	new_spill_reg_store[REGNO (rld[j].reg_rtx)] = 0;
>>+      if (rld[j].reg_rtx && HARD_REGISTER_P (rld[j].reg_rtx))
>>+	{
>>+	  unsigned int i;
>>+
>>+	  for (i = REGNO (rld[j].reg_rtx); i < END_REGNO (rld[j].reg_rtx); i++)
>>+	    new_spill_reg_store[i] = 0;
>>+	}
>
> Maybe just memset'ing new_spill_reg_store to zero at the start
> of emit_reload_insns would be clearer?

TBH, I'd wondered about that too, but the original authors obviously
wanted to touch as little of the new_spill_reg_store array as possible.
I decided in the end to avoid changing peripheral decisions like that
as part of this patch.  (Again, this was partly a "for 4.3" thing.)

I suppose the thinking was that it was better not to dirty the whole
new_spill_reg_store array on targets with large register files when
only a very small number are actually set or used.

>>+	  /* Maybe the spill reg contains a copy of reload_out.  */
>>+	  if (rld[r].out != 0
>>+	      && (REG_P (rld[r].out)
>> #ifdef AUTO_INC_DEC
>>-		      || ! rld[r].out_reg
>>+		  || ! rld[r].out_reg
>> #endif
>>-		      || REG_P (rld[r].out_reg)))
>>+		  || REG_P (rld[r].out_reg)))
>>+	    {
>>+	      rtx src;
>>+	      enum machine_mode mode;
>>+
>>+	      src = reload_reg_rtx_for_output[r];
>>+	      mode = GET_MODE (src);
>>+	      i = REGNO (src);
>>+	      nr = hard_regno_nregs[i][mode];
>
> Redefining i and nr in the middle of this function strikes me as a
> bit confusing, maybe it would be better to use different variable
> names here ...

OK, I'll change it to use new local variables.  (I'm afraid this
was another sop to 4.3, to reduce the number of changes neeeded.
The justification was that the variables are still holding the same
conceptual information; we're just narrowing the range for the case
in hand.)

>>+	  /* Maybe the spill reg contains a copy of reload_in.  Only do
>>+	     something if there will not be an output reload for
>>+	     the register being reloaded.  */
>>+	  else if (rld[r].out_reg == 0
>>+		   && rld[r].in != 0
>>+		   && ((REG_P (rld[r].in)
>>+			&& !HARD_REGISTER_P (rld[r].in)
>>+			&& !REGNO_REG_SET_P (&reg_has_output_reload,
>>+					     REGNO (rld[r].in)))
>>+		       || (REG_P (rld[r].in_reg)
>>+			   && !REGNO_REG_SET_P (&reg_has_output_reload,
>>+						REGNO (rld[r].in_reg))))
>>+		   && !reg_set_p (reload_reg_rtx_for_input[r], PATTERN (insn)))
>>+	    {
>>+	      rtx dest;
>>+	      enum machine_mode mode;
>>+
>>+	      dest = reload_reg_rtx_for_input[r];
>
> Why do you call the variable "dest" here and not "src" as above?
> It is still the (potential) source of a future reload inheritance ...

If you look at each reload as a black box, it's really just a move, and
this code is trying to record that move for later.  So I was naming SRC
and DEST after their position in that move.

>>@@ -7687,17 +7718,18 @@ emit_reload_insns (struct insn_chain *ch
>> 	 it thinks only about the original insn.  So invalidate it here.
>> 	 Also do the same thing for RELOAD_OTHER constraints where the
>> 	 output is discarded.  */
>>-      if (i < 0 
>>-	  && ((rld[r].out != 0
>>-	       && (REG_P (rld[r].out)
>>-		   || (MEM_P (rld[r].out)
>>+      else if (i < 0
>>+	       && ((rld[r].out != 0
>>+		    && (REG_P (rld[r].out)
>>+			|| (MEM_P (rld[r].out)
>>+			    && REG_P (rld[r].out_reg))))
>
> Why is the "else" necessary?

That was part of the (rather misguided?) reassignments of "i" and "nr".
Changing this to "else" meant that we'd never refer to the original
"i" after "narrowing" it.  It wouldn't have made any difference in
practice; I just thought this was clearer.  (I'd argue it's a little
clearer on its own, but without the additional motivation of the
in-place narrowing, I'm happy to leave it out.)

Does the rest of the message convince you?  If so, I'll prepare
a version of the patch with new local vars and without the
s/if/else if/.

Richard