[RFC][PATCH,reload] PR middle-end/42431, fix latent bug in gen_reload_chain_without_interm_reg_p()

Peter Bergner bergner@vnet.ibm.com
Wed Feb 24 19:51:00 GMT 2010


There is a latent bug in gen_reload_chain_without_interm_reg_p() which is
exposed by the testcase in PR42431.  Before IRA, we have the following insns:

(insn 203 2 213 2 (set (reg/f:DI 256 [ vect_pk.44 ])
        (plus:DI (reg/f:DI 255)
            (const_int 16 [0x10]))) 83 {*adddi3_internal1} (expr_list:REG_EQUAL (const:DI (plus:DI (symbol_ref:DI ("*.LANCHOR2") [flags 0x182]) (const_int 16 [0x10])))
        (nil)))
 ...
(insn 21 214 25 2 pr42431.f:27 (set (reg:DI 270)
        (const_int 16 [0x10])) 371 {*movdi_internal64} (nil))
...
(insn 20 17 23 3 pr42431.f:27 (set (mem:V4SI (reg/f:DI 256 [ vect_pk.44 ]) [4 S16 A128])
        (reg:V4SI 269)) 926 {*altivec_movv4si} (expr_list:REG_EQUAL (const_vector:V4SI [
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
            ])
        (nil)))

(insn 23 20 232 3 pr42431.f:27 (set (mem:V4SI (plus:DI (reg/f:DI 256 [ vect_pk.44 ])
                (reg:DI 270)) [4 S16 A128])
        (reg:V4SI 269)) 926 {*altivec_movv4si} (expr_list:REG_EQUAL (const_vector:V4SI [
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
                (const_int 0 [0x0])
            ])
        (nil)))

During IRA, pseudos 256 and 270 are marked for spilling and we eventually
call into gen_reload_chain_without_interm_reg_p() with the following two
"chained" reloads:

  *) (const:DI (plus:DI (symbol_ref:DI ("*.LANCHOR2") [flags 0x182]) (const_int 16 [0x10])))
  *) (const_int 16 [0x10])

As part of its analysis, g_r_c_w_i_r_p() creates a scratch copy of the
larger reload rtx with copy_rtx() so that it can modify it to run some
tests on, before throwing it away.  The bug is that g_r_c_w_i_r_p()
expects copy_rtx() to return a true scratch copy we can modify at will and
in any way we wish with no effects on the program's rtl.  However, copy_rtx()
doesn't copy rtxs that are sharable, instead it just returns the object
we passed in.  In this specific case, we pass in a CONST rtx which is
sharable, so it just returns without making a copy which means we end up
modifying the program's rtl rather than some scratch rtl.

At first, I thought if I added an early exit for reloads that are CONST rtxs
like the patch below, that it would solve the problem:

    --- reload1.c   (revision 156816)
    +++ reload1.c   (working copy)
    @@ -5221,6 +5221,10 @@ gen_reload_chain_without_interm_reg_p (i
           r1 = r2;
           r2 = n;
         }
    +
    +  if (GET_CODE (rld[r1].in) == CONST)
    +    return true;
    +
       gcc_assert (reg_mentioned_p (rld[r2].in, rld[r1].in));
       regno = rld[r1].regno >= 0 ? rld[r1].regno : rld[r2].regno;
       gcc_assert (regno >= 0);

...and it does for this specific test case (actually, it bootstrapped and
regtested on powerpc64-linux with no regressions).  However, it doesn't
handle a CONST rtx that is buried within the reload and it doesn't handle
other rtx types that are sharable.  Maybe those cases aren't possible?

Does anyone have an opinion on how this should be fixed?  Is the patch
above OK, or should we create a copy_rtx_unshared() function that truly
does create a copy of the reload rtx with no sharing?  Or a something all
together different type of fix?

Peter






More information about the Gcc-patches mailing list