This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC][PATCH,reload] PR middle-end/42431, fix latent bug in gen_reload_chain_without_interm_reg_p()


On Thu, Feb 25, 2010 at 6:44 AM, Jeff Law <law@redhat.com> wrote:
> On 02/24/10 11:59, Peter Bergner wrote:
>>
>> During IRA, pseudos 256 and 270 are marked for spilling and we eventually
>> call into gen_reload_chain_without_interm_reg_p() with the following two
>> "chained" reloads:
>>
>> ? *) (const:DI (plus:DI (symbol_ref:DI ("*.LANCHOR2") [flags 0x182])
>> (const_int 16 [0x10])))
>> ? *) (const_int 16 [0x10])
>>
>> As part of its analysis, g_r_c_w_i_r_p() creates a scratch copy of the
>> larger reload rtx with copy_rtx() so that it can modify it to run some
>> tests on, before throwing it away. ?The bug is that g_r_c_w_i_r_p()
>> expects copy_rtx() to return a true scratch copy we can modify at will and
>> in any way we wish with no effects on the program's rtl. ?However,
>> copy_rtx()
>> doesn't copy rtxs that are sharable, instead it just returns the object
>> we passed in. ?In this specific case, we pass in a CONST rtx which is
>> sharable, so it just returns without making a copy which means we end up
>> modifying the program's rtl rather than some scratch rtl.
>>
>> At first, I thought if I added an early exit for reloads that are CONST
>> rtxs
>> like the patch below, that it would solve the problem:
>>
>> ? ? --- reload1.c ? (revision 156816)
>> ? ? +++ reload1.c ? (working copy)
>> ? ? @@ -5221,6 +5221,10 @@ gen_reload_chain_without_interm_reg_p (i
>> ? ? ? ? ? ?r1 = r2;
>> ? ? ? ? ? ?r2 = n;
>> ? ? ? ? ?}
>> ? ? +
>> ? ? + ?if (GET_CODE (rld[r1].in) == CONST)
>> ? ? + ? ?return true;
>> ? ? +
>> ? ? ? ?gcc_assert (reg_mentioned_p (rld[r2].in, rld[r1].in));
>> ? ? ? ?regno = rld[r1].regno>= 0 ? rld[r1].regno : rld[r2].regno;
>> ? ? ? ?gcc_assert (regno>= 0);
>>
>> ...and it does for this specific test case (actually, it bootstrapped and
>> regtested on powerpc64-linux with no regressions). ?However, it doesn't
>> handle a CONST rtx that is buried within the reload and it doesn't handle
>> other rtx types that are sharable. ?Maybe those cases aren't possible?
>>
>> Does anyone have an opinion on how this should be fixed? ?Is the patch
>> above OK, or should we create a copy_rtx_unshared() function that truly
>> does create a copy of the reload rtx with no sharing? ?Or a something all
>> together different type of fix?
>>
>
> What a mess. ? ?When I speak of hunks of reload that need to die, all the
> reload scheduling and conflict detection/resolution is high on that list
> (probably just after the reload inheritance code). ?Anyway...
>
>
> I think we can have an embedded const within a mem... ?ie, the longer rtx
> might look like
>
> (plus (mem (const (plus (symbol_ref) (const_int))) (const_int)
>
>
>
> Which could lead to the exact same problem we're seeing right now. ?So I see
> the following solutions.
>
>
> ?1. Create a true scratch copy.
>
> ?2. Undo the substitution. ?This wouldn't be that hard. ?We just accumulate
> a record of the substitutions made by substitute and use those records to
> undo the substitution.
>
> ?3. Find some way to rethink the code not to need the substitution.
>
>
>
> I didn't come up with anything clean for #3.
>
> #1 is very easy, but will generate more garbage RTL which we can't free
> until reload is complete. ?Depending on how often these checks are being
> performed, it could be a substantial memory leak for the duration of reload.
>
> #2 is relatively simple I think. ?I think recording in substitute is safer
> than just replacing references to the reload reg with the original value of
> rld[2].in, just in case the reload reg is used for some other purpose in the
> insn.

I would indeed like #3 most (or alternatively deep-copy shared rtx
at substitution time to avoid memory issues with #1).  That would
require re-structuring of substitute() to for example get an additional
pointer to the root of a shared rtx we encounter during recursing.
When we'd about to substitute with a non-NULL root we'd
create a true copy of it, replace the root with it and start over
recursing with the copy (but setting the root parm to NULL).
Maybe that way we could even avoid the unconditional copy_rtx
in gen_reload_chain_without_interm_reg_p? (though that would
require measuring if its worth it)

#2 adds more code to code that we want to go away ...

Richard.
>
> Comments?
>
> Jeff
>
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]