[RFC][PATCH,reload] PR middle-end/42431, fix latent bug in gen_reload_chain_without_interm_reg_p()

Fri Feb 26 10:48:00 GMT 2010

On Thu, Feb 25, 2010 at 7:16 PM, Jeff Law <law@redhat.com> wrote:
> On 02/25/10 03:44, Richard Guenther wrote:
>>
>>> I think we can have an embedded const within a mem...  ie, the longer rtx
>>> might look like
>>>
>>> (plus (mem (const (plus (symbol_ref) (const_int))) (const_int)
>>>
>>>
>>>
>>> Which could lead to the exact same problem we're seeing right now.  So I
>>> see
>>> the following solutions.
>>>
>>>
>>>  1. Create a true scratch copy.
>>>
>>>  2. Undo the substitution.  This wouldn't be that hard.  We just
>>> accumulate
>>> a record of the substitutions made by substitute and use those records to
>>> undo the substitution.
>>>
>>>  3. Find some way to rethink the code not to need the substitution.
>>>
>>>
>>>
>>> I didn't come up with anything clean for #3.
>>>
>>> #1 is very easy, but will generate more garbage RTL which we can't free
>>> until reload is complete.  Depending on how often these checks are being
>>> performed, it could be a substantial memory leak for the duration of
>>> reload.
>>>
>>> #2 is relatively simple I think.  I think recording in substitute is
>>> safer
>>> than just replacing references to the reload reg with the original value
>>> of
>>> rld[2].in, just in case the reload reg is used for some other purpose in
>>> the
>>> insn.
>>>
>>
>> I would indeed like #3 most (or alternatively deep-copy shared rtx
>> at substitution time to avoid memory issues with #1).  That would
>> require re-structuring of substitute() to for example get an additional
>> pointer to the root of a shared rtx we encounter during recursing.
>> When we'd about to substitute with a non-NULL root we'd
>> create a true copy of it, replace the root with it and start over
>> recursing with the copy (but setting the root parm to NULL).
>> Maybe that way we could even avoid the unconditional copy_rtx
>> in gen_reload_chain_without_interm_reg_p? (though that would
>> require measuring if its worth it)
>>
>> #2 adds more code to code that we want to go away ...
>>
>
> I just don't see how we're going to get #3.  What we need to do is build up
> an insn and see if it's recognized so that we can predict the behavior of a
> later part of reload (which builds up an insn and tries to recognize it) and
> adjust accordingly (hence my comment about why all this needs to go away
> someday).
>
> #1 is clearly throw-away code.  Nothing else I'm aware of needs to be able
> to make copies of shared RTL.    However, it has the advantage of being
> extremely simple.  We just introduce a new copy_rtx variant that blindly
> copies everything and use that variant in this code.  Doing a hybrid where
> we use the normal copy_rtx and backtrack if we wanted to substitute into
> something that was shared is just fugly.
>
> #2 really isn't that hard.  We just have a vector of locations where we made
> a substitution, then we back them out when the routine is complete.  In
> fact, we don't have to generate gobs of RTL for this case.   I think it
> looks something like the attached patch.    In fact, we can probably remove
> the copy_rtx call in this case.

I like it.

Richard.

> jeff
>
>
>
>
>
>