This is the mail archive of the
mailing list for the GCC project.
Re: [RFC][PATCH,reload] PR middle-end/42431, fix latent bug in gen_reload_chain_without_interm_reg_p()
On Thu, Feb 25, 2010 at 7:16 PM, Jeff Law <firstname.lastname@example.org> wrote:
> On 02/25/10 03:44, Richard Guenther wrote:
>>> I think we can have an embedded const within a mem... ?ie, the longer rtx
>>> might look like
>>> (plus (mem (const (plus (symbol_ref) (const_int))) (const_int)
>>> Which could lead to the exact same problem we're seeing right now. ?So I
>>> the following solutions.
>>> ?1. Create a true scratch copy.
>>> ?2. Undo the substitution. ?This wouldn't be that hard. ?We just
>>> a record of the substitutions made by substitute and use those records to
>>> undo the substitution.
>>> ?3. Find some way to rethink the code not to need the substitution.
>>> I didn't come up with anything clean for #3.
>>> #1 is very easy, but will generate more garbage RTL which we can't free
>>> until reload is complete. ?Depending on how often these checks are being
>>> performed, it could be a substantial memory leak for the duration of
>>> #2 is relatively simple I think. ?I think recording in substitute is
>>> than just replacing references to the reload reg with the original value
>>> rld.in, just in case the reload reg is used for some other purpose in
>> I would indeed like #3 most (or alternatively deep-copy shared rtx
>> at substitution time to avoid memory issues with #1). ?That would
>> require re-structuring of substitute() to for example get an additional
>> pointer to the root of a shared rtx we encounter during recursing.
>> When we'd about to substitute with a non-NULL root we'd
>> create a true copy of it, replace the root with it and start over
>> recursing with the copy (but setting the root parm to NULL).
>> Maybe that way we could even avoid the unconditional copy_rtx
>> in gen_reload_chain_without_interm_reg_p? (though that would
>> require measuring if its worth it)
>> #2 adds more code to code that we want to go away ...
> I just don't see how we're going to get #3. ?What we need to do is build up
> an insn and see if it's recognized so that we can predict the behavior of a
> later part of reload (which builds up an insn and tries to recognize it) and
> adjust accordingly (hence my comment about why all this needs to go away
> #1 is clearly throw-away code. ?Nothing else I'm aware of needs to be able
> to make copies of shared RTL. ? ?However, it has the advantage of being
> extremely simple. ?We just introduce a new copy_rtx variant that blindly
> copies everything and use that variant in this code. ?Doing a hybrid where
> we use the normal copy_rtx and backtrack if we wanted to substitute into
> something that was shared is just fugly.
> #2 really isn't that hard. ?We just have a vector of locations where we made
> a substitution, then we back them out when the routine is complete. ?In
> fact, we don't have to generate gobs of RTL for this case. ? I think it
> looks something like the attached patch. ? ?In fact, we can probably remove
> the copy_rtx call in this case.
I like it.