[PATCH] Fix up RTL DCE find_call_stack_args (PR rtl-optimization/89965)
Fri Apr 12 15:29:00 GMT 2019
On 4/11/19 1:54 AM, Richard Biener wrote:
> On Thu, 11 Apr 2019, Jakub Jelinek wrote:
>> The following testcase is miscompiled, because the result of
>> a DImode (doubleword) right shift is used in a QImode subreg as well as
>> later on pushed into a stack slot as an argument to a const function
>> whose result isn't used. The RA because of the weirdo target tuning
>> reuses the REG_EQUIV argument slot (as it wants the value in a non-Q_REGS
>> register), so it first pushes it to the stack slot and then loads from the
>> stack slot again (according to Vlad, this can happen with both LRA and old
>> reload). Later on during DCE we determine the call is not needed and try to
>> find all the argument pushes and don't mark those as needed, so we
>> effectively DCE the right shift, push to the argument slot as well as
>> other slots and the call, and end up keeping just the load from the
>> uninitialized argument slot.
>> The following patch just punts if we find loads from stack slots in between
>> where they are pushed and the const/pure call. In addition to that, I've
>> outlined the same largish sequence that had 3 copies in the function already
>> and I'd need to add 4th copy, so in the end the dce.c changes are removing
>> more than adding:
>> 1 file changed, 118 insertions(+), 135 deletions(-)
> The refactoring itself is probably OK (and if done separately
> makes reviewing easier).
>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>> During those 2 bootstraps/regtests, data.load_found has been set just
>> on the new testcase on ia32.
> Hmm, I wonder whether we really need to DCE calls after reload?
> That said, I'm not familiar enough with the code to check if the
> patch makes sense (can there ever be uses of the argument slots
> _after_ the call?).
We've been through the "who owns the argument slots, caller or callee"
discussion a few times and I don't think we've ever reached any kind of
conclusion in the general case.
I think this case side-steps the general case.
We've got an argument slot for a const/pure call. Because of the
const/pure designation the caller can assume the callee doesn't modify
the argument slot. That may in turn allow the caller to place a
REG_EQUIV note on the store to the slot.
Once that happens we can replace one form with the other. So we could
well end up with a use of the slot after the call.
More information about the Gcc-patches