[PATCH] aarch64: Don't generate invalid zero/sign-extend syntax
Fri Aug 21 09:28:13 GMT 2020
Alex Coplan <Alex.Coplan@arm.com> writes:
> Hi Richard,
>> -----Original Message-----
>> From: Richard Sandiford <firstname.lastname@example.org>
>> Sent: 18 August 2020 09:35
>> To: Alex Coplan <Alex.Coplan@arm.com>
>> Cc: email@example.com; Richard Earnshaw <Richard.Earnshaw@arm.com>;
>> Marcus Shawcroft <Marcus.Shawcroft@arm.com>; Kyrylo Tkachov
>> Subject: Re: [PATCH] aarch64: Don't generate invalid zero/sign-extend
>> Alex Coplan <firstname.lastname@example.org> writes:
>> > Note that an obvious omission here is that this patch does not touch
>> > mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found
>> > that I couldn't hit these patterns with C code since multiplications by
>> > powers of two always get turned into shifts by earlier RTL passes. If
>> > there's a way to reliably hit these patterns, then perhaps these should
>> > be updated as well.
>> Hmm. Feels like we should either update them or delete them. E.g.:
>> were added alongside the adds3.c and subs3.c tests that you're updating,
>> so if the tests don't/no longer need the multp2 patterns to pass,
>> there's a good chance that the patterns are redundant.
>> For reasons I never understood, the canonical representation is to use
>> (mult …) for powers of 2 inside a (mem …) but shifts outside of (mem …)s.
>> So perhaps the patterns were originally for address calculations that had
>> been moved outside of a (mem …) and not updated to shifts instead of
> Thanks for the review, and for clarifying this. I tried removing these
> together with the *_mul_imm_ patterns (e.g. *adds_mul_imm_<mode>) and
> the only failure was gcc/testsuite/gcc.dg/torture/pr34330.c when
> compiled with -Os -ftree-vectorize which appears to depend on the
> *add_mul_imm_di pattern. Without this pattern, we ICE in LRA on this
> In this case, GCC appears to have done exactly what you described: we
> have the (plus (mult ...) ...) nodes inside (mem)s prior to register
> allocation, and then we end up pulling these out without converting them
> to shifts.
> Seeing this behaviour (and in particular seeing the ICE) makes me
> hesitant to just go ahead and remove the other patterns. That said, I
> have a patch to remove the following patterns:
> (together with the predicate aarch64_pwr_imm3 which is only used in
> these patterns) and this bootstraps/regtests just fine.
> So, I have a couple of questions:
> (1) Should it be considered a bug if we pull (plus (mult (power of 2)
> ...) ...) out of a (mem) RTX without re-writing the (mult) as a
IMO, yes. But if we have an example in which it happens, we have
to fix it before removing the patterns. That could end up being
a bit of a rabbit hole, and could affect other targets too.
If we keep the patterns, we should fix the [su]xtw problem in:
too. (Plus any others I missed, if that isn't the full list.)
More information about the Gcc-patches