[PATCH] [PR rtl-optimization/97249]Simplify vec_select of paradoxical subreg.
Richard Sandiford
richard.sandiford@arm.com
Mon Oct 19 15:31:15 GMT 2020
Hongtao Liu <crazylht@gmail.com> writes:
> On Thu, Oct 15, 2020 at 8:38 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Hongtao Liu via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> > + /* Simplify vec_select of a subreg of X to just a vec_select of X
>> > + when X has same component mode as vec_select. */
>> > + int l2;
>> > + if (GET_CODE (trueop0) == SUBREG
>> > + && GET_MODE_INNER (mode)
>> > + == GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0)))
>>
>> Better to use SUBREG_REG here and below.
>>
>
> Yes and changed.
>
>> > + && (GET_MODE_NUNITS (GET_MODE (trueop0))).is_constant (&l0)
>> > + && (GET_MODE_NUNITS (mode)).is_constant (&l1)
>> > + && (GET_MODE_NUNITS (GET_MODE (XEXP (trueop0, 0))))
>> > + .is_constant (&l2)
>> > + && known_le (l1, l2))
>> > + {
>> > + unsigned HOST_WIDE_INT subreg_offset = 0;
>> > + gcc_assert (known_eq (XVECLEN (trueop1, 0), l1));
>> > + gcc_assert (can_div_trunc_p (exact_div (subreg_lsb (trueop0), BITS_PER_UNIT),
>> > + GET_MODE_SIZE (GET_MODE_INNER (mode)),
>> > + &subreg_offset));
>>
>> can_div_trunc_p discards the remainder, whereas it looks like here
>> you want an exact multiple.
>>
>> I don't think it's absolutely guaranteed that the “if” condition makes
>> the division by GET_MODE_SIZE exact. E.g. in principle you could have
>> a subreg of a vector of TIs in which the subreg offset is misaligned by
>> a DI offset.
>>
>> I'm not sure the subreg_lsb conversion is correct though. On big-endian
>> targets, lane numbering follows memory layout, just like subreg byte
>> offsets do. So ISTM that using SUBREG_BYTE (as per the earlier patch)
>> was correct.
>>
>> In summary, I think the "if” condition should include something like:
>>
>> constant_mulitple_p (SUBREG_BYTE (trueop0),
>> GET_MODE_UNIT_BITSIZE (mode),
>> &subreg_offset)
>>
>
> Changed.
>
>> Thanks,
>> Richard
>
>
> Update patch.
>
> --
> BR,
> Hongtao
>
> From 8d154067963e453c337e6dc2c4f3f19bf0d6e11b Mon Sep 17 00:00:00 2001
> From: liuhongt <hongtao.liu@intel.com>
> Date: Tue, 13 Oct 2020 15:35:29 +0800
> Subject: [PATCH] Simplify vec_select of a subreg of X to just a vec_select of
> X.
>
> gcc/ChangeLog
> PR rtl-optimization/97249
> * simplify-rtx.c (simplify_binary_operation_1): Simplify
> vec_select of a subreg of X to a vec_select of X.
>
> gcc/testsuite/ChangeLog
>
> * gcc.target/i386/pr97249-1.c: New test.
> ---
> gcc/simplify-rtx.c | 44 +++++++++++++++++++++++
> gcc/testsuite/gcc.target/i386/pr97249-1.c | 30 ++++++++++++++++
> 2 files changed, 74 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/i386/pr97249-1.c
>
> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> index 869f0d11b2e..b1009837b2b 100644
> --- a/gcc/simplify-rtx.c
> +++ b/gcc/simplify-rtx.c
> @@ -4170,6 +4170,50 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
> return subop1;
> }
> }
> +
> + /* Simplify vec_select of a subreg of X to just a vec_select of X
> + when X has same component mode as vec_select. */
> + int l2;
> + unsigned HOST_WIDE_INT subreg_offset = 0;
> + if (GET_CODE (trueop0) == SUBREG
> + && GET_MODE_INNER (mode)
> + == GET_MODE_INNER (GET_MODE (SUBREG_REG (trueop0)))
> + && (GET_MODE_NUNITS (GET_MODE (trueop0))).is_constant (&l0)
Nothing really relies on this last line, and nothing uses l0, so better
to drop it.
> + && (GET_MODE_NUNITS (mode)).is_constant (&l1)
> + && (GET_MODE_NUNITS (GET_MODE (SUBREG_REG (trueop0))))
> + .is_constant (&l2)
> + && known_le (l1, l2)
I'm not sure the last two &&s are really the important condition.
I think we should drop them for the suggestion below.
> + && constant_multiple_p (SUBREG_BYTE (trueop0),
> + GET_MODE_UNIT_BITSIZE (mode),
> + &subreg_offset))
> + {
> +
Excess blank line.
> + gcc_assert (known_eq (XVECLEN (trueop1, 0), l1));
This can just use ==.
> + bool success = true;
> + for (int i = 0; i != l1; i++)
> + {
> + rtx idx = XVECEXP (trueop1, 0, i);
Excess space.
> + if (!CONST_INT_P (idx))
Here I think we should check:
|| maybe_ge (UINTVAL (idx) + subreg_offset, nunits))
where:
poly_uint64 nunits
= GET_MODE_NUNITS (GET_MODE (SUBREG_REG (trueop0)))).
This makes sure that all indices are in range. In particular, it's
valid for the SUBREG_REG to be narrower than mode, for appropriate
vec_select indices
Thanks,
Richard
More information about the Gcc-patches
mailing list