This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 0/4][Vectorizer] Reductions: replace VEC_RSHIFT_EXPR with VEC_PERM_EXPR
- From: Richard Biener <rguenther at suse dot de>
- To: Alan Lawrence <alan dot lawrence at arm dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, David Edelsohn <dje dot gcc at gmail dot com>, Catherine Moore <clm at codesourcery dot com>, Matthew Fortune <matthew dot fortune at imgtec dot com>
- Date: Thu, 13 Nov 2014 10:07:16 +0100 (CET)
- Subject: Re: [PATCH 0/4][Vectorizer] Reductions: replace VEC_RSHIFT_EXPR with VEC_PERM_EXPR
- References: <54639D83 dot 7090205 at arm dot com>
On Wed, 12 Nov 2014, Alan Lawrence wrote:
> In response to https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01803.html, this
> series removes the VEC_RSHIFT_EXPR, instead using a VEC_PERM_EXPR (with a
> second argument full of constant zeroes) to represent the shift.
>
> I've kept the use of vec_shr optab for platforms that define it, as even on
> platforms with a whole-vector-shift operation, this typically does not work as
> a vec-perm on arbitrary vectors (the shift will pull in zeroes from the end,
> whereas TARGET_VECTORIZE_VEC_PERM_CONST_OK and related machinery can only
> check for a shift-like permutation that will work for two arbitrary
> vectors).
That's reasonable for the moment though I expected to use
VEC_PERM <v4si, { 0, 0, 0, 0 }, { 4, 5, 0, 1 }>
for the shift - thus the shifted in vector elements should map
1:1 from the 2nd vector. This means that the target can
answer "yes" to vec_perm_const_ok (v4si, ...) with such
a permute if it can shift in zeros as it then can do
tem = shift-in-zeros
tem2 = vec2 & ~<mask to clear not wanted stuff>
perm_result = tem | tem2;
that is, simply OR in the wanted parts of the 2nd vector. Of
course the actual expansion can special-case a constant or
zero 2nd vector.
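
The shift / mask / OR expansion above can be modelled in plain C as a
rough sketch. It is not GCC code: the helper names (vec_perm,
shift_in_zeros, perm_via_shift, demo) and the scalar lane loops are
assumptions made for illustration, standing in for whatever whole-vector
shift and logical instructions a target actually has. It only checks
that, for a v4si selector such as { 4, 5, 0, 1 }, the three-step
sequence gives the same lanes as the generic permute.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N 4 /* v4si: four 32-bit lanes */

/* Reference semantics of VEC_PERM <v0, v1, sel>: lane i of the result is
   element sel[i] of the concatenation of v0 and v1.  */
void vec_perm(const int32_t *v0, const int32_t *v1,
              const unsigned *sel, int32_t *out)
{
    for (int i = 0; i < N; i++) {
        unsigned s = sel[i] % (2 * N);
        out[i] = s < N ? v0[s] : v1[s - N];
    }
}

/* Hypothetical whole-vector shift: moves lanes up by `count` positions
   and fills the vacated low lanes with zeros.  */
void shift_in_zeros(const int32_t *v, int count, int32_t *out)
{
    for (int i = 0; i < N; i++)
        out[i] = i >= count ? v[i - count] : 0;
}

/* The expansion from the mail: shift, mask the 2nd vector, OR.  */
void perm_via_shift(const int32_t *v0, const int32_t *v1,
                    int count, int32_t *out)
{
    int32_t tem[N];
    shift_in_zeros(v0, count, tem);            /* tem = shift-in-zeros */
    for (int i = 0; i < N; i++) {
        /* Mask that clears the lanes of vec2 we do not want.  */
        int32_t clear = i < count ? 0 : -1;
        int32_t tem2 = v1[i] & ~clear;         /* tem2 = vec2 & ~mask */
        out[i] = tem[i] | tem2;                /* perm_result = tem | tem2 */
    }
}

/* Compare both routes for the selector { 4, 5, 0, 1 } (count == 2).  */
void demo(void)
{
    const int32_t v0[N] = { 10, 20, 30, 40 };
    const int32_t v1[N] = { 1, 2, 3, 4 };      /* arbitrary, not just zeros */
    const unsigned sel[N] = { 4, 5, 0, 1 };
    int32_t expect[N], got[N];

    vec_perm(v0, v1, sel, expect);
    perm_via_shift(v0, v1, 2, got);
    assert(memcmp(expect, got, sizeof expect) == 0);
}
```

Calling demo() from a main function runs the comparison. With a zero
2nd vector the mask and OR steps contribute nothing, which is the
special case the expansion can shortcut.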
Usually targets provide a way of setting vector elements to
all ones or zero with their permute instructions as well.
Of course the above requires adjustments to all targets' vec_perm_const_ok
hooks and vec_perm_const expanders, so for now asking for vec_shr
is ok, but long-term it shouldn't be needed, even without changes
to the vec_perm_const_ok interface.
Thanks,
Richard.
> I've also changed from the endianness-dependent shift direction of the old
> VEC_RSHIFT_EXPR, to an endian-neutral direction (VEC_PERM_EXPR is inherently
> endian-neutral), changing the meaning of vec_shr_optab to match (as I did in
> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01475.html). As previously, this
> will break any *bigendian* platform defining vec_shr; I see MIPS and RS6000,
> but candidate fixes for both of these have already been posted:
>
> (for MIPS) https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01481.html, although
> I have not been able to test this as there doesn't seem to be any working
> MIPS/Loongson hardware in the Compile Farm;
>
> (for PowerPC) https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01480.html,
> testing in progress.
>
> ARM defines vec_shr only for little-endian; AArch64 does not yet, although in
> a previous patch series I both added a vec_shr and made it endian-neutral (I
> will post revised patches for both of these shortly).
>
> Bootstrapped and check-gcc on x86-none-linux-gnu and arm-none-linux-gnu;
> cross-tested on aarch64{,_be}-none-elf (FWIW, both with and without previous
> patches adding a vec_shr pattern)
>
> Ok for trunk if no regressions on PowerPC?
>
> Thanks, Alan