This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 12/14][Vectorizer] Redefine VEC_RSHIFT_EXPR and vec_shr_optab as endianness-neutral
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Alan Lawrence <alan dot lawrence at arm dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, David Edelsohn <dje dot gcc at gmail dot com>, Aldy Hernandez <aldyh at redhad dot com>, Steve Ellcey <sellcey at mips dot com>, Eric Christopher <echristo at gmail dot com>
- Date: Mon, 22 Sep 2014 12:58:05 +0200
- Subject: Re: [PATCH 12/14][Vectorizer] Redefine VEC_RSHIFT_EXPR and vec_shr_optab as endianness-neutral
- References: <541AC4D2 dot 9040901 at arm dot com> <541AD353 dot 6020405 at arm dot com>
On Thu, Sep 18, 2014 at 2:42 PM, Alan Lawrence <alan.lawrence@arm.com> wrote:
> The direction of VEC_RSHIFT_EXPR has been endian-dependent, contrary to the
> general principles of tree. This patch updates fold-const and the vectorizer
> (the only place where such expressions are created), such that
> VEC_RSHIFT_EXPR always shifts towards element 0.
>
> The tree code still maps directly onto the vec_shr_optab, and so this patch
> *will break any bigendian platform defining the vec_shr optab*.
> --> For AArch64_be, patch follows next in series;
> --> For PowerPC, I think patch/rfc 15 should fix, please inspect;
> --> For MIPS, I think patch/rfc 16 should fix, please inspect.
>
> gcc/ChangeLog:
>
> * fold-const.c (const_binop): VEC_RSHIFT_EXPR always shifts towards
> element 0.
>
> * tree-vect-loop.c (vect_create_epilog_for_reduction): always extract
> the result of a reduction with vector shifts from element 0.
>
> * tree.def (VEC_RSHIFT_EXPR, VEC_LSHIFT_EXPR): Comment shift direction.
>
> * doc/md.texi (vec_shr_m, vec_shl_m): Document shift direction.
>
> Testing Done:
>
> Bootstrap and check-gcc on x86_64-none-linux-gnu; check-gcc on
> aarch64-none-elf.
As said elsewhere, I'd like the vectorizer to use VEC_PERM_EXPRs and
have the generic vec_perm expansion machinery handle the case where
the permute can be expressed using the vec_shr_optab.
You'd have, for a 1-element shift of V4SI x,
VEC_PERM <x, { 0, 0, 0, 0 }, { 4, 3, 2, 1 }>.
I'd say that if the target says it can handle the constant permute just fine,
then use the vec_perm_const expansion path.
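
To make the permute idea concrete, here is a rough standalone C model, not GCC code. It assumes the usual VEC_PERM_EXPR <v0, v1, sel> semantics, where selector indices 0..3 pick from v0 and 4..7 pick from v1. All function names are hypothetical. The selector { 1, 2, 3, 4 } is one reading of "shift by one element towards element 0" under that indexing; it is an assumption of this sketch and is not the ordering written above.

```c
#include <assert.h>
#include <stddef.h>

#define NELTS 4 /* V4SI: four int elements */

/* Model of VEC_PERM_EXPR <v0, v1, sel>: element i of the result is
   element sel[i] of the concatenation of v0 and v1.
   OUT must not alias V0 or V1.  */
static void
vec_perm_v4si (const int v0[NELTS], const int v1[NELTS],
               const unsigned sel[NELTS], int out[NELTS])
{
  for (size_t i = 0; i < NELTS; i++)
    {
      unsigned idx = sel[i];
      assert (idx < 2 * NELTS);
      out[i] = idx < NELTS ? v0[idx] : v1[idx - NELTS];
    }
}

/* Hypothetical helper: shift X by one element towards element 0,
   filling with zero, written as a permute with a zero vector.
   The selector is an assumption of this sketch.  */
static void
shift_towards_elt0 (const int x[NELTS], int out[NELTS])
{
  static const int zero[NELTS] = { 0, 0, 0, 0 };
  static const unsigned sel[NELTS] = { 1, 2, 3, 4 };
  vec_perm_v4si (x, zero, sel, out);
}

/* Hypothetical check a generic expander might do: given that the second
   permute operand is all zeros, return the shift amount K (1..NELTS-1)
   if SEL is a whole-vector shift towards element 0, else 0.  */
static unsigned
perm_shift_amount (const unsigned sel[NELTS])
{
  unsigned k = sel[0];
  if (k == 0 || k >= NELTS)
    return 0;
  for (unsigned i = 1; i < NELTS; i++)
    {
      /* Lanes still inside X must read x[i + k]; lanes past the end
         may read any element of the zero vector.  */
      if (i + k < NELTS ? sel[i] != i + k : sel[i] < NELTS)
        return 0;
    }
  return k;
}
```

With x = { 10, 20, 30, 40 }, shift_towards_elt0 yields { 20, 30, 40, 0 }, and perm_shift_amount reports 1 for that selector, which is the case a target's vec_shr pattern could cover.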
Richard.