This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH 0/14+2][Vectorizer] Made reductions endianness-neutral, fixes PR/61114
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Alan Lawrence <alan dot lawrence at arm dot com>
- Cc: Marcus Shawcroft <Marcus dot Shawcroft at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 7 Oct 2014 09:45:41 +0200
- Subject: Re: [PATCH 0/14+2][Vectorizer] Made reductions endianness-neutral, fixes PR/61114
- Authentication-results: sourceware.org; auth=none
- References: <541AC4D2 dot 9040901 at arm dot com> <CAFiYyc0wF_ep0sdTiqgn8KV5Jxzfg4BxmkXJPN7wKJ91Qotakw at mail dot gmail dot com> <5432D1A5 dot 6080208 at arm dot com>
On Mon, Oct 6, 2014 at 7:30 PM, Alan Lawrence <firstname.lastname@example.org> wrote:
> Ok, so unless there are objections, I plan to commit patches 1, 2, 4, 5, and
> which have been previously approved, in that sequence. (Of those, all bar
> 2 are AArch64 only.) I think this is better than maintaining an
> patch series.
> Then I'll get to work on migrating all backends to the new _scal_ optab (and
> removing the vector optab). Certainly I'd like to replace vec_shr/l with
> vec_perm_expr too, but I'm conscious that the end of stage 1 is approaching!
I suppose we all are. It will last until end of October at least
(stage1 of gcc 4.9
ended Nov 22th, certainly a bit late).
I do expect we will continue merging already developed / posted stuff through
stage3 (as usual).
That said, it would be really nice to get rid of VEC_RSHIFT_EXPR.
> Richard Biener wrote:
>> On Thu, Sep 18, 2014 at 1:41 PM, Alan Lawrence <email@example.com>
>>> The end goal here is to remove this code from tree-vect-loop.c
>>> if (BYTES_BIG_ENDIAN)
>>> bitpos = size_binop (MULT_EXPR,
>>> bitsize_int (TYPE_VECTOR_SUBPARTS (vectype)
>>> TYPE_SIZE (scalar_type));
>>> as this is the root cause of PR/61114 (see testcase there, failing on all
>>> bigendian targets supporting reduc_[us]plus_optab). Quoting Richard
>>> "all code conditional on BYTES/WORDS_BIG_ENDIAN in tree-vect* is
>>> suspicious". The code snippet above is used on two paths:
>>> (Path 1) (patches 1-6) Reductions using REDUC_(PLUS|MIN|MAX)_EXPR =
>>> The optab is documented as "the scalar result is stored in the least
>>> significant bits of operand 0", but the tree code as "the first element
>>> the vector holding the result of the reduction of all elements of the
>>> operand". This mismatch means that when the tree code is folded, the code
>>> snippet above reads the result from the wrong end of the vector.
>>> The strategy (as per
>>> https://gcc.gnu.org/ml/gcc-patches/2014-08/msg00041.html) is to define
>>> tree codes and optabs that produce scalar results directly; this seems
>>> better than tying (the element of the vector into which the result is
>>> placed) to (the endianness of the target), and avoids generating extra
>>> on current bigendian targets. However, the previous optabs are retained
>>> now as a migration strategy so as not to break existing backends; moving
>>> individual platforms over will follow.
>>> A complication here is on AArch64, where we directly generate
>>> REDUC_PLUS_EXPRs from intrinsics in gimple_fold_builtin; I temporarily
>>> remove this folding in order to decouple the midend and AArch64 backend.
>> Sounds fine. I hope we can transition all backends for 5.0 and remove
>> the vector variant optabs (maybe renaming the scalar ones).
>>> (Path 2) (patches 7-13) Reductions using whole-vector-shifts, i.e.
>>> VEC_RSHIFT_EXPR and vec_shr_optab. Here the tree code as well as the
>>> is defined in an endianness-dependent way, leading to significant
>>> complication in fold-const.c. (Moreover, the "equivalent" vec_shl_optab
>>> never used!). Few platforms appear to handle vec_shr_optab (and fewer
>>> bigendian - I see only PowerPC and MIPS), so it seems pertinent to change
>>> the existing optab to be endianness-neutral.
>>> Patch 10 defines vec_shr for AArch64, for the old specification; patch 13
>>> updates that implementation to fit the new endianness-neutral
>>> serving as a guide for other existing backends. Patches/RFCs 15 and 16
>>> equivalents for MIPS and PowerPC; I haven't tested these but hope they
>>> as useful pointers for the port maintainers.
>>> Finally patch 14 cleans up the affected part of tree-vect-loop.c
>> As said during the individual patches review I'd like the vectorizer to
>> use a VEC_PERM_EXPR instead of VEC_RSHIFT_EXPR (with
>> only whole-element amounts). This means we can remove
>> VEC_RSHIFT_EXPR. It also means that if the backend defines
>> vec_perm_const (which it really should) it can handle the special
>> permutes that boil down to a possibly more efficient vector shift
>> there (a good optimization anyway). Until it does that all backends
>> would at least create correct code (with the endian dependent
>> vec_shr removed).
> -- IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended
> recipient, please notify the sender immediately and do not disclose the
> contents to any other person, use it for any purpose, or store or copy the
> information in any medium. Thank you.
> ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
> Registered in England & Wales, Company No: 2557590
> ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
> Registered in England & Wales, Company No: 2548782