[PATCH 2/14][Vectorizer] Make REDUC_xxx_EXPR tree codes produce a scalar result

Richard Biener richard.guenther@gmail.com
Mon Sep 22 10:34:00 GMT 2014

On Thu, Sep 18, 2014 at 1:50 PM, Alan Lawrence <alan.lawrence@arm.com> wrote:
> This fixes PR/61114 by redefining the REDUC_{MIN,MAX,PLUS}_EXPR tree codes.
> These are presently documented as producing a vector with the result in
> element 0, and this is inconsistent with their use in tree-vect-loop.c
> (which on bigendian targets pulls the bits out of the wrong end of the
> vector result). This leads to bugs on bigendian targets - see also
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61114.
> I discounted "fixing" the vectorizer (to read from element 0) and then
> making bigendian targets (whose architectural insn produces the result in
> lane N-1) permute the result vector, as optimization of vectors in RTL seems
> unlikely to remove such a permute and would lead to a performance
> regression.
> Instead it seems more natural for the tree code to produce a scalar result
> (producing a vector with the result in lane 0 has already caused confusion,
> e.g. https://gcc.gnu.org/ml/gcc-patches/2012-10/msg01100.html).
> However, this patch preserves the meaning of the optab (producing a result
> in lane 0 on little-endian architectures or N-1 on bigendian), thus
> generally avoiding the need to change backends. Thus, expr.c extracts an
> endianness-dependent element from the optab result to give the result
> expected for the tree code.
> Previously posted as an RFC
> https://gcc.gnu.org/ml/gcc-patches/2014-08/msg00041.html , now with an extra
> VIEW_CONVERT_EXPR if the types of the reduction/result do not match.

Huh.  Does that ever happen?  Please use a NOP_EXPR instead of a
VIEW_CONVERT_EXPR.

Ok with that change.


> Testing:
>         x86_64-none-linux-gnu: bootstrap, check-gcc, check-g++
>         aarch64-none-linux-gnu: bootstrap
>         aarch64-none-elf:  check-gcc, check-g++
>         arm-none-eabi: check-gcc
>         aarch64_be-none-elf: check-gcc, showing
>         FAIL->PASS: gcc.dg/vect/no-scevccp-outer-7.c execution test
>         FAIL->PASS: gcc.dg/vect/no-scevccp-outer-13.c execution test
>         Passes the (previously-failing) reduced testcase on
>                 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61114
>         Have also assembler/stage-1 tested that testcase on PowerPC, also
>         fixed.

> gcc/ChangeLog:
>         * expr.c (expand_expr_real_2): For REDUC_{MIN,MAX,PLUS}_EXPR, add
>         extract_bit_field around optab result.
>         * fold-const.c (fold_unary_loc): For REDUC_{MIN,MAX,PLUS}_EXPR,
>         produce scalar not vector.
>         * tree-cfg.c (verify_gimple_assign_unary): Check result vs operand
>         type for REDUC_{MIN,MAX,PLUS}_EXPR.
>         * tree-vect-loop.c (vect_analyze_loop): Update comment.
>         (vect_create_epilog_for_reduction): For direct vector reduction, use
>         result of tree code directly without extract_bit_field.
>         * tree.def (REDUC_MAX_EXPR, REDUC_MIN_EXPR, REDUC_PLUS_EXPR): Update
>         comment.
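
Illustration (not part of the patch): the quoted description says the optab
keeps its meaning (result in lane 0 on little-endian, lane N-1 on
big-endian) while the tree code now yields a scalar, with expr.c extracting
the endianness-dependent element. The sketch below is a standalone C model
of that split, not GCC code. The names NLANES, result_lane,
model_reduc_plus_optab and model_reduc_plus_expr are hypothetical, and
zeroing the unused lanes is an assumption made only to keep the model
deterministic.

```c
#include <assert.h>
#include <stddef.h>

#define NLANES 4  /* hypothetical vector width for this model */

/* Lane in which the modelled optab leaves its result:
   lane 0 on little-endian, lane NLANES-1 on big-endian.  */
static size_t result_lane(int big_endian)
{
  return big_endian ? NLANES - 1 : 0;
}

/* Model of the optab: sum all lanes and write the sum into the
   endianness-dependent lane.  The other lanes are zeroed here
   (an assumption of this sketch, not something the patch states).  */
static void model_reduc_plus_optab(const int in[NLANES], int out[NLANES],
                                   int big_endian)
{
  assert(in != NULL && out != NULL);
  int sum = 0;
  for (size_t i = 0; i < NLANES; i++) {
    sum += in[i];
    out[i] = 0;
  }
  out[result_lane(big_endian)] = sum;
}

/* Model of the redefined tree code: run the optab, then extract the
   endianness-dependent lane so that callers only ever see a scalar.  */
static int model_reduc_plus_expr(const int in[NLANES], int big_endian)
{
  int tmp[NLANES];
  model_reduc_plus_optab(in, tmp, big_endian);
  return tmp[result_lane(big_endian)];
}
```

With input lanes {1, 2, 3, 4}, model_reduc_plus_expr returns 10 for either
endianness setting, whereas a caller that always read lane 0 of the optab
result would see 0 in the big-endian case of this model.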
