This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[patch] (4.1 stage 2 projects): vectorize reduction, part 2/n





This patch implements detection and vectorization of reduction.

We're recognizing the following computation:

loop:
      a_1 = phi <a_0, a_2>
s1:   x = ...
s2:   a_2 = op <x, a_1>
loop_exit:
      a_3 = phi <a_2>
s3:   use <a_3>
s4:   use <a_3>

as a reduction idiom,
and if there are no other uses of a_1 and a_2 in the loop,
and if it's ok to change the computation order,
we transform it into:

loop:
      va_1 = phi <va_0, va_2>
vs1:  vx = <initialization>
vs2:  va_2 = vop <vx, va_1>
loop_exit:
      a_3 = phi <a_2> (dead)
      va_3 = phi <va_2>
vs3:  va_4 = reduc_op <va_3>
vs4:  a_5 = bit_field_ref <va_4, 0>
s3:   use <a_5>
s4:   use <a_5>
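
For illustration only, here is a minimal C sketch of the same transformation for a plus reduction. It is not code from the patch. The names sum_scalar and sum_vectorized, the array model of a vector, and the vectorization factor of 4 are assumptions made for the example, and the trip count is assumed to be a multiple of VF.

```c
#define VF 4  /* assumed vectorization factor: 4 ints per vector */

/* Scalar form of the idiom: a_1 = phi <a_0, a_2>; a_2 = a_1 + x.  */
static int sum_scalar(const int *x, int n, int a0)
{
    int a = a0;
    for (int i = 0; i < n; i++)
        a = a + x[i];
    return a;
}

/* Vectorized form, with a vector modelled as an array of VF lanes.
   Assumes n is a multiple of VF (no peeling is shown).  */
static int sum_vectorized(const int *x, int n, int a0)
{
    int va[VF] = {0, 0, 0, 0};   /* va_0: neutral element of plus */

    /* Loop body: va_2 = vop <vx, va_1>, one partial sum per lane.  */
    for (int i = 0; i < n; i += VF)
        for (int lane = 0; lane < VF; lane++)
            va[lane] += x[i + lane];

    /* Epilog: va_4 = reduc_op <va_3>, then extract the scalar a_5.  */
    int a5 = 0;
    for (int lane = 0; lane < VF; lane++)
        a5 += va[lane];

    /* Fold the scalar initial value a_0 back in after the loop.  */
    return a5 + a0;
}
```

With x holding 1..16 and a0 = 10, both functions return 146. The per-lane partial sums are added in a different order than in the scalar loop, which is why the transformation is only valid when changing the computation order is acceptable.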


Left to do:

(1) In http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01407.html we drafted
additional ways to support the reduction epilog code. The plan is to
implement:
      if (support direct-reduction)
        generate option 2
      else if (support whole-vector-shift)
        generate option 3
      else
        generate option 4 (scalar computation)
This patch implements option 2. We also want to implement options 3 and 4.

(2) Support additional reduction operations. In this patch we introduce
support for plus, min and max, but we can also vectorize bitwise
operations, and multiplication. This can be trivially done by adding the
relevant new optabs and tree-codes, and the required bits in
'get_initial_def_for_reduction' and 'reduction_code_for_scalar_code'.

(3) Support additional reduction patterns, e.g. idioms like dot product.
These often involve type-casting, but if the pattern is recognized as a
whole, data packing/unpacking can be avoided.

(4) Based on this infrastructure, we can also add support for vectorization
of induction (e.g. 'for(i=0;i<N;i++){a[i]=i;}').

(5) Based on this infrastructure, we can also add support for vectorization
in the presence of values that are used outside the loop (currently this is
supported only for reduction).

(6) There are several cost considerations that can guide how to initialize
the reduction (see discussion here:
http://gcc.gnu.org/ml/gcc-patches/2005-03/msg00692.html). This patch just
takes the 'adjust-in-epilog' approach, because I wasn't sure how to take
these considerations into account. We'll want to get back to this one
sometime.
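
To make items (1) and (6) more concrete, here is a rough C sketch of two of the epilog strategies for a plus reduction, plus the 'adjust-in-epilog' step. It is a model under stated assumptions, not the patch's code. The vec4i type and the helpers vec_shift_down, vec_add, reduce_plus_by_shifts, reduce_plus_scalar and finish_reduction are hypothetical names. The shift-based variant is one reading of what a whole-vector-shift epilog could look like, and the direct-reduction case is not modelled because on a real target it would be a single reduc_plus instruction.

```c
#define VF 4  /* assumed number of lanes */

/* Hypothetical model of a 4 x int vector register.  */
typedef struct { int lane[VF]; } vec4i;

/* Whole-vector shift: move lanes toward lane 0 by `count`, zero-filling.  */
static vec4i vec_shift_down(vec4i v, int count)
{
    vec4i r;
    for (int i = 0; i < VF; i++)
        r.lane[i] = (i + count < VF) ? v.lane[i + count] : 0;
    return r;
}

/* Lane-wise addition.  */
static vec4i vec_add(vec4i a, vec4i b)
{
    vec4i r;
    for (int i = 0; i < VF; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Shift-based epilog: log2(VF) shift-and-add steps leave the total
   in lane 0, which is then extracted.  */
static int reduce_plus_by_shifts(vec4i v)
{
    for (int count = VF / 2; count >= 1; count /= 2)
        v = vec_add(v, vec_shift_down(v, count));
    return v.lane[0];
}

/* Scalar epilog: extract every lane and combine in scalar code.  */
static int reduce_plus_scalar(vec4i v)
{
    int s = 0;
    for (int i = 0; i < VF; i++)
        s += v.lane[i];
    return s;
}

/* Pick an epilog by target capability, then apply the scalar initial
   value a0 after the loop (the 'adjust-in-epilog' approach, assuming
   the vector accumulator started from the neutral element).  */
static int finish_reduction(vec4i va, int a0, int have_whole_vector_shift)
{
    int partial = have_whole_vector_shift ? reduce_plus_by_shifts(va)
                                          : reduce_plus_scalar(va);
    return partial + a0;
}
```

For va = {1, 2, 3, 4} and a0 = 5, both paths return 15. The alternative to adjusting in the epilog would be to place a0 in one lane of the initial vector, which trades an extra scalar operation after the loop for a more expensive vector initialization before it.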

The next patches work down the above list (probably not all of it for 4.1).
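
As an illustration of item (4) in the list above, a vectorized induction could conceptually look like the following C sketch. This is only a model: the function name fill_iota_vectorized, the lane count of 4, and the scalar tail loop are assumptions for the example and do not describe how the vectorizer would actually emit the code.

```c
#define VF 4  /* assumed number of lanes */

/* Models vectorization of: for (i = 0; i < n; i++) a[i] = i;  */
static void fill_iota_vectorized(int *a, int n)
{
    int vi[VF] = {0, 1, 2, 3};   /* initial induction vector */
    int i = 0;

    for (; i + VF <= n; i += VF) {
        /* Vector store of the current induction values.  */
        for (int lane = 0; lane < VF; lane++)
            a[i + lane] = vi[lane];
        /* Vector step: add {VF, VF, VF, VF}.  */
        for (int lane = 0; lane < VF; lane++)
            vi[lane] += VF;
    }

    /* Scalar tail for the remaining n % VF iterations.  */
    for (; i < n; i++)
        a[i] = i;
}
```

Like a reduction, the induction variable is carried around the loop through a phi, which is why the same phi-handling infrastructure could be reused.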

Bootstrapped and tested on powerpc-darwin and i686-pc-linux-gnu.
OK for mainline?

thanks,

dorit


ChangeLog:

        * tree.def (REDUC_MAX_EXPR, REDUC_MIN_EXPR, REDUC_PLUS_EXPR): New
        tree-codes.
        * optabs.h (OTI_reduc_smax, OTI_reduc_umax, OTI_reduc_smin,
        OTI_reduc_umin, OTI_reduc_plus): New optabs for reduction.
        (reduc_smax_optab, reduc_umax_optab, reduc_smin_optab,
        reduc_umin_optab, reduc_plus_optab): New optabs for reduction.
        * expr.c (expand_expr_real_1): Handle new tree-codes.
        * tree-inline.c (estimate_num_insns_1): Handle new tree-codes.
        * tree-pretty-print.c (dump_generic_node, op_prio, op_symbol):
        Handle new tree-codes.
        * optabs.c (optab_for_tree_code): Handle new tree-codes.
        (init_optabs): Initialize new optabs.
        * genopinit.c (optabs): Define handlers for new optabs.

        * tree-vect-analyze.c (vect_analyze_operations): Fail vectorization
        in case of a phi that is marked as relevant. Call
        vectorizable_reduction.
        (vect_mark_relevant): Phis may be marked as relevant.
        (vect_mark_stmts_to_be_vectorized): The use corresponding to the
        reduction variable in a reduction stmt does not mark its defining
        phi as relevant. Update documentation accordingly.
        (vect_can_advance_ivs_p): Skip reduction phis.
        * tree-vect-transform.c (vect_get_vec_def_for_operand): Takes
        additional argument. Handle reduction.
        (vect_create_destination_var): Update call to
        vect_get_new_vect_var. Handle non-vector argument.
        (get_initial_def_for_reduction): New function.
        (vect_create_epilog_for_reduction): New function.
        (vectorizable_reduction): New function.
        (vect_get_new_vect_var): Handle new vect_var_kind.
        (vectorizable_assignment, vectorizable_operation,
        vectorizable_store, vectorizable_condition): Update call to
        vect_get_new_vect_var.
        (vect_transform_stmt): Call vectorizable_reduction.
        (vect_update_ivs_after_vectorizer): Skip reduction phis.
        (vect_transform_loop): Skip if stmt is both not relevant and not
        live.
        * tree-vectorizer.c (reduction_code_for_scalar_code): New function.
        (vect_is_simple_reduction): Was empty - added implementation.
        * tree-vectorizer.h (vect_scalar_var): New enum vect_var_kind
        value.
        (reduc_vec_info_type): New enum vect_def_type value.
        * config/rs6000/altivec.md (reduc_smax_v4si, reduc_smax_v4sf,
        reduc_umax_v4si, reduc_smin_v4si, reduc_umin_v4sf, reduc_smin_v4sf,
        reduc_plus_v4si, reduc_plus_v4sf): New define_expands.

        * tree-vect-analyze.c (vect_determine_vectorization_factor): Remove
        ENABLE_CHECKING around gcc_assert.
        * tree-vect-transform.c (vect_do_peeling_for_loop_bound,
        vect_do_peeling_for_alignment, vect_transform_loop,
        vect_get_vec_def_for_operand): Likewise.

testsuite/ChangeLog:

        * lib/target-supports.exp (check_effective_target_vect_reduction):
        New.
        * gcc.dg/vect/vect-reduc-1.c: Now vectorizable for vect_reduction
        targets.
        * gcc.dg/vect/vect-reduc-2.c: Likewise.
        * gcc.dg/vect/vect-reduc-3.c: Likewise.

Patch:


(See attached file: mainline.june11.diff)


