This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[patch] SLP vectorization of reductions


Hi,

Currently, loop-aware SLP starts only from groups of adjacent (strided) stores.
This patch adds the ability to start SLP from a group of reductions. For
example, the loop:

sum0 = 0;
sum1 = 2;
for (i = 0; i < n; i++)
  {
    sum0 += a[2*i];
    sum1 += a[2*i+1];
  }

will now be vectorized using SLP
(assuming four-element vectors and an even n):

vsum  = {0,2,0,0}
for (i = 0; i < 2*n; i+=4)
  vsum += {a[i], a[i+1], a[i+2], a[i+3]};

sum0 = vsum[0] + vsum[2];
sum1 = vsum[1] + vsum[3];
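
As a rough illustration of the transformation described above, here is a plain C model that emulates the vector accumulator with a four-element array. It is only a sketch, not the code GCC generates: the function names reduce_scalar and reduce_slp_model are made up for this example, and it assumes four elements per vector and an even n (a real vectorized loop would need an epilogue for leftover iterations).

```c
#include <assert.h>

#define LANES 4 /* assumed number of elements per vector */

/* Scalar reference: the original loop with its two interleaved reductions. */
void reduce_scalar(const int *a, int n, int *sum0, int *sum1)
{
    int s0 = 0, s1 = 2;
    for (int i = 0; i < n; i++) {
        s0 += a[2 * i];
        s1 += a[2 * i + 1];
    }
    *sum0 = s0;
    *sum1 = s1;
}

/* Model of the SLP form: one "vector" accumulator holds both reductions. */
void reduce_slp_model(const int *a, int n, int *sum0, int *sum1)
{
    /* Sketch-only precondition: no epilogue loop is modelled here. */
    assert(n % 2 == 0);

    /* Lanes 0 and 1 carry the initial values of sum0 and sum1; lanes 2
       and 3 start at 0, the neutral element for addition. */
    int vsum[LANES] = {0, 2, 0, 0};

    /* Each step consumes LANES consecutive elements, i.e. two iterations
       of the scalar loop. The inner loop stands in for one vector add. */
    for (int i = 0; i < 2 * n; i += LANES)
        for (int lane = 0; lane < LANES; lane++)
            vsum[lane] += a[i + lane];

    /* Epilogue: even lanes belong to sum0, odd lanes to sum1. */
    *sum0 = vsum[0] + vsum[2];
    *sum1 = vsum[1] + vsum[3];
}
```

Calling both functions on the same array, for instance a[i] = i + 1 for 16 elements with n = 8, should give identical results for sum0 and sum1.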


This patch fixes PR 37027.


Bootstrapped on x86_64-suse-linux, tested on x86_64-suse-linux and
powerpc64-suse-linux.
Committed.

Ira

ChangeLog:

      PR tree-optimization/37027
      * tree-vectorizer.h (struct _loop_vec_info): Add new field reductions
      and macro to access it.
      (vectorizable_reduction): Add argument.
      (vect_get_slp_defs): Likewise.
      * tree-vect-loop.c (vect_analyze_scalar_cycles_1): Collect reduction
      statements for possible use in SLP.
      (new_loop_vec_info): Initialize LOOP_VINFO_REDUCTIONS.
      (destroy_loop_vec_info): Free LOOP_VINFO_REDUCTIONS.
      (vect_create_epilog_for_reduction): Handle SLP. Modify documentation,
      add new argument.
      (vectorizable_reduction): Likewise.
      * tree-vect-stmts.c (vect_get_vec_defs): Update call to
      vect_get_slp_defs.
      (vectorizable_type_demotion, vectorizable_type_promotion,
      vectorizable_store): Likewise.
      (vect_analyze_stmt): Update call to vectorizable_reduction.
      (vect_transform_stmt): Likewise.
      * tree-vect-slp.c (vect_get_and_check_slp_defs): Handle reduction.
      (vect_build_slp_tree): Fix indentation. Check that there are no loads
      from different interleaving chains in same node.
      (vect_slp_rearrange_stmts): New function.
      (vect_supported_load_permutation_p): Allow load permutations for
      reductions. Call vect_slp_rearrange_stmts() to rearrange statements
      inside SLP nodes if necessary.
      (vect_analyze_slp_instance): Handle reductions.
      (vect_analyze_slp): Try to build SLP instances originating from groups
      of reductions.
      (vect_detect_hybrid_slp_stmts): Skip reduction statements.
      (vect_get_constant_vectors): Create initial vectors for reductions
      according to reduction code. Add new argument.
      (vect_get_slp_defs): Add new argument, pass it to
      vect_get_constant_vectors.
      (vect_schedule_slp_instance): Remove SLP tree root statements.


testsuite/ChangeLog:

      PR tree-optimization/37027
      * lib/target-supports.exp
      (check_effective_target_vect_widen_sum_hi_to_si_pattern): New.
      * gcc.dg/vect/pr37027.c: New test.
      * gcc.dg/vect/slp-reduc-1.c, gcc.dg/vect/slp-reduc-2.c,
      gcc.dg/vect/slp-reduc-3.c, gcc.dg/vect/slp-reduc-4.c,
      gcc.dg/vect/slp-reduc-5.c, gcc.dg/vect/slp-reduc-6.c,
      gcc.dg/vect/vect-complex-6.c: Likewise.


(See attached file: slp-reduc.txt)


