This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Hi,

Currently, loop-aware SLP starts only from groups of adjacent (strided)
stores. This patch adds the ability to start SLP from a group of
reductions. For example:

  sum0 = 0;
  sum1 = 2;
  for (i = 0; i < n; i++)
    {
      sum0 += a[2*i];
      sum1 += a[2*i+1];
    }

will now be vectorized using SLP (assuming four elements per vector):

  vsum = {0,2,0,0}
  for (i = 0; i < n; i+=2)
    vsum += {a[2*i], a[2*i+1], a[2*i+2], a[2*i+3]};
  sum0 = vsum[0] + vsum[2];
  sum1 = vsum[1] + vsum[3];

This patch fixes PR 37027.

Bootstrapped on x86_64-suse-linux, tested on x86_64-suse-linux and
powerpc64-suse-linux. Committed.

Ira

ChangeLog:

	PR tree-optimization/37027
	* tree-vectorizer.h (struct _loop_vec_info): Add new field reductions
	and macro to access it.
	(vectorizable_reduction): Add argument.
	(vect_get_slp_defs): Likewise.
	* tree-vect-loop.c (vect_analyze_scalar_cycles_1): Collect reduction
	statements for possible use in SLP.
	(new_loop_vec_info): Initialize LOOP_VINFO_REDUCTIONS.
	(destroy_loop_vec_info): Free LOOP_VINFO_REDUCTIONS.
	(vect_create_epilog_for_reduction): Handle SLP. Modify documentation,
	add new argument.
	(vectorizable_reduction): Likewise.
	* tree-vect-stmts.c (vect_get_vec_defs): Update call to
	vect_get_slp_defs.
	(vectorizable_type_demotion, vectorizable_type_promotion,
	vectorizable_store): Likewise.
	(vect_analyze_stmt): Update call to vectorizable_reduction.
	(vect_transform_stmt): Likewise.
	* tree-vect-slp.c (vect_get_and_check_slp_defs): Handle reduction.
	(vect_build_slp_tree): Fix indentation. Check that there are no loads
	from different interleaving chains in same node.
	(vect_slp_rearrange_stmts): New function.
	(vect_supported_load_permutation_p): Allow load permutations for
	reductions. Call vect_slp_rearrange_stmts() to rearrange statements
	inside SLP nodes if necessary.
	(vect_analyze_slp_instance): Handle reductions.
	(vect_analyze_slp): Try to build SLP instances originating from groups
	of reductions.
	(vect_detect_hybrid_slp_stmts): Skip reduction statements.
	(vect_get_constant_vectors): Create initial vectors for reductions
	according to reduction code. Add new argument.
	(vect_get_slp_defs): Add new argument, pass it to
	vect_get_constant_vectors.
	(vect_schedule_slp_instance): Remove SLP tree root statements.

testsuite/ChangeLog:

	PR tree-optimization/37027
	* lib/target-supports.exp
	(check_effective_target_vect_widen_sum_hi_to_si_pattern): New.
	* gcc.dg/vect/pr37027.c: New test.
	* gcc.dg/vect/slp-reduc-1.c, gcc.dg/vect/slp-reduc-2.c,
	gcc.dg/vect/slp-reduc-3.c, gcc.dg/vect/slp-reduc-4.c,
	gcc.dg/vect/slp-reduc-5.c, gcc.dg/vect/slp-reduc-6.c,
	gcc.dg/vect/vect-complex-6.c: Likewise.

(See attached file: slp-reduc.txt)
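
For readers who want to check the transformation by hand, here is a minimal sketch in plain C of the two-reduction example from the message. It is not code from the patch and not what the vectorizer emits: the function names reduce_scalar, reduce_slp_emulated and check_equivalence are hypothetical, the 4-lane vector is emulated with an int array, and the scalar tail loop for odd n is an assumption added so the sketch works for any n.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar form from the message: two interleaved sum reductions
   with initial values 0 and 2.  */
static void
reduce_scalar (const int *a, size_t n, int *sum0, int *sum1)
{
  int s0 = 0, s1 = 2;
  for (size_t i = 0; i < n; i++)
    {
      s0 += a[2 * i];
      s1 += a[2 * i + 1];
    }
  *sum0 = s0;
  *sum1 = s1;
}

/* Emulation of the SLP form with a 4-lane accumulator (hypothetical
   helper, not part of the patch).  Lanes 0 and 2 accumulate sum0,
   lanes 1 and 3 accumulate sum1.  */
static void
reduce_slp_emulated (const int *a, size_t n, int *sum0, int *sum1)
{
  /* Initial vector: the scalar initial values in the first two lanes,
     the neutral element of addition (0) in the rest.  */
  int vsum[4] = { 0, 2, 0, 0 };
  size_t i = 0;

  /* Each "vector" iteration covers two scalar iterations, that is,
     four adjacent elements of a.  */
  for (; i + 2 <= n; i += 2)
    for (int lane = 0; lane < 4; lane++)
      vsum[lane] += a[2 * i + lane];

  /* Epilogue: combine the lanes that belong to the same reduction.  */
  int s0 = vsum[0] + vsum[2];
  int s1 = vsum[1] + vsum[3];

  /* Scalar tail for a leftover iteration when n is odd (assumption of
     this sketch; the message does not discuss it).  */
  for (; i < n; i++)
    {
      s0 += a[2 * i];
      s1 += a[2 * i + 1];
    }

  *sum0 = s0;
  *sum1 = s1;
}

/* Assert that both forms agree on the first 2*n elements of a.  */
static void
check_equivalence (const int *a, size_t n)
{
  int s0, s1, v0, v1;
  reduce_scalar (a, n, &s0, &s1);
  reduce_slp_emulated (a, n, &v0, &v1);
  assert (s0 == v0 && s1 == v1);
}
```

Calling check_equivalence on an array of at least 2*n ints compares the two forms. The lane layout mirrors the initial vector {0,2,0,0} and the final lane sums shown in the message.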
Attachment:
slp-reduc.txt
Description: Text document