This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Make BB vectorizer work on sub-BBs
- From: Christophe Lyon <christophe dot lyon at linaro dot org>
- To: Richard Biener <rguenther at suse dot de>
- Cc: Kyrill Tkachov <kyrylo dot tkachov at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 10 Nov 2015 16:20:15 +0100
- Subject: Re: [PATCH] Make BB vectorizer work on sub-BBs
- Authentication-results: sourceware.org; auth=none
- References: <alpine dot LSU dot 2 dot 11 dot 1511061025130 dot 10078 at zhemvz dot fhfr dot qr> <alpine dot LSU dot 2 dot 11 dot 1511061204350 dot 10078 at zhemvz dot fhfr dot qr> <563C8AF3 dot 6000008 at arm dot com> <CAKdteOZ+wxwMNwoipecaGisT+gDsb3w_W4FnRzdjCQwW1ka=wA at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1511101402100 dot 10078 at zhemvz dot fhfr dot qr>
On 10 November 2015 at 14:02, Richard Biener <rguenther@suse.de> wrote:
> On Tue, 10 Nov 2015, Christophe Lyon wrote:
>
>> On 6 November 2015 at 12:11, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>> > Hi Richard,
>> >
>> >
>> > On 06/11/15 11:09, Richard Biener wrote:
>> >>
>> >> On Fri, 6 Nov 2015, Richard Biener wrote:
>> >>
>> >>> The following patch makes the BB vectorizer not only handle BB heads
>> >>> (until the first stmt with a data reference it cannot handle) but
>> >>> arbitrary regions in a BB separated by such stmts.
>> >>>
>> >>> This improves the number of BB vectorizations from 469 to 556
>> >>> in a quick test on SPEC CPU 2006 with -Ofast on x86_64 and
>> >>> 1x400.perlbench 1x410.bwaves 1x416.gamess 1x450.soplex 1x453.povray
>> >>> 1x481.wrf failing both patched and unpatched (have to update my
>> >>> config used for such experiments it seems ...)
>> >>>
>> >>> Bootstrapped and tested on x86_64-unknown-linux-gnu, aarch64 cross built.
>> >>>
>> >>> I'm currently re-testing for a cosmetic change I made when writing
>> >>> the changelog.
>> >>>
>> >>> I expected (and there are) some issues with compile-time. Left
>> >>> is unpatched and right is patched.
>> >>>
>> >>> '403.gcc': 00:00:54 (54) | '403.gcc': 00:00:55 (55)
>> >>> '483.xalancbmk': 00:02:20 (140) | '483.xalancbmk': 00:02:24 (144)
>> >>> '416.gamess': 00:02:36 (156) | '416.gamess': 00:02:37 (157)
>> >>> '435.gromacs': 00:00:18 (18) | '435.gromacs': 00:00:19 (19)
>> >>> '447.dealII': 00:01:31 (91) | '447.dealII': 00:01:33 (93)
>> >>> '453.povray': 00:04:54 (294) | '453.povray': 00:08:54 (534)
>> >>> '454.calculix': 00:00:34 (34) | '454.calculix': 00:00:52 (52)
>> >>> '481.wrf': 00:01:57 (117) | '481.wrf': 00:01:59 (119)
>> >>>
>> >>> other benchmarks are unchanged. I'm double-checking now that a followup
>> >>> patch I have which re-implements BB vectorization dependence checking
>> >>> fixes this (that's the only quadraticness I know of).
>> >>
>> >> Fixes all but
>> >>
>> >> '453.povray': 00:04:54 (294) | '453.povray': 00:06:46 (406)
>> >
>> >
>> > Note that povray is currently suffering from PR 68198
>> >
>>
>> Hi,
>>
>> I've also noticed that the new test bb-slp-38 fails on armeb:
>> FAIL: gcc.dg/vect/bb-slp-38.c -flto -ffat-lto-objects
>> scan-tree-dump-times slp2 "basic block part vectorized" 2
>> FAIL: gcc.dg/vect/bb-slp-38.c scan-tree-dump-times slp2 "basic block
>> part vectorized" 2
>>
>> I haven't checked in more detail, maybe it's similar to what we
>> discussed in PR65962
>
> Maybe though there is no misalignment involved as far as I can see.
>
> Please open a bug and attach vectorizer dumps.
>
OK, this is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68275
> Richard.
>
>> > Kyrill
>> >
>> >
>> >>
>> >> it even improves compile-time on some:
>> >>
>> >> '464.h264ref': 00:00:26 (26) | '464.h264ref': 00:00:21 (21)
>> >>
>> >> it also increases the number of vectorized BBs to 722.
>> >>
>> >> Needs some work still though.
>> >>
>> >> Richard.
>> >>
>> >>> Richard.
>> >>>
>> >>> 2015-11-06 Richard Biener <rguenther@suse.de>
>> >>>
>> >>> * tree-vectorizer.h (struct _bb_vec_info): Add region_begin/end
>> >>> members.
>> >>> (vect_stmt_in_region_p): Declare.
>> >>> * tree-vect-slp.c (new_bb_vec_info): Work on a region.
>> >>> (destroy_bb_vec_info): Likewise.
>> >>> (vect_bb_slp_scalar_cost): Use vect_stmt_in_region_p.
>> >>> (vect_get_and_check_slp_defs): Likewise.
>> >>> (vect_slp_analyze_bb_1): Refactor to make it work on sub-BBs.
>> >>> (vect_slp_bb): Likewise.
>> >>> * tree-vect-patterns.c (vect_same_loop_or_bb_p): Implement
>> >>> in terms of vect_stmt_in_region_p.
>> >>> (vect_pattern_recog): Iterate over the BB region.
>> >>> * tree-vect-stmts.c (vect_is_simple_use): Use
>> >>> vect_stmt_in_region_p.
>> >>> * tree-vectorizer.c (vect_stmt_in_region_p): New function.
>> >>> (pass_slp_vectorize::execute): Initialize all stmt UIDs to -1.
>> >>>
>> >>> * config/i386/i386.c: Include gimple-iterator.h.
>> >>> * config/aarch64/aarch64.c: Likewise.
>> >>>
>> >>> * gcc.dg/vect/bb-slp-38.c: New testcase.
>> >>>
>> >>> Index: gcc/tree-vectorizer.h
>> >>> ===================================================================
>> >>> *** gcc/tree-vectorizer.h.orig 2015-11-05 09:52:00.640227178 +0100
>> >>> --- gcc/tree-vectorizer.h 2015-11-05 13:20:58.385786476 +0100
>> >>> *************** nested_in_vect_loop_p (struct loop *loop
>> >>> *** 390,395 ****
>> >>> --- 390,397 ----
>> >>> typedef struct _bb_vec_info : public vec_info
>> >>> {
>> >>> basic_block bb;
>> >>> + gimple_stmt_iterator region_begin;
>> >>> + gimple_stmt_iterator region_end;
>> >>> } *bb_vec_info;
>> >>> #define BB_VINFO_BB(B) (B)->bb
>> >>> *************** void vect_pattern_recog (vec_info *);
>> >>> *** 1085,1089 ****
>> >>> --- 1087,1092 ----
>> >>> /* In tree-vectorizer.c. */
>> >>> unsigned vectorize_loops (void);
>> >>> void vect_destroy_datarefs (vec_info *);
>> >>> + bool vect_stmt_in_region_p (vec_info *, gimple *);
>> >>> #endif /* GCC_TREE_VECTORIZER_H */
>> >>> Index: gcc/tree-vect-slp.c
>> >>> ===================================================================
>> >>> *** gcc/tree-vect-slp.c.orig 2015-11-05 09:52:00.640227178 +0100
>> >>> --- gcc/tree-vect-slp.c 2015-11-06 10:22:56.707880233 +0100
>> >>> *************** vect_get_and_check_slp_defs (vec_info *v
>> >>> *** 209,215 ****
>> >>> unsigned int i, number_of_oprnds;
>> >>> gimple *def_stmt;
>> >>> enum vect_def_type dt = vect_uninitialized_def;
>> >>> - struct loop *loop = NULL;
>> >>> bool pattern = false;
>> >>> slp_oprnd_info oprnd_info;
>> >>> int first_op_idx = 1;
>> >>> --- 209,214 ----
>> >>> *************** vect_get_and_check_slp_defs (vec_info *v
>> >>> *** 218,226 ****
>> >>> bool first = stmt_num == 0;
>> >>> bool second = stmt_num == 1;
>> >>> - if (is_a <loop_vec_info> (vinfo))
>> >>> - loop = LOOP_VINFO_LOOP (as_a <loop_vec_info> (vinfo));
>> >>> -
>> >>> if (is_gimple_call (stmt))
>> >>> {
>> >>> number_of_oprnds = gimple_call_num_args (stmt);
>> >>> --- 217,222 ----
>> >>> *************** again:
>> >>> *** 276,286 ****
>> >>> from the pattern. Check that all the stmts of the node are
>> >>> in the
>> >>> pattern. */
>> >>> if (def_stmt && gimple_bb (def_stmt)
>> >>> ! && ((is_a <loop_vec_info> (vinfo)
>> >>> ! && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
>> >>> ! || (is_a <bb_vec_info> (vinfo)
>> >>> ! && gimple_bb (def_stmt) == as_a <bb_vec_info>
>> >>> (vinfo)->bb
>> >>> ! && gimple_code (def_stmt) != GIMPLE_PHI))
>> >>> && vinfo_for_stmt (def_stmt)
>> >>> && STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (def_stmt))
>> >>> && !STMT_VINFO_RELEVANT (vinfo_for_stmt (def_stmt))
>> >>> --- 272,278 ----
>> >>> from the pattern. Check that all the stmts of the node are
>> >>> in the
>> >>> pattern. */
>> >>> if (def_stmt && gimple_bb (def_stmt)
>> >>> ! && vect_stmt_in_region_p (vinfo, def_stmt)
>> >>> && vinfo_for_stmt (def_stmt)
>> >>> && STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (def_stmt))
>> >>> && !STMT_VINFO_RELEVANT (vinfo_for_stmt (def_stmt))
>> >>> *************** vect_detect_hybrid_slp (loop_vec_info lo
>> >>> *** 2076,2091 ****
>> >>> stmt_vec_info structs for all the stmts in it. */
>> >>> static bb_vec_info
>> >>> ! new_bb_vec_info (basic_block bb)
>> >>> {
>> >>> bb_vec_info res = NULL;
>> >>> gimple_stmt_iterator gsi;
>> >>> res = (bb_vec_info) xcalloc (1, sizeof (struct _bb_vec_info));
>> >>> res->kind = vec_info::bb;
>> >>> BB_VINFO_BB (res) = bb;
>> >>> ! for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>> >>> {
>> >>> gimple *stmt = gsi_stmt (gsi);
>> >>> gimple_set_uid (stmt, 0);
>> >>> --- 2068,2088 ----
>> >>> stmt_vec_info structs for all the stmts in it. */
>> >>> static bb_vec_info
>> >>> ! new_bb_vec_info (gimple_stmt_iterator region_begin,
>> >>> ! gimple_stmt_iterator region_end)
>> >>> {
>> >>> + basic_block bb = gsi_bb (region_begin);
>> >>> bb_vec_info res = NULL;
>> >>> gimple_stmt_iterator gsi;
>> >>> res = (bb_vec_info) xcalloc (1, sizeof (struct _bb_vec_info));
>> >>> res->kind = vec_info::bb;
>> >>> BB_VINFO_BB (res) = bb;
>> >>> + res->region_begin = region_begin;
>> >>> + res->region_end = region_end;
>> >>> ! for (gsi = region_begin; gsi_stmt (gsi) != gsi_stmt (region_end);
>> >>> ! gsi_next (&gsi))
>> >>> {
>> >>> gimple *stmt = gsi_stmt (gsi);
>> >>> gimple_set_uid (stmt, 0);
>> >>> *************** destroy_bb_vec_info (bb_vec_info bb_vinf
>> >>> *** 2118,2124 ****
>> >>> bb = BB_VINFO_BB (bb_vinfo);
>> >>> ! for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
>> >>> {
>> >>> gimple *stmt = gsi_stmt (si);
>> >>> stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>> >>> --- 2115,2122 ----
>> >>> bb = BB_VINFO_BB (bb_vinfo);
>> >>> ! for (si = bb_vinfo->region_begin;
>> >>> ! gsi_stmt (si) != gsi_stmt (bb_vinfo->region_end); gsi_next
>> >>> (&si))
>> >>> {
>> >>> gimple *stmt = gsi_stmt (si);
>> >>> stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>> >>> *************** destroy_bb_vec_info (bb_vec_info bb_vinf
>> >>> *** 2126,2131 ****
>> >>> --- 2124,2132 ----
>> >>> if (stmt_info)
>> >>> /* Free stmt_vec_info. */
>> >>> free_stmt_vec_info (stmt);
>> >>> +
>> >>> + /* Reset region marker. */
>> >>> + gimple_set_uid (stmt, -1);
>> >>> }
>> >>> vect_destroy_datarefs (bb_vinfo);
>> >>> *************** vect_bb_slp_scalar_cost (basic_block bb,
>> >>> *** 2247,2254 ****
>> >>> gimple *use_stmt;
>> >>> FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, DEF_FROM_PTR
>> >>> (def_p))
>> >>> if (!is_gimple_debug (use_stmt)
>> >>> ! && (gimple_code (use_stmt) == GIMPLE_PHI
>> >>> ! || gimple_bb (use_stmt) != bb
>> >>> || !STMT_VINFO_VECTORIZABLE (vinfo_for_stmt
>> >>> (use_stmt))))
>> >>> {
>> >>> (*life)[i] = true;
>> >>> --- 2248,2255 ----
>> >>> gimple *use_stmt;
>> >>> FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, DEF_FROM_PTR
>> >>> (def_p))
>> >>> if (!is_gimple_debug (use_stmt)
>> >>> ! && (! vect_stmt_in_region_p (vinfo_for_stmt
>> >>> (stmt)->vinfo,
>> >>> ! use_stmt)
>> >>> || !STMT_VINFO_VECTORIZABLE (vinfo_for_stmt
>> >>> (use_stmt))))
>> >>> {
>> >>> (*life)[i] = true;
>> >>> *************** vect_bb_vectorization_profitable_p (bb_v
>> >>> *** 2327,2366 ****
>> >>> /* Check if the basic block can be vectorized. */
>> >>> static bb_vec_info
>> >>> ! vect_slp_analyze_bb_1 (basic_block bb)
>> >>> {
>> >>> bb_vec_info bb_vinfo;
>> >>> vec<slp_instance> slp_instances;
>> >>> slp_instance instance;
>> >>> int i;
>> >>> int min_vf = 2;
>> >>> - unsigned n_stmts = 0;
>> >>> ! bb_vinfo = new_bb_vec_info (bb);
>> >>> if (!bb_vinfo)
>> >>> return NULL;
>> >>> ! /* Gather all data references in the basic-block. */
>> >>> !
>> >>> ! for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
>> >>> ! !gsi_end_p (gsi); gsi_next (&gsi))
>> >>> ! {
>> >>> ! gimple *stmt = gsi_stmt (gsi);
>> >>> ! if (is_gimple_debug (stmt))
>> >>> ! continue;
>> >>> ! ++n_stmts;
>> >>> ! if (!find_data_references_in_stmt (NULL, stmt,
>> >>> ! &BB_VINFO_DATAREFS (bb_vinfo)))
>> >>> ! {
>> >>> ! /* Mark the rest of the basic-block as unvectorizable. */
>> >>> ! for (; !gsi_end_p (gsi); gsi_next (&gsi))
>> >>> ! {
>> >>> ! stmt = gsi_stmt (gsi);
>> >>> ! STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (stmt)) = false;
>> >>> ! }
>> >>> ! break;
>> >>> ! }
>> >>> ! }
>> >>> /* Analyze the data references. */
>> >>> --- 2328,2358 ----
>> >>> /* Check if the basic block can be vectorized. */
>> >>> static bb_vec_info
>> >>> ! vect_slp_analyze_bb_1 (gimple_stmt_iterator region_begin,
>> >>> ! gimple_stmt_iterator region_end,
>> >>> ! vec<data_reference_p> datarefs, int n_stmts)
>> >>> {
>> >>> bb_vec_info bb_vinfo;
>> >>> vec<slp_instance> slp_instances;
>> >>> slp_instance instance;
>> >>> int i;
>> >>> int min_vf = 2;
>> >>> ! if (n_stmts > PARAM_VALUE (PARAM_SLP_MAX_INSNS_IN_BB))
>> >>> ! {
>> >>> ! if (dump_enabled_p ())
>> >>> ! dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> >>> ! "not vectorized: too many instructions in "
>> >>> ! "basic block.\n");
>> >>> ! free_data_refs (datarefs);
>> >>> ! return NULL;
>> >>> ! }
>> >>> !
>> >>> ! bb_vinfo = new_bb_vec_info (region_begin, region_end);
>> >>> if (!bb_vinfo)
>> >>> return NULL;
>> >>> ! BB_VINFO_DATAREFS (bb_vinfo) = datarefs;
>> >>> /* Analyze the data references. */
>> >>> *************** vect_slp_analyze_bb_1 (basic_block bb)
>> >>> *** 2438,2445 ****
>> >>> }
>> >>> /* Mark all the statements that we do not want to vectorize. */
>> >>> ! for (gimple_stmt_iterator gsi = gsi_start_bb (BB_VINFO_BB
>> >>> (bb_vinfo));
>> >>> ! !gsi_end_p (gsi); gsi_next (&gsi))
>> >>> {
>> >>> stmt_vec_info vinfo = vinfo_for_stmt (gsi_stmt (gsi));
>> >>> if (STMT_SLP_TYPE (vinfo) != pure_slp)
>> >>> --- 2430,2437 ----
>> >>> }
>> >>> /* Mark all the statements that we do not want to vectorize. */
>> >>> ! for (gimple_stmt_iterator gsi = bb_vinfo->region_begin;
>> >>> ! gsi_stmt (gsi) != gsi_stmt (bb_vinfo->region_end); gsi_next
>> >>> (&gsi))
>> >>> {
>> >>> stmt_vec_info vinfo = vinfo_for_stmt (gsi_stmt (gsi));
>> >>> if (STMT_SLP_TYPE (vinfo) != pure_slp)
>> >>> *************** bool
>> >>> *** 2509,2585 ****
>> >>> vect_slp_bb (basic_block bb)
>> >>> {
>> >>> bb_vec_info bb_vinfo;
>> >>> - int insns = 0;
>> >>> gimple_stmt_iterator gsi;
>> >>> unsigned int vector_sizes;
>> >>> if (dump_enabled_p ())
>> >>> dump_printf_loc (MSG_NOTE, vect_location,
>> >>> "===vect_slp_analyze_bb===\n");
>> >>> - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>> >>> - {
>> >>> - gimple *stmt = gsi_stmt (gsi);
>> >>> - if (!is_gimple_debug (stmt)
>> >>> - && !gimple_nop_p (stmt)
>> >>> - && gimple_code (stmt) != GIMPLE_LABEL)
>> >>> - insns++;
>> >>> - if (gimple_location (stmt) != UNKNOWN_LOCATION)
>> >>> - vect_location = gimple_location (stmt);
>> >>> - }
>> >>> -
>> >>> - if (insns > PARAM_VALUE (PARAM_SLP_MAX_INSNS_IN_BB))
>> >>> - {
>> >>> - if (dump_enabled_p ())
>> >>> - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> >>> - "not vectorized: too many instructions in "
>> >>> - "basic block.\n");
>> >>> -
>> >>> - return false;
>> >>> - }
>> >>> -
>> >>> /* Autodetect first vector size we try. */
>> >>> current_vector_size = 0;
>> >>> vector_sizes = targetm.vectorize.autovectorize_vector_sizes ();
>> >>> while (1)
>> >>> {
>> >>> ! bb_vinfo = vect_slp_analyze_bb_1 (bb);
>> >>> ! if (bb_vinfo)
>> >>> {
>> >>> ! if (!dbg_cnt (vect_slp))
>> >>> ! {
>> >>> ! destroy_bb_vec_info (bb_vinfo);
>> >>> ! return false;
>> >>> ! }
>> >>> if (dump_enabled_p ())
>> >>> ! dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB\n");
>> >>> vect_schedule_slp (bb_vinfo);
>> >>> if (dump_enabled_p ())
>> >>> dump_printf_loc (MSG_NOTE, vect_location,
>> >>> ! "BASIC BLOCK VECTORIZED\n");
>> >>> destroy_bb_vec_info (bb_vinfo);
>> >>> ! return true;
>> >>> }
>> >>> ! destroy_bb_vec_info (bb_vinfo);
>> >>> vector_sizes &= ~current_vector_size;
>> >>> ! if (vector_sizes == 0
>> >>> ! || current_vector_size == 0)
>> >>> ! return false;
>> >>> ! /* Try the next biggest vector size. */
>> >>> ! current_vector_size = 1 << floor_log2 (vector_sizes);
>> >>> ! if (dump_enabled_p ())
>> >>> ! dump_printf_loc (MSG_NOTE, vect_location,
>> >>> ! "***** Re-trying analysis with "
>> >>> ! "vector size %d\n", current_vector_size);
>> >>> }
>> >>> }
>> >>> --- 2501,2605 ----
>> >>> vect_slp_bb (basic_block bb)
>> >>> {
>> >>> bb_vec_info bb_vinfo;
>> >>> gimple_stmt_iterator gsi;
>> >>> unsigned int vector_sizes;
>> >>> + bool any_vectorized = false;
>> >>> if (dump_enabled_p ())
>> >>> dump_printf_loc (MSG_NOTE, vect_location,
>> >>> "===vect_slp_analyze_bb===\n");
>> >>> /* Autodetect first vector size we try. */
>> >>> current_vector_size = 0;
>> >>> vector_sizes = targetm.vectorize.autovectorize_vector_sizes ();
>> >>> + gsi = gsi_start_bb (bb);
>> >>> +
>> >>> while (1)
>> >>> {
>> >>> ! if (gsi_end_p (gsi))
>> >>> ! break;
>> >>> !
>> >>> ! gimple_stmt_iterator region_begin = gsi;
>> >>> ! vec<data_reference_p> datarefs = vNULL;
>> >>> ! int insns = 0;
>> >>> !
>> >>> ! for (; !gsi_end_p (gsi); gsi_next (&gsi))
>> >>> {
>> >>> ! gimple *stmt = gsi_stmt (gsi);
>> >>> ! if (is_gimple_debug (stmt))
>> >>> ! continue;
>> >>> ! insns++;
>> >>> !
>> >>> ! if (gimple_location (stmt) != UNKNOWN_LOCATION)
>> >>> ! vect_location = gimple_location (stmt);
>> >>> !
>> >>> ! if (!find_data_references_in_stmt (NULL, stmt, &datarefs))
>> >>> ! break;
>> >>> ! }
>> >>> !
>> >>> ! /* Skip leading unhandled stmts. */
>> >>> ! if (gsi_stmt (region_begin) == gsi_stmt (gsi))
>> >>> ! {
>> >>> ! gsi_next (&gsi);
>> >>> ! continue;
>> >>> ! }
>> >>> !
>> >>> ! gimple_stmt_iterator region_end = gsi;
>> >>> + bool vectorized = false;
>> >>> + bb_vinfo = vect_slp_analyze_bb_1 (region_begin, region_end,
>> >>> + datarefs, insns);
>> >>> + if (bb_vinfo
>> >>> + && dbg_cnt (vect_slp))
>> >>> + {
>> >>> if (dump_enabled_p ())
>> >>> ! dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB
>> >>> part\n");
>> >>> vect_schedule_slp (bb_vinfo);
>> >>> if (dump_enabled_p ())
>> >>> dump_printf_loc (MSG_NOTE, vect_location,
>> >>> ! "basic block part vectorized\n");
>> >>> destroy_bb_vec_info (bb_vinfo);
>> >>> ! vectorized = true;
>> >>> }
>> >>> + else
>> >>> + destroy_bb_vec_info (bb_vinfo);
>> >>> ! any_vectorized |= vectorized;
>> >>> vector_sizes &= ~current_vector_size;
>> >>> ! if (vectorized
>> >>> ! || vector_sizes == 0
>> >>> ! || current_vector_size == 0)
>> >>> ! {
>> >>> ! if (gsi_end_p (region_end))
>> >>> ! break;
>> >>> !
>> >>> ! /* Skip the unhandled stmt. */
>> >>> ! gsi_next (&gsi);
>> >>> !
>> >>> ! /* And reset vector sizes. */
>> >>> ! current_vector_size = 0;
>> >>> ! vector_sizes = targetm.vectorize.autovectorize_vector_sizes ();
>> >>> ! }
>> >>> ! else
>> >>> ! {
>> >>> ! /* Try the next biggest vector size. */
>> >>> ! current_vector_size = 1 << floor_log2 (vector_sizes);
>> >>> ! if (dump_enabled_p ())
>> >>> ! dump_printf_loc (MSG_NOTE, vect_location,
>> >>> ! "***** Re-trying analysis with "
>> >>> ! "vector size %d\n", current_vector_size);
>> >>> ! /* Start over. */
>> >>> ! gsi = region_begin;
>> >>> ! }
>> >>> }
>> >>> +
>> >>> + return any_vectorized;
>> >>> }
>> >>> Index: gcc/tree-vect-patterns.c
>> >>> ===================================================================
>> >>> *** gcc/tree-vect-patterns.c.orig 2015-11-05 09:52:00.640227178
>> >>> +0100
>> >>> --- gcc/tree-vect-patterns.c 2015-11-05 13:25:46.060011765 +0100
>> >>> *************** static bool
>> >>> *** 107,133 ****
>> >>> vect_same_loop_or_bb_p (gimple *stmt1, gimple *stmt2)
>> >>> {
>> >>> stmt_vec_info stmt_vinfo = vinfo_for_stmt (stmt1);
>> >>> ! loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
>> >>> ! bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_vinfo);
>> >>> !
>> >>> ! if (!gimple_bb (stmt2))
>> >>> ! return false;
>> >>> !
>> >>> ! if (loop_vinfo)
>> >>> ! {
>> >>> ! struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>> >>> ! if (!flow_bb_inside_loop_p (loop, gimple_bb (stmt2)))
>> >>> ! return false;
>> >>> ! }
>> >>> ! else
>> >>> ! {
>> >>> ! if (gimple_bb (stmt2) != BB_VINFO_BB (bb_vinfo)
>> >>> ! || gimple_code (stmt2) == GIMPLE_PHI)
>> >>> ! return false;
>> >>> ! }
>> >>> !
>> >>> ! gcc_assert (vinfo_for_stmt (stmt2));
>> >>> ! return true;
>> >>> }
>> >>> /* If the LHS of DEF_STMT has a single use, and that statement is
>> >>> --- 107,113 ----
>> >>> vect_same_loop_or_bb_p (gimple *stmt1, gimple *stmt2)
>> >>> {
>> >>> stmt_vec_info stmt_vinfo = vinfo_for_stmt (stmt1);
>> >>> ! return vect_stmt_in_region_p (stmt_vinfo->vinfo, stmt2);
>> >>> }
>> >>> /* If the LHS of DEF_STMT has a single use, and that statement is
>> >>> *************** vect_pattern_recog (vec_info *vinfo)
>> >>> *** 3611,3643 ****
>> >>> loop = LOOP_VINFO_LOOP (loop_vinfo);
>> >>> bbs = LOOP_VINFO_BBS (loop_vinfo);
>> >>> nbbs = loop->num_nodes;
>> >>> }
>> >>> else
>> >>> {
>> >>> ! bbs = &as_a <bb_vec_info> (vinfo)->bb;
>> >>> ! nbbs = 1;
>> >>> ! }
>> >>> !
>> >>> ! /* Scan through the loop stmts, applying the pattern recognition
>> >>> ! functions starting at each stmt visited: */
>> >>> ! for (i = 0; i < nbbs; i++)
>> >>> ! {
>> >>> ! basic_block bb = bbs[i];
>> >>> ! for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
>> >>> ! {
>> >>> ! if (is_a <bb_vec_info> (vinfo)
>> >>> ! && (stmt = gsi_stmt (si))
>> >>> && vinfo_for_stmt (stmt)
>> >>> && !STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (stmt)))
>> >>> ! continue;
>> >>> ! /* Scan over all generic vect_recog_xxx_pattern functions.
>> >>> */
>> >>> ! for (j = 0; j < NUM_PATTERNS; j++)
>> >>> ! {
>> >>> vect_recog_func = vect_vect_recog_func_ptrs[j];
>> >>> vect_pattern_recog_1 (vect_recog_func, si,
>> >>> &stmts_to_replace);
>> >>> ! }
>> >>> ! }
>> >>> }
>> >>> }
>> >>> --- 3591,3632 ----
>> >>> loop = LOOP_VINFO_LOOP (loop_vinfo);
>> >>> bbs = LOOP_VINFO_BBS (loop_vinfo);
>> >>> nbbs = loop->num_nodes;
>> >>> +
>> >>> + /* Scan through the loop stmts, applying the pattern recognition
>> >>> + functions starting at each stmt visited: */
>> >>> + for (i = 0; i < nbbs; i++)
>> >>> + {
>> >>> + basic_block bb = bbs[i];
>> >>> + for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
>> >>> + {
>> >>> + /* Scan over all generic vect_recog_xxx_pattern functions.
>> >>> */
>> >>> + for (j = 0; j < NUM_PATTERNS; j++)
>> >>> + {
>> >>> + vect_recog_func = vect_vect_recog_func_ptrs[j];
>> >>> + vect_pattern_recog_1 (vect_recog_func, si,
>> >>> + &stmts_to_replace);
>> >>> + }
>> >>> + }
>> >>> + }
>> >>> }
>> >>> else
>> >>> {
>> >>> ! bb_vec_info bb_vinfo = as_a <bb_vec_info> (vinfo);
>> >>> ! for (si = bb_vinfo->region_begin;
>> >>> ! gsi_stmt (si) != gsi_stmt (bb_vinfo->region_end); gsi_next
>> >>> (&si))
>> >>> ! {
>> >>> ! if ((stmt = gsi_stmt (si))
>> >>> && vinfo_for_stmt (stmt)
>> >>> && !STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (stmt)))
>> >>> ! continue;
>> >>> ! /* Scan over all generic vect_recog_xxx_pattern functions. */
>> >>> ! for (j = 0; j < NUM_PATTERNS; j++)
>> >>> ! {
>> >>> vect_recog_func = vect_vect_recog_func_ptrs[j];
>> >>> vect_pattern_recog_1 (vect_recog_func, si,
>> >>> &stmts_to_replace);
>> >>> ! }
>> >>> ! }
>> >>> }
>> >>> }
>> >>> Index: gcc/config/i386/i386.c
>> >>> ===================================================================
>> >>> *** gcc/config/i386/i386.c.orig 2015-11-05 09:52:42.239687133 +0100
>> >>> --- gcc/config/i386/i386.c 2015-11-05 11:09:09.451774562 +0100
>> >>> *************** along with GCC; see the file COPYING3.
>> >>> *** 64,69 ****
>> >>> --- 64,70 ----
>> >>> #include "context.h"
>> >>> #include "pass_manager.h"
>> >>> #include "target-globals.h"
>> >>> + #include "gimple-iterator.h"
>> >>> #include "tree-vectorizer.h"
>> >>> #include "shrink-wrap.h"
>> >>> #include "builtins.h"
>> >>> Index: gcc/testsuite/gcc.dg/vect/bb-slp-38.c
>> >>> ===================================================================
>> >>> *** /dev/null 1970-01-01 00:00:00.000000000 +0000
>> >>> --- gcc/testsuite/gcc.dg/vect/bb-slp-38.c 2015-11-05
>> >>> 14:00:48.177644327 +0100
>> >>> ***************
>> >>> *** 0 ****
>> >>> --- 1,44 ----
>> >>> + /* { dg-require-effective-target vect_int } */
>> >>> +
>> >>> + #include "tree-vect.h"
>> >>> +
>> >>> + extern void abort (void);
>> >>> +
>> >>> + int a[8], b[8];
>> >>> + int x;
>> >>> +
>> >>> + void __attribute__((noinline,noclone))
>> >>> + bar (void)
>> >>> + {
>> >>> + x = 1;
>> >>> + }
>> >>> +
>> >>> + void __attribute__((noinline,noclone))
>> >>> + foo(void)
>> >>> + {
>> >>> + a[0] = b[0];
>> >>> + a[1] = b[0];
>> >>> + a[2] = b[3];
>> >>> + a[3] = b[3];
>> >>> + bar ();
>> >>> + a[4] = b[4];
>> >>> + a[5] = b[7];
>> >>> + a[6] = b[4];
>> >>> + a[7] = b[7];
>> >>> + }
>> >>> +
>> >>> + int main()
>> >>> + {
>> >>> + int i;
>> >>> + check_vect ();
>> >>> + for (i = 0; i < 8; ++i)
>> >>> + b[i] = i;
>> >>> + foo ();
>> >>> + if (a[0] != 0 || a[1] != 0 || a[2] != 3 || a[3] != 3
>> >>> + || a[4] != 4 || a[5] != 7 || a[6] != 4 || a[7] != 7)
>> >>> + abort ();
>> >>> + return 0;
>> >>> + }
>> >>> +
>> >>> + /* { dg-final { scan-tree-dump "basic block vectorized" "slp2" { target
>> >>> vect_perm } } } */
>> >>> + /* { dg-final { scan-tree-dump-times "basic block part vectorized" 2
>> >>> "slp2" { target vect_perm } } } */
>> >>> Index: gcc/tree-vect-stmts.c
>> >>> ===================================================================
>> >>> *** gcc/tree-vect-stmts.c.orig 2015-11-02 12:37:11.074249388 +0100
>> >>> --- gcc/tree-vect-stmts.c 2015-11-05 13:29:21.413423692 +0100
>> >>> *************** vect_is_simple_use (tree operand, vec_in
>> >>> *** 8196,8207 ****
>> >>> dump_gimple_stmt (MSG_NOTE, TDF_SLIM, *def_stmt, 0);
>> >>> }
>> >>> ! basic_block bb = gimple_bb (*def_stmt);
>> >>> ! if ((is_a <loop_vec_info> (vinfo)
>> >>> ! && !flow_bb_inside_loop_p (as_a <loop_vec_info> (vinfo)->loop,
>> >>> bb))
>> >>> ! || (is_a <bb_vec_info> (vinfo)
>> >>> ! && (bb != as_a <bb_vec_info> (vinfo)->bb
>> >>> ! || gimple_code (*def_stmt) == GIMPLE_PHI)))
>> >>> *dt = vect_external_def;
>> >>> else
>> >>> {
>> >>> --- 8196,8202 ----
>> >>> dump_gimple_stmt (MSG_NOTE, TDF_SLIM, *def_stmt, 0);
>> >>> }
>> >>> ! if (! vect_stmt_in_region_p (vinfo, *def_stmt))
>> >>> *dt = vect_external_def;
>> >>> else
>> >>> {
>> >>> Index: gcc/tree-vectorizer.c
>> >>> ===================================================================
>> >>> *** gcc/tree-vectorizer.c.orig 2015-11-04 09:23:53.724687806 +0100
>> >>> --- gcc/tree-vectorizer.c 2015-11-05 13:55:08.299817570 +0100
>> >>> *************** vect_destroy_datarefs (vec_info *vinfo)
>> >>> *** 350,355 ****
>> >>> --- 350,382 ----
>> >>> }
>> >>> + /* Return whether STMT is inside the region we try to vectorize.
>> >>> */
>> >>> +
>> >>> + bool
>> >>> + vect_stmt_in_region_p (vec_info *vinfo, gimple *stmt)
>> >>> + {
>> >>> + if (!gimple_bb (stmt))
>> >>> + return false;
>> >>> +
>> >>> + if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
>> >>> + {
>> >>> + struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>> >>> + if (!flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
>> >>> + return false;
>> >>> + }
>> >>> + else
>> >>> + {
>> >>> + bb_vec_info bb_vinfo = as_a <bb_vec_info> (vinfo);
>> >>> + if (gimple_bb (stmt) != BB_VINFO_BB (bb_vinfo)
>> >>> + || gimple_uid (stmt) == -1U
>> >>> + || gimple_code (stmt) == GIMPLE_PHI)
>> >>> + return false;
>> >>> + }
>> >>> +
>> >>> + return true;
>> >>> + }
>> >>> +
>> >>> +
>> >>> /* If LOOP has been versioned during ifcvt, return the internal call
>> >>> guarding it. */
>> >>> *************** pass_slp_vectorize::execute (function *f
>> >>> *** 692,697 ****
>> >>> --- 719,732 ----
>> >>> scev_initialize ();
>> >>> }
>> >>> + /* Mark all stmts as not belonging to the current region. */
>> >>> + FOR_EACH_BB_FN (bb, fun)
>> >>> + {
>> >>> + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p
>> >>> (gsi);
>> >>> + gsi_next (&gsi))
>> >>> + gimple_set_uid (gsi_stmt (gsi), -1);
>> >>> + }
>> >>> +
>> >>> init_stmt_vec_info_vec ();
>> >>> FOR_EACH_BB_FN (bb, fun)
>> >>> Index: gcc/config/aarch64/aarch64.c
>> >>> ===================================================================
>> >>> *** gcc/config/aarch64/aarch64.c.orig 2015-10-28 11:22:25.290823112
>> >>> +0100
>> >>> --- gcc/config/aarch64/aarch64.c 2015-11-06 10:24:21.539818027
>> >>> +0100
>> >>> ***************
>> >>> *** 52,57 ****
>> >>> --- 52,58 ----
>> >>> #include "params.h"
>> >>> #include "gimplify.h"
>> >>> #include "dwarf2.h"
>> >>> + #include "gimple-iterator.h"
>> >>> #include "tree-vectorizer.h"
>> >>> #include "aarch64-cost-tables.h"
>> >>> #include "dumpfile.h"
>> >>>
>> >
>>
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)