[PATCH][2/3] final try: Re-organize -fvect-cost-model, enable vectorization at -O2

Richard Biener rguenther@suse.de
Tue May 28 12:37:00 GMT 2013


This is the final variant of the patch working towards enabling
a less costly vectorization variant at -O2 by default.  It introduces
a "cheap" cost-model variant by transforming the existing
-fvect-cost-model option into one taking an argument:
"unlimited" (same as -fno-vect-cost-model), "dynamic" (same as
-fvect-cost-model and the default) and "cheap".
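
The mapping from option spellings to cost models can be sketched as a
standalone C file.  This is only an illustration: in GCC the parsing is
generated from common.opt, and parse_vect_cost_model and
resolve_vect_cost_model are hypothetical names that do not appear in the
patch.  The enum and the default-to-dynamic rule are taken from the
flag-types.h and tree-vectorizer.c hunks below.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the enum this patch adds to flag-types.h.  */
enum vect_cost_model {
  VECT_COST_MODEL_UNLIMITED = 0,
  VECT_COST_MODEL_CHEAP = 1,
  VECT_COST_MODEL_DYNAMIC = 2,
  VECT_COST_MODEL_DEFAULT = 3
};

/* Hypothetical helper, not part of the patch: map one command-line
   spelling to the value flag_vect_cost_model would hold.  Returns 0 on
   success and -1 for an unrecognized spelling.  */
int
parse_vect_cost_model (const char *arg, enum vect_cost_model *model)
{
  static const struct
  {
    const char *spelling;
    enum vect_cost_model value;
  } map[] = {
    { "-fvect-cost-model=unlimited", VECT_COST_MODEL_UNLIMITED },
    { "-fno-vect-cost-model", VECT_COST_MODEL_UNLIMITED }, /* alias */
    { "-fvect-cost-model=dynamic", VECT_COST_MODEL_DYNAMIC },
    { "-fvect-cost-model", VECT_COST_MODEL_DYNAMIC },      /* alias */
    { "-fvect-cost-model=cheap", VECT_COST_MODEL_CHEAP }
  };
  size_t i;

  assert (arg != NULL && model != NULL);
  for (i = 0; i < sizeof map / sizeof map[0]; i++)
    if (strcmp (arg, map[i].spelling) == 0)
      {
        *model = map[i].value;
        return 0;
      }
  return -1;
}

/* Same logic as vectorizer_cost_model () in the patch, but taking the
   flag value as a parameter instead of reading a global.  */
enum vect_cost_model
resolve_vect_cost_model (enum vect_cost_model flag)
{
  if (flag != VECT_COST_MODEL_DEFAULT)
    return flag;
  /* No option given: the default cost model is the dynamic one.  */
  return VECT_COST_MODEL_DYNAMIC;
}
```

Keeping VECT_COST_MODEL_DEFAULT as a separate enum value lets the
compiler tell "no option given" apart from an explicit
-fvect-cost-model=dynamic.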

With the "cheap" model we try not to disturb non-vectorized code:
we do not inhibit any PRE and do not perform if-conversion.
We also avoid any loop versioning due to alignment or aliasing.
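
A rough sketch of those decisions, gathered into one function for
illustration.  The struct and function names are hypothetical, and the
sketch deliberately ignores the other conditions the real code checks
(explicit -ftree-loop-if-convert, optimizing the loop for speed, the
--param limits on runtime checks); it only shows how the cost model
feeds into each gate.

```c
#include <stdbool.h>

/* Mirrors the enum this patch adds to flag-types.h.  */
enum vect_cost_model {
  VECT_COST_MODEL_UNLIMITED = 0,
  VECT_COST_MODEL_CHEAP = 1,
  VECT_COST_MODEL_DYNAMIC = 2,
  VECT_COST_MODEL_DEFAULT = 3
};

/* Hypothetical summary of what the vectorizer may do to code outside
   the vectorized loop body.  Each field means "allowed", not "done".  */
struct vect_side_effects
{
  bool if_convert;            /* if-conversion run on behalf of the vectorizer */
  bool pre_may_inhibit_phi;   /* PRE may inhibit PHI insertion */
  bool version_for_alignment; /* loop versioning with alignment checks */
  bool version_for_alias;     /* loop versioning with runtime alias checks */
};

/* VECTORIZE stands for flag_tree_vectorize, MODEL for the value
   returned by vectorizer_cost_model ().  */
struct vect_side_effects
vect_side_effects_for (bool vectorize, enum vect_cost_model model)
{
  /* Every gate below is closed when vectorization is off or when the
     cheap model is active.  */
  bool allowed = vectorize && model != VECT_COST_MODEL_CHEAP;
  struct vect_side_effects fx;

  fx.if_convert = allowed;
  fx.pre_may_inhibit_phi = allowed;
  fx.version_for_alignment = allowed;
  fx.version_for_alias = allowed;
  return fx;
}
```

For example, vect_side_effects_for (true, VECT_COST_MODEL_CHEAP) leaves
all four fields false, which is the "do not disturb non-vectorized
code" property described above.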

With this, runtime performance of SPEC CPU 2006 does not regress
when comparing -O2 to -O2 -ftree-vectorize -fno-tree-slp-vectorize
-fvect-cost-model=cheap.  A few progressions remain, as do effects
on compile time and binary size (more data in [3/3]).  Due
to implementation bugs, SLP is not viable for -O2 even though
profitability should be much easier to assess for it.

Independent of whether [3/3] gets positive feedback, I'd like
to push this patch in.  Thus, comments welcome - as usual I'll
interpret silence as positive feedback ;)

Re-bootstrap / regtest running on x86_64-unknown-linux-gnu.

Thanks,
Richard.


2013-05-28  Richard Biener  <rguenther@suse.de>

	common/
	* config/i386/i386-common.c (ix86_option_init_struct): Do not
	enable OPT_fvect_cost_model.

	* common.opt (fvect-cost-model=): New option.
	(vect_cost_model): New enum and values.
	(fvect-cost-model): Alias to -fvect-cost-model=dynamic.
	(fno-vect-cost-model): Alias to -fvect-cost-model=unlimited.
	(ftree-vect-loop-version): Ignore.
	* opts.c (default_options_table): Do not set OPT_fvect_cost_model.
	(common_handle_option): Likewise.
	(finish_options): Adjust condition that sets
	PARAM_MAX_STORES_TO_SINK.
	* flag-types.h (enum vect_cost_model): New enum.
	* doc/invoke.texi (ftree-vect-loop-version): Remove.
	(fvect-cost-model): Adjust documentation.
	* targhooks.c (default_add_stmt_cost): Do not check
	flag_vect_cost_model.
	* tree-vectorizer.h (struct _loop_vec_info): Add cost model field.
	(struct _bb_vec_info): Likewise.
	(vectorizer_cost_model): Declare.
	* tree-vect-data-refs.c (vect_peeling_hash_insert): Check the
	loop's cost-model flag.
	(vect_peeling_hash_choose_best_peeling): Likewise.
	(vect_enhance_data_refs_alignment): Likewise.  Do not check
	flag_tree_vect_loop_version but check the cost model.
	(vect_mark_for_runtime_alias_test): Do not add runtime alias checks
	for the cheap cost model.
	* tree-vect-loop.c (vect_analyze_loop): Initialize the loop's
	cost model flag.
	(vect_estimate_min_profitable_iters): Use the loop's cost model flag.
	* tree-vect-slp.c (vect_slp_analyze_bb_1): Initialize and use the BB's
	cost model flag.
	* tree-vectorizer.c (gate_vect_slp): Adjust.
	(vectorizer_cost_model): Return the active cost model.
	* Makefile.in (tree-if-conv.o): Depend on $(TREE_VECTORIZER_H).
	(tree-ssa-pre.o): Likewise.
	* tree-if-conv.c: Include tree-vectorizer.h.
	(gate_tree_if_conversion): Enable if-conversion via the vectorizer
	only if the cost-model is not cheap.
	* tree-ssa-pre.c: Include tree-vectorizer.h.
	(inhibit_phi_insertion): Do not inhibit PHI insertion for the
	cheap vectorizer cost model.

Index: trunk/gcc/common.opt
===================================================================
*** trunk.orig/gcc/common.opt	2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/common.opt	2013-05-28 14:14:39.265369281 +0200
*************** EnumValue
*** 1304,1310 ****
  Enum(stack_reuse_level) String(none) Value(SR_NONE)
  
  ftree-loop-if-convert
! Common Report Var(flag_tree_loop_if_convert) Init(-1) Optimization
  Convert conditional jumps in innermost loops to branchless equivalents
  
  ftree-loop-if-convert-stores
--- 1304,1310 ----
  Enum(stack_reuse_level) String(none) Value(SR_NONE)
  
  ftree-loop-if-convert
! Common Report Var(flag_tree_loop_if_convert) Optimization
  Convert conditional jumps in innermost loops to branchless equivalents
  
  ftree-loop-if-convert-stores
*************** Common RejectNegative Joined UInteger Va
*** 2267,2282 ****
  -ftree-vectorizer-verbose=<number>	This switch is deprecated. Use -fopt-info instead.
  
  ftree-slp-vectorize
! Common Report Var(flag_tree_slp_vectorize) Init(2) Optimization
  Enable basic block vectorization (SLP) on trees
  
  fvect-cost-model
! Common Report Var(flag_vect_cost_model) Optimization
! Enable use of cost model in vectorization
  
  ftree-vect-loop-version
! Common Report Var(flag_tree_vect_loop_version) Init(1) Optimization
! Enable loop versioning when doing loop vectorization on trees
  
  ftree-scev-cprop
  Common Report Var(flag_tree_scev_cprop) Init(1) Optimization
--- 2267,2302 ----
  -ftree-vectorizer-verbose=<number>	This switch is deprecated. Use -fopt-info instead.
  
  ftree-slp-vectorize
! Common Report Var(flag_tree_slp_vectorize) Optimization
  Enable basic block vectorization (SLP) on trees
  
+ fvect-cost-model=
+ Common Joined RejectNegative Enum(vect_cost_model) Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
+ Specifies the cost model for vectorization
+ 
+ Enum
+ Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown vectorizer cost model %qs)
+ 
+ EnumValue
+ Enum(vect_cost_model) String(unlimited) Value(VECT_COST_MODEL_UNLIMITED)
+ 
+ EnumValue
+ Enum(vect_cost_model) String(dynamic) Value(VECT_COST_MODEL_DYNAMIC)
+ 
+ EnumValue
+ Enum(vect_cost_model) String(cheap) Value(VECT_COST_MODEL_CHEAP)
+ 
  fvect-cost-model
! Common RejectNegative Alias(fvect-cost-model=,dynamic)
! Enables the dynamic vectorizer cost model.  Preserved for backward compatibility.
! 
! fno-vect-cost-model
! Common RejectNegative Alias(fvect-cost-model=,unlimited)
! Enables the unlimited vectorizer cost model.  Preserved for backward compatibility.
  
  ftree-vect-loop-version
! Common Ignore
! Does nothing. Preserved for backward compatibility.
  
  ftree-scev-cprop
  Common Report Var(flag_tree_scev_cprop) Init(1) Optimization
Index: trunk/gcc/opts.c
===================================================================
*** trunk.orig/gcc/opts.c	2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/opts.c	2013-05-28 14:14:39.282369470 +0200
*************** static const struct default_options defa
*** 498,504 ****
      { OPT_LEVELS_3_PLUS, OPT_funswitch_loops, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_fgcse_after_reload, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_ftree_vectorize, NULL, 1 },
-     { OPT_LEVELS_3_PLUS, OPT_fvect_cost_model, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_fipa_cp_clone, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_ftree_partial_pre, NULL, 1 },
  
--- 498,503 ----
*************** finish_options (struct gcc_options *opts
*** 823,831 ****
  	}
      }
  
!   /* Set PARAM_MAX_STORES_TO_SINK to 0 if either vectorization or if-conversion
!      is disabled.  */
!   if (!opts->x_flag_tree_vectorize || !opts->x_flag_tree_loop_if_convert)
      maybe_set_param_value (PARAM_MAX_STORES_TO_SINK, 0,
                             opts->x_param_values, opts_set->x_param_values);
  
--- 822,832 ----
  	}
      }
  
!   /* Set PARAM_MAX_STORES_TO_SINK to 0 if vectorization is not enabled
!      or if-conversion is explicitly disabled.  */
!   if (!opts->x_flag_tree_vectorize
!       || (opts_set->x_flag_tree_loop_if_convert
! 	  && !opts->x_flag_tree_loop_if_convert))
      maybe_set_param_value (PARAM_MAX_STORES_TO_SINK, 0,
                             opts->x_param_values, opts_set->x_param_values);
  
*************** common_handle_option (struct gcc_options
*** 1597,1604 ****
  	opts->x_flag_gcse_after_reload = value;
        if (!opts_set->x_flag_tree_vectorize)
  	opts->x_flag_tree_vectorize = value;
-       if (!opts_set->x_flag_vect_cost_model)
- 	opts->x_flag_vect_cost_model = value;
        if (!opts_set->x_flag_tree_loop_distribute_patterns)
  	opts->x_flag_tree_loop_distribute_patterns = value;
        break;
--- 1598,1603 ----
Index: trunk/gcc/common/config/i386/i386-common.c
===================================================================
*** trunk.orig/gcc/common/config/i386/i386-common.c	2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/common/config/i386/i386-common.c	2013-05-28 14:14:39.309369766 +0200
*************** ix86_option_init_struct (struct gcc_opti
*** 729,735 ****
  
    opts->x_flag_pcc_struct_return = 2;
    opts->x_flag_asynchronous_unwind_tables = 2;
-   opts->x_flag_vect_cost_model = 1;
  }
  
  /* On the x86 -fsplit-stack and -fstack-protector both use the same
--- 729,734 ----
Index: trunk/gcc/flag-types.h
===================================================================
*** trunk.orig/gcc/flag-types.h	2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/flag-types.h	2013-05-28 14:14:39.309369766 +0200
*************** enum fp_contract_mode {
*** 191,194 ****
--- 191,202 ----
    FP_CONTRACT_FAST = 2
  };
  
+ /* Vectorizer cost-model.  */
+ enum vect_cost_model {
+   VECT_COST_MODEL_UNLIMITED = 0,
+   VECT_COST_MODEL_CHEAP = 1,
+   VECT_COST_MODEL_DYNAMIC = 2,
+   VECT_COST_MODEL_DEFAULT = 3
+ };
+ 
  #endif /* ! GCC_FLAG_TYPES_H */
Index: trunk/gcc/targhooks.c
===================================================================
*** trunk.orig/gcc/targhooks.c	2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/targhooks.c	2013-05-28 14:14:39.321369902 +0200
*************** default_add_stmt_cost (void *data, int c
*** 1050,1070 ****
  {
    unsigned *cost = (unsigned *) data;
    unsigned retval = 0;
  
!   if (flag_vect_cost_model)
!     {
!       tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
!       int stmt_cost = default_builtin_vectorization_cost (kind, vectype,
! 							  misalign);
!       /* Statements in an inner loop relative to the loop being
! 	 vectorized are weighted more heavily.  The value here is
! 	 arbitrary and could potentially be improved with analysis.  */
!       if (where == vect_body && stmt_info && stmt_in_inner_loop_p (stmt_info))
! 	count *= 50;  /* FIXME.  */
! 
!       retval = (unsigned) (count * stmt_cost);
!       cost[where] += retval;
!     }
  
    return retval;
  }
--- 1050,1066 ----
  {
    unsigned *cost = (unsigned *) data;
    unsigned retval = 0;
+   tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
+   int stmt_cost = default_builtin_vectorization_cost (kind, vectype,
+ 						      misalign);
+   /* Statements in an inner loop relative to the loop being
+      vectorized are weighted more heavily.  The value here is
+      arbitrary and could potentially be improved with analysis.  */
+   if (where == vect_body && stmt_info && stmt_in_inner_loop_p (stmt_info))
+     count *= 50;  /* FIXME.  */
  
!   retval = (unsigned) (count * stmt_cost);
!   cost[where] += retval;
  
    return retval;
  }
Index: trunk/gcc/tree-vect-data-refs.c
===================================================================
*** trunk.orig/gcc/tree-vect-data-refs.c	2013-05-28 13:40:29.000000000 +0200
--- trunk/gcc/tree-vect-data-refs.c	2013-05-28 14:14:39.336370071 +0200
*************** vect_mark_for_runtime_alias_test (ddr_p
*** 173,179 ****
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
  
!   if ((unsigned) PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS) == 0)
      return false;
  
    if (dump_enabled_p ())
--- 173,180 ----
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
  
!   if (loop_vinfo->cost_model == VECT_COST_MODEL_CHEAP
!       || (unsigned) PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS) == 0)
      return false;
  
    if (dump_enabled_p ())
*************** vect_peeling_hash_insert (loop_vec_info
*** 1087,1093 ****
        *new_slot = slot;
      }
  
!   if (!supportable_dr_alignment && !flag_vect_cost_model)
      slot->count += VECT_MAX_COST;
  }
  
--- 1088,1095 ----
        *new_slot = slot;
      }
  
!   if (!supportable_dr_alignment
!       && loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
      slot->count += VECT_MAX_COST;
  }
  
*************** vect_peeling_hash_choose_best_peeling (l
*** 1197,1203 ****
     res.peel_info.dr = NULL;
     res.body_cost_vec = stmt_vector_for_cost();
  
!    if (flag_vect_cost_model)
       {
         res.inside_cost = INT_MAX;
         res.outside_cost = INT_MAX;
--- 1199,1205 ----
     res.peel_info.dr = NULL;
     res.body_cost_vec = stmt_vector_for_cost();
  
!    if (loop_vinfo->cost_model != VECT_COST_MODEL_UNLIMITED)
       {
         res.inside_cost = INT_MAX;
         res.outside_cost = INT_MAX;
*************** vect_enhance_data_refs_alignment (loop_v
*** 1426,1432 ****
                   vectorization factor.
                   We do this automtically for cost model, since we calculate cost
                   for every peeling option.  */
!               if (!flag_vect_cost_model)
                  possible_npeel_number = vf /nelements;
  
                /* Handle the aligned case. We may decide to align some other
--- 1428,1434 ----
                   vectorization factor.
                   We do this automtically for cost model, since we calculate cost
                   for every peeling option.  */
!               if (loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
                  possible_npeel_number = vf /nelements;
  
                /* Handle the aligned case. We may decide to align some other
*************** vect_enhance_data_refs_alignment (loop_v
*** 1434,1440 ****
                if (DR_MISALIGNMENT (dr) == 0)
                  {
                    npeel_tmp = 0;
!                   if (!flag_vect_cost_model)
                      possible_npeel_number++;
                  }
  
--- 1436,1442 ----
                if (DR_MISALIGNMENT (dr) == 0)
                  {
                    npeel_tmp = 0;
!                   if (loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
                      possible_npeel_number++;
                  }
  
*************** vect_enhance_data_refs_alignment (loop_v
*** 1743,1749 ****
    /* (2) Versioning to force alignment.  */
  
    /* Try versioning if:
!      1) flag_tree_vect_loop_version is TRUE
       2) optimize loop for speed
       3) there is at least one unsupported misaligned data ref with an unknown
          misalignment, and
--- 1745,1751 ----
    /* (2) Versioning to force alignment.  */
  
    /* Try versioning if:
!      1) cost model is not VECT_COST_MODEL_CHEAP
       2) optimize loop for speed
       3) there is at least one unsupported misaligned data ref with an unknown
          misalignment, and
*************** vect_enhance_data_refs_alignment (loop_v
*** 1751,1757 ****
       5) the number of runtime alignment checks is within reason.  */
  
    do_versioning =
! 	flag_tree_vect_loop_version
  	&& optimize_loop_nest_for_speed_p (loop)
  	&& (!loop->inner); /* FORNOW */
  
--- 1753,1759 ----
       5) the number of runtime alignment checks is within reason.  */
  
    do_versioning =
! 	loop_vinfo->cost_model != VECT_COST_MODEL_CHEAP
  	&& optimize_loop_nest_for_speed_p (loop)
  	&& (!loop->inner); /* FORNOW */
  
Index: trunk/gcc/tree-vect-loop.c
===================================================================
*** trunk.orig/gcc/tree-vect-loop.c	2013-05-28 13:47:04.000000000 +0200
--- trunk/gcc/tree-vect-loop.c	2013-05-28 14:14:39.337370082 +0200
*************** vect_analyze_loop (struct loop *loop)
*** 1763,1768 ****
--- 1763,1770 ----
  	  return NULL;
  	}
  
+       loop_vinfo->cost_model = vectorizer_cost_model ();
+ 
        if (vect_analyze_loop_2 (loop_vinfo))
  	{
  	  LOOP_VINFO_VECTORIZABLE_P (loop_vinfo) = 1;
*************** vect_estimate_min_profitable_iters (loop
*** 2636,2642 ****
    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
  
    /* Cost model disabled.  */
!   if (!flag_vect_cost_model)
      {
        dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.");
        *ret_min_profitable_niters = 0;
--- 2638,2644 ----
    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
  
    /* Cost model disabled.  */
!   if (loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
      {
        dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.");
        *ret_min_profitable_niters = 0;
Index: trunk/gcc/tree-vect-slp.c
===================================================================
*** trunk.orig/gcc/tree-vect-slp.c	2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/tree-vect-slp.c	2013-05-28 14:14:39.338370093 +0200
*************** vect_slp_analyze_bb_1 (basic_block bb)
*** 1992,1997 ****
--- 1992,2001 ----
    if (!bb_vinfo)
      return NULL;
  
+   /* For BB vectorization it only matters whether the cost model is
+      enabled or disabled.  */
+   bb_vinfo->cost_model = vectorizer_cost_model ();
+ 
    if (!vect_analyze_data_refs (NULL, bb_vinfo, &min_vf))
      {
        if (dump_enabled_p ())
*************** vect_slp_analyze_bb_1 (basic_block bb)
*** 2093,2099 ****
      }
  
    /* Cost model: check if the vectorization is worthwhile.  */
!   if (flag_vect_cost_model
        && !vect_bb_vectorization_profitable_p (bb_vinfo))
      {
        if (dump_enabled_p ())
--- 2097,2103 ----
      }
  
    /* Cost model: check if the vectorization is worthwhile.  */
!   if (bb_vinfo->cost_model != VECT_COST_MODEL_UNLIMITED
        && !vect_bb_vectorization_profitable_p (bb_vinfo))
      {
        if (dump_enabled_p ())
Index: trunk/gcc/tree-vectorizer.c
===================================================================
*** trunk.orig/gcc/tree-vectorizer.c	2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/tree-vectorizer.c	2013-05-28 14:21:57.417255737 +0200
*************** LOC vect_location;
*** 73,78 ****
--- 73,88 ----
  /* Vector mapping GIMPLE stmt to stmt_vec_info. */
  vec<vec_void_p> stmt_vec_info_vec;
  
+ /* Return the active vectorizer cost model.  */
+ 
+ enum vect_cost_model
+ vectorizer_cost_model (void)
+ {
+   if (flag_vect_cost_model != VECT_COST_MODEL_DEFAULT)
+     return flag_vect_cost_model;
+   /* The default cost model is the dynamic one.  */
+   return VECT_COST_MODEL_DYNAMIC;
+ }
  
  /* Function vectorize_loops.
  
*************** execute_vect_slp (void)
*** 191,200 ****
  static bool
  gate_vect_slp (void)
  {
!   /* Apply SLP either if the vectorizer is on and the user didn't specify
!      whether to run SLP or not, or if the SLP flag was set by the user.  */
!   return ((flag_tree_vectorize != 0 && flag_tree_slp_vectorize != 0)
!           || flag_tree_slp_vectorize == 1);
  }
  
  struct gimple_opt_pass pass_slp_vectorize =
--- 201,211 ----
  static bool
  gate_vect_slp (void)
  {
!   /* Apply SLP either according to whether the user specified whether to
!      run SLP or not, or according to whether vectorization is enabled.  */
!   if (global_options_set.x_flag_tree_slp_vectorize)
!     return flag_tree_slp_vectorize != 0;
!   return flag_tree_vectorize != 0;
  }
  
  struct gimple_opt_pass pass_slp_vectorize =
Index: trunk/gcc/tree-vectorizer.h
===================================================================
*** trunk.orig/gcc/tree-vectorizer.h	2013-05-17 10:56:07.000000000 +0200
--- trunk/gcc/tree-vectorizer.h	2013-05-28 14:14:39.338370093 +0200
*************** typedef struct _loop_vec_info {
*** 314,319 ****
--- 314,322 ----
       fix it up.  */
    bool operands_swapped;
  
+   /* The cost model to be used for this loop.  */
+   enum vect_cost_model cost_model;
+ 
  } *loop_vec_info;
  
  /* Access Functions.  */
*************** typedef struct _bb_vec_info {
*** 391,396 ****
--- 394,402 ----
    /* Cost data used by the target cost model.  */
    void *target_cost_data;
  
+   /* The cost model to be used for this BB.  */
+   enum vect_cost_model cost_model;
+ 
  } *bb_vec_info;
  
  #define BB_VINFO_BB(B)               (B)->bb
*************** void vect_pattern_recog (loop_vec_info,
*** 1010,1014 ****
--- 1016,1021 ----
  
  /* In tree-vectorizer.c.  */
  unsigned vectorize_loops (void);
+ enum vect_cost_model vectorizer_cost_model (void);
  
  #endif  /* GCC_TREE_VECTORIZER_H  */
Index: trunk/gcc/doc/invoke.texi
===================================================================
*** trunk.orig/gcc/doc/invoke.texi	2013-05-28 13:00:55.000000000 +0200
--- trunk/gcc/doc/invoke.texi	2013-05-28 14:23:21.066189174 +0200
*************** Objective-C and Objective-C++ Dialects}.
*** 419,428 ****
  -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta @gol
  -ftree-reassoc -ftree-sink -ftree-slsr -ftree-sra @gol
  -ftree-switch-conversion -ftree-tail-merge @gol
! -ftree-ter -ftree-vect-loop-version -ftree-vectorize -ftree-vrp @gol
  -funit-at-a-time -funroll-all-loops -funroll-loops @gol
  -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
! -fvariable-expansion-in-unroller -fvect-cost-model -fvpt -fweb @gol
  -fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
  --param @var{name}=@var{value}
  -O  -O0  -O1  -O2  -O3  -Os -Ofast -Og}
--- 419,428 ----
  -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta @gol
  -ftree-reassoc -ftree-sink -ftree-slsr -ftree-sra @gol
  -ftree-switch-conversion -ftree-tail-merge @gol
! -ftree-ter -ftree-vectorize -ftree-vrp @gol
  -funit-at-a-time -funroll-all-loops -funroll-loops @gol
  -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
! -fvariable-expansion-in-unroller -fvect-cost-model=@var{model} -fvpt -fweb @gol
  -fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
  --param @var{name}=@var{value}
  -O  -O0  -O1  -O2  -O3  -Os -Ofast -Og}
*************** Optimize yet more.  @option{-O3} turns o
*** 6652,6658 ****
  by @option{-O2} and also turns on the @option{-finline-functions},
  @option{-funswitch-loops}, @option{-fpredictive-commoning},
  @option{-fgcse-after-reload}, @option{-ftree-vectorize},
- @option{-fvect-cost-model},
  @option{-ftree-partial-pre} and @option{-fipa-cp-clone} options.
  
  @item -O0
--- 6652,6657 ----
*************** optimizations designed to reduce code si
*** 6669,6675 ****
  @option{-Os} disables the following optimization flags:
  @gccoptlist{-falign-functions  -falign-jumps  -falign-loops @gol
  -falign-labels  -freorder-blocks  -freorder-blocks-and-partition @gol
! -fprefetch-loop-arrays  -ftree-vect-loop-version}
  
  @item -Ofast
  @opindex Ofast
--- 6668,6674 ----
  @option{-Os} disables the following optimization flags:
  @gccoptlist{-falign-functions  -falign-jumps  -falign-loops @gol
  -falign-labels  -freorder-blocks  -freorder-blocks-and-partition @gol
! -fprefetch-loop-arrays}
  
  @item -Ofast
  @opindex Ofast
*************** Perform loop vectorization on trees. Thi
*** 7910,7928 ****
  Perform basic block vectorization on trees. This flag is enabled by default at
  @option{-O3} and when @option{-ftree-vectorize} is enabled.
  
! @item -ftree-vect-loop-version
! @opindex ftree-vect-loop-version
! Perform loop versioning when doing loop vectorization on trees.  When a loop
! appears to be vectorizable except that data alignment or data dependence cannot
! be determined at compile time, then vectorized and non-vectorized versions of
! the loop are generated along with run-time checks for alignment or dependence
! to control which version is executed.  This option is enabled by default
! except at level @option{-Os} where it is disabled.
! 
! @item -fvect-cost-model
  @opindex fvect-cost-model
! Enable cost model for vectorization.  This option is enabled by default at
! @option{-O3}.
  
  @item -ftree-vrp
  @opindex ftree-vrp
--- 7909,7929 ----
  Perform basic block vectorization on trees. This flag is enabled by default at
  @option{-O3} and when @option{-ftree-vectorize} is enabled.
  
! @item -fvect-cost-model=@var{model}
  @opindex fvect-cost-model
! Alter the cost model used for vectorization.  The @var{model} argument
! should be one of @code{unlimited}, @code{dynamic} or @code{cheap}.
! With the @code{unlimited} model the vectorized code-path is assumed
! to be profitable while with the @code{dynamic} model a runtime check
! will guard the vectorized code-path to enable it only for iteration
! counts that will likely execute faster than when executing the original
! scalar loop.  The @code{cheap} model will disable vectorization of
! loops where doing so would be cost prohibitive for example due to
! required runtime checks for data dependence or alignment but otherwise
! is equal to the @code{dynamic} model.  The @code{cheap} model also
! disables enablement transforms that also may apply to loops that may
! not end up being vectorized.
! The default cost model is the @code{dynamic} one.
  
  @item -ftree-vrp
  @opindex ftree-vrp
*************** constraints.  The default value is 0.
*** 9328,9340 ****
  
  @item vect-max-version-for-alignment-checks
  The maximum number of run-time checks that can be performed when
! doing loop versioning for alignment in the vectorizer.  See option
! @option{-ftree-vect-loop-version} for more information.
  
  @item vect-max-version-for-alias-checks
  The maximum number of run-time checks that can be performed when
! doing loop versioning for alias in the vectorizer.  See option
! @option{-ftree-vect-loop-version} for more information.
  
  @item max-iterations-to-track
  The maximum number of iterations of a loop the brute-force algorithm
--- 9329,9339 ----
  
  @item vect-max-version-for-alignment-checks
  The maximum number of run-time checks that can be performed when
! doing loop versioning for alignment in the vectorizer.
  
  @item vect-max-version-for-alias-checks
  The maximum number of run-time checks that can be performed when
! doing loop versioning for alias in the vectorizer.
  
  @item max-iterations-to-track
  The maximum number of iterations of a loop the brute-force algorithm
Index: trunk/gcc/Makefile.in
===================================================================
*** trunk.orig/gcc/Makefile.in	2013-05-22 12:29:31.000000000 +0200
--- trunk/gcc/Makefile.in	2013-05-28 14:23:57.959600836 +0200
*************** tree-ssa-pre.o : tree-ssa-pre.c $(TREE_F
*** 2388,2394 ****
     $(CFGLOOP_H) alloc-pool.h $(BASIC_BLOCK_H) $(BITMAP_H) $(HASH_TABLE_H) \
     $(GIMPLE_H) $(TREE_INLINE_H) tree-iterator.h tree-ssa-sccvn.h $(PARAMS_H) \
     $(DBGCNT_H) tree-scalar-evolution.h $(GIMPLE_PRETTY_PRINT_H) domwalk.h \
!    $(IPA_PROP_H)
  tree-ssa-sccvn.o : tree-ssa-sccvn.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(TREE_H) $(DIAGNOSTIC_H) \
     $(TM_H) coretypes.h $(DUMPFILE_H) $(FLAGS_H) $(CFGLOOP_H) \
--- 2388,2394 ----
     $(CFGLOOP_H) alloc-pool.h $(BASIC_BLOCK_H) $(BITMAP_H) $(HASH_TABLE_H) \
     $(GIMPLE_H) $(TREE_INLINE_H) tree-iterator.h tree-ssa-sccvn.h $(PARAMS_H) \
     $(DBGCNT_H) tree-scalar-evolution.h $(GIMPLE_PRETTY_PRINT_H) domwalk.h \
!    $(IPA_PROP_H) $(TREE_VECTORIZER_H)
  tree-ssa-sccvn.o : tree-ssa-sccvn.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(TREE_H) $(DIAGNOSTIC_H) \
     $(TM_H) coretypes.h $(DUMPFILE_H) $(FLAGS_H) $(CFGLOOP_H) \
*************** tree-nested.o: tree-nested.c $(CONFIG_H)
*** 2435,2441 ****
  tree-if-conv.o: tree-if-conv.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
     $(TREE_H) $(FLAGS_H) $(BASIC_BLOCK_H) $(TREE_FLOW_H) \
     $(CFGLOOP_H) $(TREE_DATA_REF_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
!    $(DBGCNT_H) $(GIMPLE_PRETTY_PRINT_H)
  tree-iterator.o : tree-iterator.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) \
     coretypes.h $(GGC_H) tree-iterator.h $(GIMPLE_H) gt-tree-iterator.h
  tree-dfa.o : tree-dfa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
--- 2435,2441 ----
  tree-if-conv.o: tree-if-conv.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
     $(TREE_H) $(FLAGS_H) $(BASIC_BLOCK_H) $(TREE_FLOW_H) \
     $(CFGLOOP_H) $(TREE_DATA_REF_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
!    $(DBGCNT_H) $(GIMPLE_PRETTY_PRINT_H) $(TREE_VECTORIZER_H)
  tree-iterator.o : tree-iterator.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) \
     coretypes.h $(GGC_H) tree-iterator.h $(GIMPLE_H) gt-tree-iterator.h
  tree-dfa.o : tree-dfa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
Index: trunk/gcc/tree-if-conv.c
===================================================================
*** trunk.orig/gcc/tree-if-conv.c	2013-05-27 13:23:43.000000000 +0200
--- trunk/gcc/tree-if-conv.c	2013-05-28 14:18:43.964098216 +0200
*************** along with GCC; see the file COPYING3.
*** 95,100 ****
--- 95,101 ----
  #include "tree-scalar-evolution.h"
  #include "tree-pass.h"
  #include "dbgcnt.h"
+ #include "tree-vectorizer.h"
  
  /* List of basic blocks in if-conversion-suitable order.  */
  static basic_block *ifc_bbs;
*************** main_tree_if_conversion (void)
*** 1848,1856 ****
  static bool
  gate_tree_if_conversion (void)
  {
!   return ((flag_tree_vectorize && flag_tree_loop_if_convert != 0)
! 	  || flag_tree_loop_if_convert == 1
! 	  || flag_tree_loop_if_convert_stores == 1);
  }
  
  struct gimple_opt_pass pass_if_conversion =
--- 1849,1861 ----
  static bool
  gate_tree_if_conversion (void)
  {
!   /* If the option was explicitly specified enable the pass according
!      to that.  */
!   if (global_options_set.x_flag_tree_loop_if_convert
!       || global_options_set.x_flag_tree_loop_if_convert_stores)
!     return flag_tree_loop_if_convert || flag_tree_loop_if_convert_stores;
!   return (flag_tree_vectorize != 0
! 	  && vectorizer_cost_model () != VECT_COST_MODEL_CHEAP);
  }
  
  struct gimple_opt_pass pass_if_conversion =
Index: trunk/gcc/tree-ssa-pre.c
===================================================================
*** trunk.orig/gcc/tree-ssa-pre.c	2013-04-22 13:30:25.000000000 +0200
--- trunk/gcc/tree-ssa-pre.c	2013-05-28 14:23:34.431338296 +0200
*************** along with GCC; see the file COPYING3.
*** 44,49 ****
--- 44,50 ----
  #include "dbgcnt.h"
  #include "domwalk.h"
  #include "ipa-prop.h"
+ #include "tree-vectorizer.h"
  
  /* TODO:
  
*************** inhibit_phi_insertion (basic_block bb, p
*** 3026,3032 ****
    unsigned i;
  
    /* If we aren't going to vectorize we don't inhibit anything.  */
!   if (!flag_tree_vectorize)
      return false;
  
    /* Otherwise we inhibit the insertion when the address of the
--- 3027,3034 ----
    unsigned i;
  
    /* If we aren't going to vectorize we don't inhibit anything.  */
!   if (!flag_tree_vectorize
!       || vectorizer_cost_model () == VECT_COST_MODEL_CHEAP)
      return false;
  
    /* Otherwise we inhibit the insertion when the address of the


