This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH][RFC] Add versioning for constant strides for vectorization


This patch adds the capability to the vectorizer to perform versioning
for the case of a constant (suitable) stride.  For example for

subroutine to_product_of(self,a,b,a1,a2)
  complex(kind=8) :: self (:)
  complex(kind=8), intent(in) :: a(:,:)
  complex(kind=8), intent(in) :: b(:)
  integer a1,a2
  do i = 1,a1
    do j = 1,a2
      self(i) = self(i) + a(j,i)*b(j)
    end do
  end do
end subroutine

we can only apply vectorization if the strides of the fastest dimension
of self, a and b are one (they are loaded from the passed array
descriptors and thus appear as (loop invariant) variables).
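As a rough illustration of what versioning for a stride amounts to, here is a hand-written C sketch. It is not taken from the patch: the function name strided_sum_product, the use of plain double instead of complex(kind=8), and the element-based stride argument are all assumptions made for the example. The run-time check selects a copy of the loop in which the step is a known constant, so the accesses are contiguous.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: a reduction over two
   strided arrays, versioned by hand on the run-time stride.  STRIDE is
   counted in elements, N is the number of iterations.  */
static double
strided_sum_product (const double *a, const double *b,
                     ptrdiff_t stride, size_t n)
{
  double sum = 0.0;
  size_t j;

  assert (n == 0 || (a != NULL && b != NULL));

  if (stride == 1)
    {
      /* Versioned copy: the step is the constant 1 here, so the
         accesses are contiguous and the loop is a vectorization
         candidate.  */
      for (j = 0; j < n; j++)
        sum += a[j] * b[j];
    }
  else
    {
      /* Fallback copy: the step is only known at run time, so the
         loop stays scalar.  */
      for (j = 0; j < n; j++)
        sum += a[(ptrdiff_t) j * stride] * b[(ptrdiff_t) j * stride];
    }

  return sum;
}
```

Both copies compute the same value; only the first one exposes a constant step to the compiler.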

While implementing this I noticed that peeling for the number of
iterations (we have to unroll the above loop twice, and so for an odd
number of iterations need an epilogue loop for the remaining
iteration(s)) does not play well with versioning: we end up
vectorizing the wrong loop.  So I just disabled versioning if we
apply peeling with an epilogue loop and instead attach the versioning
condition to the pre-condition of the main loop, which skips directly
to the epilogue if the number of iterations is too small.  We can then
use the epilogue loop as the non-vectorized version.
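The resulting control flow can be sketched in C as follows. This is only an illustration under assumptions, not the code the vectorizer emits: the name sum_strided, the fixed factor of 2, and the two partial sums standing in for vector lanes are all invented for the example. The guard in front of the main loop combines the iteration-count test with the negated stride check, and the epilogue doubles as the scalar version.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration, not part of the patch: sum N elements of X
   taken STRIDE elements apart, with the main loop unrolled twice.  */
static double
sum_strided (const double *x, ptrdiff_t stride, size_t n)
{
  const size_t vf = 2;          /* assumed unroll/vectorization factor */
  double s0 = 0.0, s1 = 0.0;    /* two partial sums, one per "lane" */
  size_t i = 0;

  assert (n == 0 || x != NULL);

  /* Pre-condition of the main loop: skip it if there are too few
     iterations OR the stride check fails.  */
  if (!(n < vf || stride != 1))
    {
      size_t main_iters = n - n % vf;

      /* Main loop: only reached with stride == 1.  */
      for (; i < main_iters; i += vf)
        {
          s0 += x[i];
          s1 += x[i + 1];
        }
    }

  /* Epilogue: handles the leftover iteration(s) after the main loop,
     or the whole iteration space if the guard skipped the main loop,
     so it keeps the generic stride.  */
  for (; i < n; i++)
    s0 += x[(ptrdiff_t) i * stride];

  return s0 + s1;
}
```

Splitting the sum into two accumulators reassociates the floating-point additions, which is why the testcases for this kind of reduction live under the fast-math names.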

This patch also inserts an extra copyprop and dce pass before the
vectorizer so it can recognize the reduction in the above testcase
(LIM has made that reduction non-obvious).  While doing so I noticed
that copyprop does not preserve loop-closed SSA form and fixed that
as well.

An earlier version bootstrapped and tested OK on
x86_64-unknown-linux-gnu; a final attempt is still running.

I have not yet performance-tested this extensively; it might need
cost-model adjustments and/or need to wait until we have profile
feedback to properly seed vectorizer analysis here.  A micro-benchmark
based on the above loop shows around 15% improvement on AMD K10.

Feedback (and ppc testing) is still welcome of course.

Thanks,
Richard.

2009-01-23  Richard Guenther  <rguenther@suse.de>

	* passes.c (init_optimization_passes): Add copy-prop and dce
	before vectorization.
	* Makefile.in (tree-ssa-copy.o): Add $(CFGLOOP_H) dependency.
	* tree-ssa-copy.c (init_copy_prop): Do not propagate through
	single-argument PHIs if we are in loop-closed SSA form.
	* tree-data-ref.c (dr_analyze_innermost): Allow affine offsets.
	* tree-vect-analyze.c (vect_check_interleaving): Check that
	DR_STEP is constant.
	(vect_enhance_data_refs_alignment): If versioning for strides
	is required do not peel.
	(vect_analyze_data_ref_access): Allow non-constant step of
	a specific form, remember them for versioning.
	* params.def (vect-max-version-for-stride-checks): New param.
	(vect-version-for-stride-value): Likewise.
	* tree-vectorizer.c (slpeel_add_loop_guard): Pass extra guards
	for the pre-condition.
	(slpeel_tree_peel_loop_to_edge): Likewise.
	(new_loop_vec_info): Allocate stride versioning data.
	(destroy_loop_vec_info): Free stride versioning data.
	* tree-vectorizer.h (struct _loop_vec_info): Add variable_strides
	field.
	(LOOP_VINFO_VARIABLE_STRIDES): Define.
	(slpeel_tree_peel_loop_to_edge): Adjust declaration.
	* tree-vect-transform.c (vect_build_loop_niters): Take an
	optional sequence to append stmts.
	(vect_generate_tmps_on_preheader): Likewise.
	(vect_do_peeling_for_loop_bound): Take extra guards for the
	pre-condition.
	(vect_do_peeling_for_alignment): Adjust.
	(vect_create_cond_for_stride_checks): New function.
	(vect_loop_versioning): Take stmt and stmt list to put pre-condition
	guards if we are going to peel.  Do not apply versioning in that
	case.
	(vect_transform_loop): If we are peeling for loop bound only
	record extra pre-conditions, do not apply loop versioning.

	* gcc.dg/vect/fast-math-vect-complex-5.c: New testcase.
	* gfortran.dg/vect/fast-math-vect-complex-1.f90: Likewise.
	* gfortran.dg/vect/fast-math-vect-stride-1.f90: Likewise.

Index: trunk/gcc/passes.c
===================================================================
*** trunk.orig/gcc/passes.c	2009-01-23 16:47:53.000000000 +0100
--- trunk/gcc/passes.c	2009-01-23 16:48:50.000000000 +0100
*************** init_optimization_passes (void)
*** 659,664 ****
--- 659,666 ----
  	  NEXT_PASS (pass_graphite_transforms);
  	  NEXT_PASS (pass_iv_canon);
  	  NEXT_PASS (pass_if_conversion);
+ 	  NEXT_PASS (pass_copy_prop);
+ 	  NEXT_PASS (pass_dce_loop);
  	  NEXT_PASS (pass_vectorize);
  	    {
  	      struct opt_pass **p = &pass_vectorize.pass.sub;
Index: trunk/gcc/testsuite/gcc.dg/vect/fast-math-vect-complex-5.c
===================================================================
*** /dev/null	1970-01-01 00:00:00.000000000 +0000
--- trunk/gcc/testsuite/gcc.dg/vect/fast-math-vect-complex-5.c	2009-01-23 16:48:50.000000000 +0100
***************
*** 0 ****
--- 1,18 ----
+ /* { dg-do compile } */
+ /* { dg-require-effective-target vect_double } */
+ 
+ #define NUM 64
+ _Complex double ad[NUM], bd[NUM], cd[NUM];
+ 
+ void testd(void)
+ {
+   int i;
+   int j;
+ 
+   for (i = 0; i < NUM; i++)
+     for (j = 0; j < NUM; j++)
+       cd[i] = cd[i] + ad[j] * bd[j];
+ }
+ 
+ /* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: trunk/gcc/testsuite/gfortran.dg/vect/fast-math-vect-complex-1.f90
===================================================================
*** /dev/null	1970-01-01 00:00:00.000000000 +0000
--- trunk/gcc/testsuite/gfortran.dg/vect/fast-math-vect-complex-1.f90	2009-01-23 16:48:50.000000000 +0100
***************
*** 0 ****
--- 1,16 ----
+ ! { dg-do compile }
+ 
+ subroutine to_product_of(self,a,b,a1,a2)
+   complex(kind=8) :: self (:)
+   complex(kind=8), intent(in) :: a(:,:)
+   complex(kind=8), intent(in) :: b(:)
+   integer a1,a2
+   do i = 1,a1
+     do j = 1,a2
+       self(i) = self(i) + a(i,j)*b(j)
+     end do
+   end do
+ end subroutine
+ 
+ ! { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } }
+ ! { dg-final { cleanup-tree-dump "vect" } }
Index: trunk/gcc/tree-data-ref.c
===================================================================
*** trunk.orig/gcc/tree-data-ref.c	2009-01-23 16:47:53.000000000 +0100
--- trunk/gcc/tree-data-ref.c	2009-01-23 16:48:50.000000000 +0100
*************** dr_analyze_innermost (struct data_refere
*** 708,714 ****
        offset_iv.base = ssize_int (0);
        offset_iv.step = ssize_int (0);
      }
!   else if (!simple_iv (loop, stmt, poffset, &offset_iv, false))
      {
        if (dump_file && (dump_flags & TDF_DETAILS))
  	fprintf (dump_file, "failed: evolution of offset is not affine.\n");
--- 708,714 ----
        offset_iv.base = ssize_int (0);
        offset_iv.step = ssize_int (0);
      }
!   else if (!simple_iv (loop, stmt, poffset, &offset_iv, true))
      {
        if (dump_file && (dump_flags & TDF_DETAILS))
  	fprintf (dump_file, "failed: evolution of offset is not affine.\n");
Index: trunk/gcc/tree-vect-analyze.c
===================================================================
*** trunk.orig/gcc/tree-vect-analyze.c	2009-01-23 16:47:53.000000000 +0100
--- trunk/gcc/tree-vect-analyze.c	2009-01-23 16:48:50.000000000 +0100
*************** vect_check_interleaving (struct data_ref
*** 1109,1114 ****
--- 1109,1116 ----
    type_size_b = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (drb))));
  
    if (type_size_a != type_size_b
+       || TREE_CODE (DR_STEP (dra)) != INTEGER_CST
+       || TREE_CODE (DR_STEP (drb)) != INTEGER_CST
        || tree_int_cst_compare (DR_STEP (dra), DR_STEP (drb))
        || !types_compatible_p (TREE_TYPE (DR_REF (dra)), 
                                TREE_TYPE (DR_REF (drb))))
*************** vect_enhance_data_refs_alignment (loop_v
*** 1825,1830 ****
--- 1827,1833 ----
    gimple stmt;
    stmt_vec_info stmt_info;
    int vect_versioning_for_alias_required;
+   int vect_versioning_for_strides_required;
  
    if (vect_print_dump_info (REPORT_DETAILS))
      fprintf (vect_dump, "=== vect_enhance_data_refs_alignment ===");
*************** vect_enhance_data_refs_alignment (loop_v
*** 1892,1904 ****
  
    vect_versioning_for_alias_required =
      (VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo)) > 0);
  
    /* Temporarily, if versioning for alias is required, we disable peeling
       until we support peeling and versioning.  Often peeling for alignment
       will require peeling for loop-bound, which in turn requires that we
       know how to adjust the loop ivs after the loop.  */
    if (vect_versioning_for_alias_required
!        || !vect_can_advance_ivs_p (loop_vinfo)
        || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
      do_peeling = false;
  
--- 1895,1910 ----
  
    vect_versioning_for_alias_required =
      (VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo)) > 0);
+   vect_versioning_for_strides_required =
+     !bitmap_empty_p (LOOP_VINFO_VARIABLE_STRIDES (loop_vinfo));
  
    /* Temporarily, if versioning for alias is required, we disable peeling
       until we support peeling and versioning.  Often peeling for alignment
       will require peeling for loop-bound, which in turn requires that we
       know how to adjust the loop ivs after the loop.  */
    if (vect_versioning_for_alias_required
!       || vect_versioning_for_strides_required
!       || !vect_can_advance_ivs_p (loop_vinfo)
        || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
      do_peeling = false;
  
*************** vect_analyze_data_ref_access (struct dat
*** 2349,2357 ****
    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
!   HOST_WIDE_INT dr_step = TREE_INT_CST_LOW (step);
  
!   if (!step)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
  	fprintf (vect_dump, "bad data-ref access");
--- 2355,2364 ----
    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
!   HOST_WIDE_INT dr_step;
  
!   if (!step
!       || TREE_CODE (step) != INTEGER_CST)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
  	fprintf (vect_dump, "bad data-ref access");
*************** vect_analyze_data_ref_access (struct dat
*** 2359,2364 ****
--- 2366,2372 ----
      }
  
    /* Don't allow invariant accesses.  */
+   dr_step = TREE_INT_CST_LOW (step);
    if (dr_step == 0)
      return false; 
  
*************** vect_analyze_data_refs (loop_vec_info lo
*** 3563,3568 ****
--- 3571,3620 ----
            return false;
          }
  
+       /* If the non-constant (but loop invariant) step is of the
+ 	 form NAME or NAME * CST where CST is the element size mark
+ 	 this ddr for versioning for strides and re-set DR_STEP
+ 	 to the value we will version for.  Otherwise reject
+ 	 non-constant steps.  */
+       if (TREE_CODE (DR_STEP (dr)) != INTEGER_CST)
+ 	{
+ 	  tree step = DR_STEP (dr);
+ 
+ 	  STRIP_NOPS (step);
+ 	  if (flag_tree_vect_loop_version
+ 	      && (TREE_CODE (step) == SSA_NAME
+ 		  || (TREE_CODE (step) == MULT_EXPR
+ 		      && TREE_CODE (TREE_OPERAND (step, 1)) == INTEGER_CST)))
+ 	    {
+ 	      tree stride;
+ 	      tree newstep;
+ 
+ 	      stride = step;
+ 	      if (TREE_CODE (step) == MULT_EXPR)
+ 		stride = TREE_OPERAND (step, 0);
+ 	      STRIP_NOPS (stride);
+ 	      if (TREE_CODE (stride) != SSA_NAME)
+ 		return false;
+ 
+ 	      bitmap_set_bit (LOOP_VINFO_VARIABLE_STRIDES (loop_vinfo),
+ 			      SSA_NAME_VERSION (stride));
+ 	      if (bitmap_count_bits (LOOP_VINFO_VARIABLE_STRIDES (loop_vinfo))
+ 		  > (unsigned)PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_STRIDE_CHECKS))
+ 		return false;
+ 
+ 	      /* ???  Delay this change until after versioning or
+ 	         preserve the original step somewhere.  */
+ 	      newstep = build_int_cst (TREE_TYPE (step),
+ 		       PARAM_VALUE (PARAM_VECT_VERSION_FOR_STRIDE_VALUE));
+ 	      if (TREE_CODE (step) == MULT_EXPR)
+ 		newstep = int_const_binop (MULT_EXPR, newstep,
+ 					   TREE_OPERAND (step, 1), false);
+ 	      DR_STEP (dr) = newstep;
+ 	    }
+ 	  else
+ 	    return false;
+ 	}
+ 
        if (!DR_SYMBOL_TAG (dr))
          {
            if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS))
Index: trunk/gcc/params.def
===================================================================
*** trunk.orig/gcc/params.def	2009-01-23 16:47:53.000000000 +0100
--- trunk/gcc/params.def	2009-01-23 16:48:50.000000000 +0100
*************** DEFPARAM(PARAM_VECT_MAX_VERSION_FOR_ALIA
*** 506,511 ****
--- 506,521 ----
           "Bound on number of runtime checks inserted by the vectorizer's loop versioning for alias check",
           10, 0, 0)
  
+ DEFPARAM(PARAM_VECT_MAX_VERSION_FOR_STRIDE_CHECKS,
+          "vect-max-version-for-stride-checks",
+          "Bound on number of runtime checks inserted by the vectorizer's loop versioning for stride check",
+          4, 0, 0)
+ 
+ DEFPARAM(PARAM_VECT_VERSION_FOR_STRIDE_VALUE,
+          "vect-version-for-stride-value",
+          "The constant stride in elements the vectorizer uses for loop versioning",
+          1, 0, 0)
+ 
  DEFPARAM(PARAM_MAX_CSELIB_MEMORY_LOCATIONS,
  	 "max-cselib-memory-locations",
  	 "The maximum memory locations recorded by cselib",
Index: trunk/gcc/tree-vectorizer.c
===================================================================
*** trunk.orig/gcc/tree-vectorizer.c	2009-01-23 16:47:53.000000000 +0100
--- trunk/gcc/tree-vectorizer.c	2009-01-23 16:48:50.000000000 +0100
*************** slpeel_tree_duplicate_loop_to_edge_cfg (
*** 927,934 ****
     Returns the skip edge.  */
  
  static edge
! slpeel_add_loop_guard (basic_block guard_bb, tree cond, basic_block exit_bb,
! 		       basic_block dom_bb)
  {
    gimple_stmt_iterator gsi;
    edge new_e, enter_e;
--- 927,935 ----
     Returns the skip edge.  */
  
  static edge
! slpeel_add_loop_guard (basic_block guard_bb, tree cond,
! 		       gimple_seq cond_expr_stmt_list,
! 		       basic_block exit_bb, basic_block dom_bb)
  {
    gimple_stmt_iterator gsi;
    edge new_e, enter_e;
*************** slpeel_add_loop_guard (basic_block guard
*** 941,951 ****
    gsi = gsi_last_bb (guard_bb);
  
    cond = force_gimple_operand (cond, &gimplify_stmt_list, true, NULL_TREE);
    cond_stmt = gimple_build_cond (NE_EXPR,
  				 cond, build_int_cst (TREE_TYPE (cond), 0),
  				 NULL_TREE, NULL_TREE);
!   if (gimplify_stmt_list)
!     gsi_insert_seq_after (&gsi, gimplify_stmt_list, GSI_NEW_STMT);
  
    gsi = gsi_last_bb (guard_bb);
    gsi_insert_after (&gsi, cond_stmt, GSI_NEW_STMT);
--- 942,954 ----
    gsi = gsi_last_bb (guard_bb);
  
    cond = force_gimple_operand (cond, &gimplify_stmt_list, true, NULL_TREE);
+   if (gimplify_stmt_list)
+     gimple_seq_add_seq (&cond_expr_stmt_list, gimplify_stmt_list);
    cond_stmt = gimple_build_cond (NE_EXPR,
  				 cond, build_int_cst (TREE_TYPE (cond), 0),
  				 NULL_TREE, NULL_TREE);
!   if (cond_expr_stmt_list)
!     gsi_insert_seq_after (&gsi, cond_expr_stmt_list, GSI_NEW_STMT);
  
    gsi = gsi_last_bb (guard_bb);
    gsi_insert_after (&gsi, cond_stmt, GSI_NEW_STMT);
*************** struct loop*
*** 1151,1157 ****
  slpeel_tree_peel_loop_to_edge (struct loop *loop, 
  			       edge e, tree first_niters, 
  			       tree niters, bool update_first_loop_count,
! 			       unsigned int th, bool check_profitability)
  {
    struct loop *new_loop = NULL, *first_loop, *second_loop;
    edge skip_e;
--- 1154,1161 ----
  slpeel_tree_peel_loop_to_edge (struct loop *loop, 
  			       edge e, tree first_niters, 
  			       tree niters, bool update_first_loop_count,
! 			       unsigned int th, bool check_profitability,
! 			       tree cond_expr, gimple_seq cond_expr_stmt_list)
  {
    struct loop *new_loop = NULL, *first_loop, *second_loop;
    edge skip_e;
*************** slpeel_tree_peel_loop_to_edge (struct lo
*** 1325,1330 ****
--- 1329,1342 ----
  	  pre_condition = fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
  				       cost_pre_condition, pre_condition);
  	}
+       if (cond_expr)
+ 	{
+ 	  pre_condition =
+ 	    fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
+ 			 pre_condition,
+ 			 fold_build1 (TRUTH_NOT_EXPR, boolean_type_node,
+ 				      cond_expr));
+ 	}
      }
  
    /* Prologue peeling.  */  
*************** slpeel_tree_peel_loop_to_edge (struct lo
*** 1340,1345 ****
--- 1352,1358 ----
      }
  
    skip_e = slpeel_add_loop_guard (bb_before_first_loop, pre_condition,
+ 				  cond_expr_stmt_list,
                                    bb_before_second_loop, bb_before_first_loop);
    slpeel_update_phi_nodes_for_guard1 (skip_e, first_loop,
  				      first_loop == new_loop,
*************** slpeel_tree_peel_loop_to_edge (struct lo
*** 1377,1383 ****
  
    pre_condition = 
  	fold_build2 (EQ_EXPR, boolean_type_node, first_niters, niters);
!   skip_e = slpeel_add_loop_guard (bb_between_loops, pre_condition,
                                    bb_after_second_loop, bb_before_first_loop);
    slpeel_update_phi_nodes_for_guard2 (skip_e, second_loop,
                                       second_loop == new_loop, &new_exit_bb);
--- 1390,1396 ----
  
    pre_condition = 
  	fold_build2 (EQ_EXPR, boolean_type_node, first_niters, niters);
!   skip_e = slpeel_add_loop_guard (bb_between_loops, pre_condition, NULL,
                                    bb_after_second_loop, bb_before_first_loop);
    slpeel_update_phi_nodes_for_guard2 (skip_e, second_loop,
                                       second_loop == new_loop, &new_exit_bb);
*************** new_loop_vec_info (struct loop *loop)
*** 1714,1719 ****
--- 1727,1733 ----
    LOOP_VINFO_MAY_ALIAS_DDRS (res) =
      VEC_alloc (ddr_p, heap,
  	       PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS));
+   LOOP_VINFO_VARIABLE_STRIDES (res) = BITMAP_ALLOC (NULL);
    LOOP_VINFO_STRIDED_STORES (res) = VEC_alloc (gimple, heap, 10);
    LOOP_VINFO_SLP_INSTANCES (res) = VEC_alloc (slp_instance, heap, 10);
    LOOP_VINFO_SLP_UNROLLING_FACTOR (res) = 1;
*************** destroy_loop_vec_info (loop_vec_info loo
*** 1800,1805 ****
--- 1814,1820 ----
    free_dependence_relations (LOOP_VINFO_DDRS (loop_vinfo));
    VEC_free (gimple, heap, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo));
    VEC_free (ddr_p, heap, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo));
+   BITMAP_FREE (LOOP_VINFO_VARIABLE_STRIDES (loop_vinfo));
    slp_instances = LOOP_VINFO_SLP_INSTANCES (loop_vinfo);
    for (j = 0; VEC_iterate (slp_instance, slp_instances, j, instance); j++)
      vect_free_slp_instance (instance);
Index: trunk/gcc/tree-vectorizer.h
===================================================================
*** trunk.orig/gcc/tree-vectorizer.h	2009-01-23 16:47:53.000000000 +0100
--- trunk/gcc/tree-vectorizer.h	2009-01-23 16:48:50.000000000 +0100
*************** typedef struct _loop_vec_info {
*** 210,215 ****
--- 210,219 ----
    /* All data dependences in the loop.  */
    VEC (ddr_p, heap) *ddrs;
  
+   /* SSA_NAMEs representing variable strides in data references.
+      Candidates for a run-time stride check.  */
+   bitmap variable_strides;
+ 
    /* Data Dependence Relations defining address ranges that are candidates
       for a run-time aliasing check.  */
    VEC (ddr_p, heap) *may_alias_ddrs;
*************** typedef struct _loop_vec_info {
*** 254,259 ****
--- 258,264 ----
  #define LOOP_VINFO_LOC(L)             (L)->loop_line_number
  #define LOOP_VINFO_MAY_ALIAS_DDRS(L)  (L)->may_alias_ddrs
  #define LOOP_VINFO_STRIDED_STORES(L)  (L)->strided_stores
+ #define LOOP_VINFO_VARIABLE_STRIDES(L) (L)->variable_strides
  #define LOOP_VINFO_SLP_INSTANCES(L)   (L)->slp_instances
  #define LOOP_VINFO_SLP_UNROLLING_FACTOR(L) (L)->slp_unrolling_factor
  
*************** extern bitmap vect_memsyms_to_rename;
*** 707,713 ****
     divide by the vectorization factor, and to peel the first few iterations
     to force the alignment of data references in the loop.  */
  extern struct loop *slpeel_tree_peel_loop_to_edge 
!   (struct loop *, edge, tree, tree, bool, unsigned int, bool);
  extern void set_prologue_iterations (basic_block, tree,
  				     struct loop *, unsigned int);
  struct loop *tree_duplicate_loop_on_edge (struct loop *, edge);
--- 712,718 ----
     divide by the vectorization factor, and to peel the first few iterations
     to force the alignment of data references in the loop.  */
  extern struct loop *slpeel_tree_peel_loop_to_edge 
!   (struct loop *, edge, tree, tree, bool, unsigned int, bool, tree, gimple_seq);
  extern void set_prologue_iterations (basic_block, tree,
  				     struct loop *, unsigned int);
  struct loop *tree_duplicate_loop_on_edge (struct loop *, edge);
Index: trunk/gcc/tree-vect-transform.c
===================================================================
*** trunk.orig/gcc/tree-vect-transform.c	2009-01-23 16:47:53.000000000 +0100
--- trunk/gcc/tree-vect-transform.c	2009-01-23 16:48:50.000000000 +0100
*************** static tree get_initial_def_for_reductio
*** 65,72 ****
  
  /* Utility function dealing with loop peeling (not peeling itself).  */
  static void vect_generate_tmps_on_preheader 
!   (loop_vec_info, tree *, tree *, tree *);
! static tree vect_build_loop_niters (loop_vec_info);
  static void vect_update_ivs_after_vectorizer (loop_vec_info, tree, edge); 
  static tree vect_gen_niters_for_prolog_loop (loop_vec_info, tree);
  static void vect_update_init_of_dr (struct data_reference *, tree niters);
--- 65,72 ----
  
  /* Utility function dealing with loop peeling (not peeling itself).  */
  static void vect_generate_tmps_on_preheader 
!   (loop_vec_info, tree *, tree *, tree *, gimple_seq);
! static tree vect_build_loop_niters (loop_vec_info, gimple_seq);
  static void vect_update_ivs_after_vectorizer (loop_vec_info, tree, edge); 
  static tree vect_gen_niters_for_prolog_loop (loop_vec_info, tree);
  static void vect_update_init_of_dr (struct data_reference *, tree niters);
*************** vect_transform_stmt (gimple stmt, gimple
*** 7199,7205 ****
     on the loop preheader.  */
  
  static tree
! vect_build_loop_niters (loop_vec_info loop_vinfo)
  {
    tree ni_name, var;
    gimple_seq stmts = NULL;
--- 7199,7205 ----
     on the loop preheader.  */
  
  static tree
! vect_build_loop_niters (loop_vec_info loop_vinfo, gimple_seq seq)
  {
    tree ni_name, var;
    gimple_seq stmts = NULL;
*************** vect_build_loop_niters (loop_vec_info lo
*** 7214,7221 ****
    pe = loop_preheader_edge (loop);
    if (stmts)
      {
!       basic_block new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
!       gcc_assert (!new_bb);
      }
        
    return ni_name;
--- 7214,7226 ----
    pe = loop_preheader_edge (loop);
    if (stmts)
      {
!       if (seq)
! 	gimple_seq_add_seq (&seq, stmts);
!       else
! 	{
! 	  basic_block new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
! 	  gcc_assert (!new_bb);
! 	}
      }
        
    return ni_name;
*************** static void
*** 7234,7240 ****
  vect_generate_tmps_on_preheader (loop_vec_info loop_vinfo, 
  				 tree *ni_name_ptr,
  				 tree *ratio_mult_vf_name_ptr, 
! 				 tree *ratio_name_ptr)
  {
  
    edge pe;
--- 7239,7246 ----
  vect_generate_tmps_on_preheader (loop_vec_info loop_vinfo, 
  				 tree *ni_name_ptr,
  				 tree *ratio_mult_vf_name_ptr, 
! 				 tree *ratio_name_ptr,
! 				 gimple_seq cond_expr_stmt_list)
  {
  
    edge pe;
*************** vect_generate_tmps_on_preheader (loop_ve
*** 7254,7260 ****
    /* Generate temporary variable that contains 
       number of iterations loop executes.  */
  
!   ni_name = vect_build_loop_niters (loop_vinfo);
    log_vf = build_int_cst (TREE_TYPE (ni), exact_log2 (vf));
  
    /* Create: ratio = ni >> log2(vf) */
--- 7260,7266 ----
    /* Generate temporary variable that contains 
       number of iterations loop executes.  */
  
!   ni_name = vect_build_loop_niters (loop_vinfo, cond_expr_stmt_list);
    log_vf = build_int_cst (TREE_TYPE (ni), exact_log2 (vf));
  
    /* Create: ratio = ni >> log2(vf) */
*************** vect_generate_tmps_on_preheader (loop_ve
*** 7267,7275 ****
  
        stmts = NULL;
        ratio_name = force_gimple_operand (ratio_name, &stmts, true, var);
!       pe = loop_preheader_edge (loop);
!       new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
!       gcc_assert (!new_bb);
      }
         
    /* Create: ratio_mult_vf = ratio << log2 (vf).  */
--- 7273,7286 ----
  
        stmts = NULL;
        ratio_name = force_gimple_operand (ratio_name, &stmts, true, var);
!       if (cond_expr_stmt_list)
! 	gimple_seq_add_seq (&cond_expr_stmt_list, stmts);
!       else
! 	{
! 	  pe = loop_preheader_edge (loop);
! 	  new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
! 	  gcc_assert (!new_bb);
! 	}
      }
         
    /* Create: ratio_mult_vf = ratio << log2 (vf).  */
*************** vect_generate_tmps_on_preheader (loop_ve
*** 7284,7292 ****
        stmts = NULL;
        ratio_mult_vf_name = force_gimple_operand (ratio_mult_vf_name, &stmts,
  						 true, var);
!       pe = loop_preheader_edge (loop);
!       new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
!       gcc_assert (!new_bb);
      }
  
    *ni_name_ptr = ni_name;
--- 7295,7308 ----
        stmts = NULL;
        ratio_mult_vf_name = force_gimple_operand (ratio_mult_vf_name, &stmts,
  						 true, var);
!       if (cond_expr_stmt_list)
! 	gimple_seq_add_seq (&cond_expr_stmt_list, stmts);
!       else
! 	{
! 	  pe = loop_preheader_edge (loop);
! 	  new_bb = gsi_insert_seq_on_edge_immediate (pe, stmts);
! 	  gcc_assert (!new_bb);
! 	}
      }
  
    *ni_name_ptr = ni_name;
*************** conservative_cost_threshold (loop_vec_in
*** 7470,7476 ****
     NITERS / VECTORIZATION_FACTOR times (this value is placed into RATIO).  */
  
  static void 
! vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio)
  {
    tree ni_name, ratio_mult_vf_name;
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
--- 7486,7493 ----
     NITERS / VECTORIZATION_FACTOR times (this value is placed into RATIO).  */
  
  static void 
! vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio,
! 				tree cond_expr, gimple_seq cond_expr_stmt_list)
  {
    tree ni_name, ratio_mult_vf_name;
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
*************** vect_do_peeling_for_loop_bound (loop_vec
*** 7493,7499 ****
       ratio = ni_name / vf
       ratio_mult_vf_name = ratio * vf  */
    vect_generate_tmps_on_preheader (loop_vinfo, &ni_name,
! 				   &ratio_mult_vf_name, ratio);
  
    loop_num  = loop->num; 
  
--- 7510,7517 ----
       ratio = ni_name / vf
       ratio_mult_vf_name = ratio * vf  */
    vect_generate_tmps_on_preheader (loop_vinfo, &ni_name,
! 				   &ratio_mult_vf_name, ratio,
! 				   cond_expr_stmt_list);
  
    loop_num  = loop->num; 
  
*************** vect_do_peeling_for_loop_bound (loop_vec
*** 7501,7507 ****
       peeling for alignment.  */
    if (!VEC_length (gimple, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo))
        && !VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo))
!       && !LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo))
      {
        check_profitability = true;
  
--- 7519,7526 ----
       peeling for alignment.  */
    if (!VEC_length (gimple, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo))
        && !VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo))
!       && !LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo)
!       && !cond_expr)
      {
        check_profitability = true;
  
*************** vect_do_peeling_for_loop_bound (loop_vec
*** 7514,7520 ****
  
    new_loop = slpeel_tree_peel_loop_to_edge (loop, single_exit (loop),
                                              ratio_mult_vf_name, ni_name, false,
!                                             th, check_profitability);
    gcc_assert (new_loop);
    gcc_assert (loop_num == loop->num);
  #ifdef ENABLE_CHECKING
--- 7533,7540 ----
  
    new_loop = slpeel_tree_peel_loop_to_edge (loop, single_exit (loop),
                                              ratio_mult_vf_name, ni_name, false,
!                                             th, check_profitability,
! 					    cond_expr, cond_expr_stmt_list);
    gcc_assert (new_loop);
    gcc_assert (loop_num == loop->num);
  #ifdef ENABLE_CHECKING
*************** vect_do_peeling_for_alignment (loop_vec_
*** 7738,7744 ****
  
    initialize_original_copy_tables ();
  
!   ni_name = vect_build_loop_niters (loop_vinfo);
    niters_of_prolog_loop = vect_gen_niters_for_prolog_loop (loop_vinfo, ni_name);
    
  
--- 7758,7764 ----
  
    initialize_original_copy_tables ();
  
!   ni_name = vect_build_loop_niters (loop_vinfo, NULL);
    niters_of_prolog_loop = vect_gen_niters_for_prolog_loop (loop_vinfo, ni_name);
    
  
*************** vect_do_peeling_for_alignment (loop_vec_
*** 7759,7765 ****
    new_loop =
      slpeel_tree_peel_loop_to_edge (loop, loop_preheader_edge (loop),
  				   niters_of_prolog_loop, ni_name, true,
! 				   th, check_profitability);
  
    gcc_assert (new_loop);
  #ifdef ENABLE_CHECKING
--- 7779,7785 ----
    new_loop =
      slpeel_tree_peel_loop_to_edge (loop, loop_preheader_edge (loop),
  				   niters_of_prolog_loop, ni_name, true,
! 				   th, check_profitability, NULL_TREE, NULL);
  
    gcc_assert (new_loop);
  #ifdef ENABLE_CHECKING
*************** vect_create_cond_for_align_checks (loop_
*** 7909,7914 ****
--- 7929,7981 ----
      *cond_expr = part_cond_expr;
  }
  
+ /* Function vect_create_cond_for_stride_checks.
+ 
+    Create a conditional expression that represents the stride checks for
+    all of the stride SSA_NAMEs used in data references (array element
+    references) whose stride must be checked at runtime.
+ 
+    Input:
+    COND_EXPR  - input conditional expression.  New conditions will be chained
+                 with logical AND operation.
+    LOOP_VINFO - on field of the loop information is used.
+                 LOOP_VINFO_VARIABLE_STRIDES is a bitmap of SSA_NAMEs to be
+ 		checked.
+ 
+    Output:
+    COND_EXPR_STMT_LIST - statements needed to construct the conditional
+                          expression.
+    The returned value is the conditional expression to be used in the if
+    statement that controls which version of the loop gets executed at runtime.
+ 
+    The stride we do versioning for is currently specified by a compile-time
+    param.  In future the stride should be chosen by information from
+    profile-feedback.  */
+ 
+ static void
+ vect_create_cond_for_stride_checks (loop_vec_info loop_vinfo,
+ 				    tree *cond_expr)
+ {
+   bitmap_iterator bi;
+   unsigned int i;
+   HOST_WIDE_INT stride;
+ 
+   stride = PARAM_VALUE (PARAM_VECT_VERSION_FOR_STRIDE_VALUE);
+ 
+   EXECUTE_IF_SET_IN_BITMAP (LOOP_VINFO_VARIABLE_STRIDES (loop_vinfo), 0, i, bi)
+     {
+       tree name = ssa_name (i);
+       tree cond = fold_build2 (EQ_EXPR, boolean_type_node,
+ 			       name,
+ 			       build_int_cst (TREE_TYPE (name), stride));
+       if (*cond_expr)
+ 	*cond_expr = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
+ 				  *cond_expr, cond);
+       else
+ 	*cond_expr = cond;
+     }
+ }
+ 
  /* Function vect_vfa_segment_size.
  
     Create an expression that computes the size of segment
*************** vect_create_cond_for_alias_checks (loop_
*** 8076,8087 ****
     cost model initially.  */
  
  static void
! vect_loop_versioning (loop_vec_info loop_vinfo)
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    struct loop *nloop;
-   tree cond_expr = NULL_TREE;
-   gimple_seq cond_expr_stmt_list = NULL;
    basic_block condition_bb;
    gimple_stmt_iterator gsi, cond_exp_gsi;
    basic_block merge_bb;
--- 8143,8153 ----
     cost model initially.  */
  
  static void
! vect_loop_versioning (loop_vec_info loop_vinfo, bool do_versioning,
! 		      tree *cond_expr, gimple_seq *cond_expr_stmt_list)
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    struct loop *nloop;
    basic_block condition_bb;
    gimple_stmt_iterator gsi, cond_exp_gsi;
    basic_block merge_bb;
*************** vect_loop_versioning (loop_vec_info loop
*** 8101,8129 ****
    th = conservative_cost_threshold (loop_vinfo,
  				    min_profitable_iters);
  
!   cond_expr =
!     build2 (GT_EXPR, boolean_type_node, scalar_loop_iters, 
! 	    build_int_cst (TREE_TYPE (scalar_loop_iters), th));
  
!   cond_expr = force_gimple_operand (cond_expr, &cond_expr_stmt_list,
! 				    false, NULL_TREE);
  
    if (VEC_length (gimple, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo)))
!       vect_create_cond_for_align_checks (loop_vinfo, &cond_expr,
! 					 &cond_expr_stmt_list);
  
    if (VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo)))
!     vect_create_cond_for_alias_checks (loop_vinfo, &cond_expr, 
! 				       &cond_expr_stmt_list);
  
!   cond_expr =
!     fold_build2 (NE_EXPR, boolean_type_node, cond_expr, integer_zero_node);
!   cond_expr =
!     force_gimple_operand (cond_expr, &gimplify_stmt_list, true, NULL_TREE);
!   gimple_seq_add_seq (&cond_expr_stmt_list, gimplify_stmt_list);
  
    initialize_original_copy_tables ();
!   nloop = loop_version (loop, cond_expr, &condition_bb,
  			prob, prob, REG_BR_PROB_BASE - prob, true);
    free_original_copy_tables();
  
--- 8167,8200 ----
    th = conservative_cost_threshold (loop_vinfo,
  				    min_profitable_iters);
  
!   *cond_expr =
!     fold_build2 (GT_EXPR, boolean_type_node, scalar_loop_iters,
! 		 build_int_cst (TREE_TYPE (scalar_loop_iters), th));
  
!   *cond_expr = force_gimple_operand (*cond_expr, cond_expr_stmt_list,
! 				     false, NULL_TREE);
  
    if (VEC_length (gimple, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo)))
!       vect_create_cond_for_align_checks (loop_vinfo, cond_expr,
! 					 cond_expr_stmt_list);
  
    if (VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo)))
!     vect_create_cond_for_alias_checks (loop_vinfo, cond_expr,
! 				       cond_expr_stmt_list);
  
!   if (!bitmap_empty_p (LOOP_VINFO_VARIABLE_STRIDES (loop_vinfo)))
!     vect_create_cond_for_stride_checks (loop_vinfo, cond_expr);
! 
!   *cond_expr =
!     fold_build2 (NE_EXPR, boolean_type_node, *cond_expr, integer_zero_node);
!   *cond_expr =
!     force_gimple_operand (*cond_expr, &gimplify_stmt_list, true, NULL_TREE);
!   gimple_seq_add_seq (cond_expr_stmt_list, gimplify_stmt_list);
!   if (!do_versioning)
!     return;
  
    initialize_original_copy_tables ();
!   nloop = loop_version (loop, *cond_expr, &condition_bb,
  			prob, prob, REG_BR_PROB_BASE - prob, true);
    free_original_copy_tables();
  
*************** vect_loop_versioning (loop_vec_info loop
*** 8154,8164 ****
    /* End loop-exit-fixes after versioning.  */
  
    update_ssa (TODO_update_ssa);
!   if (cond_expr_stmt_list)
      {
        cond_exp_gsi = gsi_last_bb (condition_bb);
!       gsi_insert_seq_before (&cond_exp_gsi, cond_expr_stmt_list, GSI_SAME_STMT);
      }
  }
  
  /* Remove a group of stores (for SLP or interleaving), free their 
--- 8225,8238 ----
    /* End loop-exit-fixes after versioning.  */
  
    update_ssa (TODO_update_ssa);
!   if (*cond_expr_stmt_list)
      {
        cond_exp_gsi = gsi_last_bb (condition_bb);
!       gsi_insert_seq_before (&cond_exp_gsi, *cond_expr_stmt_list,
! 			     GSI_SAME_STMT);
!       *cond_expr_stmt_list = NULL;
      }
+   *cond_expr = NULL_TREE;
  }
  
  /* Remove a group of stores (for SLP or interleaving), free their 
*************** vect_transform_loop (loop_vec_info loop_
*** 8320,8342 ****
    bool strided_store;
    bool slp_scheduled = false;
    unsigned int nunits;
  
    if (vect_print_dump_info (REPORT_DETAILS))
      fprintf (vect_dump, "=== vec_transform_loop ===");
  
-   if (VEC_length (gimple, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo))
-       || VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo)))
-     vect_loop_versioning (loop_vinfo);
- 
-   /* CHECKME: we wouldn't need this if we called update_ssa once
-      for all loops.  */
-   bitmap_zero (vect_memsyms_to_rename);
- 
    /* Peel the loop if there are data refs with unknown alignment.
       Only one data ref with unknown store is allowed.  */
  
    if (LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo))
      vect_do_peeling_for_alignment (loop_vinfo);
    
    /* If the loop has a symbolic number of iterations 'n' (i.e. it's not a
       compile time constant), or it is a constant that doesn't divide by the
--- 8394,8427 ----
    bool strided_store;
    bool slp_scheduled = false;
    unsigned int nunits;
+   tree cond_expr = NULL_TREE;
+   gimple_seq cond_expr_stmt_list = NULL;
+   bool do_peeling_for_loop_bound;
  
    if (vect_print_dump_info (REPORT_DETAILS))
      fprintf (vect_dump, "=== vec_transform_loop ===");
  
    /* Peel the loop if there are data refs with unknown alignment.
       Only one data ref with unknown store is allowed.  */
  
    if (LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo))
      vect_do_peeling_for_alignment (loop_vinfo);
+ 
+   do_peeling_for_loop_bound
+     = (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
+        || (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
+ 	   && LOOP_VINFO_INT_NITERS (loop_vinfo) % vectorization_factor != 0));
+ 
+   if (VEC_length (gimple, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo))
+       || VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo))
+       || !bitmap_empty_p (LOOP_VINFO_VARIABLE_STRIDES (loop_vinfo)))
+     vect_loop_versioning (loop_vinfo,
+ 			  !do_peeling_for_loop_bound,
+ 			  &cond_expr, &cond_expr_stmt_list);
+ 
+   /* CHECKME: we wouldn't need this if we called update_ssa once
+      for all loops.  */
+   bitmap_zero (vect_memsyms_to_rename);
    
    /* If the loop has a symbolic number of iterations 'n' (i.e. it's not a
       compile time constant), or it is a constant that doesn't divide by the
*************** vect_transform_loop (loop_vec_info loop_
*** 8346,8355 ****
       will remain scalar and will compute the remaining (n%VF) iterations.
       (VF is the vectorization factor).  */
  
!   if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
!       || (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
!           && LOOP_VINFO_INT_NITERS (loop_vinfo) % vectorization_factor != 0))
!     vect_do_peeling_for_loop_bound (loop_vinfo, &ratio);
    else
      ratio = build_int_cst (TREE_TYPE (LOOP_VINFO_NITERS (loop_vinfo)),
  		LOOP_VINFO_INT_NITERS (loop_vinfo) / vectorization_factor);
--- 8431,8439 ----
       will remain scalar and will compute the remaining (n%VF) iterations.
       (VF is the vectorization factor).  */
  
!   if (do_peeling_for_loop_bound)
!     vect_do_peeling_for_loop_bound (loop_vinfo, &ratio,
! 				    cond_expr, cond_expr_stmt_list);
    else
      ratio = build_int_cst (TREE_TYPE (LOOP_VINFO_NITERS (loop_vinfo)),
  		LOOP_VINFO_INT_NITERS (loop_vinfo) / vectorization_factor);
Index: trunk/gcc/testsuite/gfortran.dg/vect/fast-math-vect-stride-1.f90
===================================================================
*** /dev/null	1970-01-01 00:00:00.000000000 +0000
--- trunk/gcc/testsuite/gfortran.dg/vect/fast-math-vect-stride-1.f90	2009-01-23 16:48:50.000000000 +0100
***************
*** 0 ****
--- 1,17 ----
+ ! { dg-do compile }
+ 
+ subroutine to_product_of(self,a,b)
+   real(kind=8), dimension(:,:) :: self
+   real(kind=8), dimension(:,:), intent(in) :: a, b
+   integer(kind=kind(1)) :: dim1, dim2
+   dim1 = size(self,1)
+   dim2 = size(self,2)
+   do i = 1,dim1
+     do j = 1,dim2
+       self(i,j) = sum(a(i,:)*b(:,j))
+     end do
+   end do
+ end subroutine
+ 
+ ! { dg-final { scan-tree-dump "vectorized 1 loop" "vect" } }
+ ! { dg-final { cleanup-tree-dump "vect" } }
Index: trunk/gcc/Makefile.in
===================================================================
*** trunk.orig/gcc/Makefile.in	2009-01-23 16:47:53.000000000 +0100
--- trunk/gcc/Makefile.in	2009-01-23 16:48:50.000000000 +0100
*************** tree-nrv.o : tree-nrv.c $(CONFIG_H) $(SY
*** 2135,2141 ****
  tree-ssa-copy.o : tree-ssa-copy.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
     $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) output.h $(DIAGNOSTIC_H) \
     $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
!    $(BASIC_BLOCK_H) tree-pass.h langhooks.h tree-ssa-propagate.h $(FLAGS_H)
  tree-ssa-propagate.o : tree-ssa-propagate.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) output.h \
     $(DIAGNOSTIC_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) coretypes.h \
--- 2135,2142 ----
  tree-ssa-copy.o : tree-ssa-copy.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
     $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) output.h $(DIAGNOSTIC_H) \
     $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
!    $(BASIC_BLOCK_H) tree-pass.h langhooks.h tree-ssa-propagate.h $(FLAGS_H) \
!    $(CFGLOOP_H)
  tree-ssa-propagate.o : tree-ssa-propagate.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) output.h \
     $(DIAGNOSTIC_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) coretypes.h \
Index: trunk/gcc/tree-ssa-copy.c
===================================================================
*** trunk.orig/gcc/tree-ssa-copy.c	2009-01-23 16:47:53.000000000 +0100
--- trunk/gcc/tree-ssa-copy.c	2009-01-23 16:48:50.000000000 +0100
*************** along with GCC; see the file COPYING3.
*** 37,42 ****
--- 37,43 ----
  #include "tree-pass.h"
  #include "tree-ssa-propagate.h"
  #include "langhooks.h"
+ #include "cfgloop.h"
  
  /* This file implements the copy propagation pass and provides a
     handful of interfaces for performing const/copy propagation and
*************** init_copy_prop (void)
*** 991,997 ****
            tree def;
  
  	  def = gimple_phi_result (phi);
! 	  if (!is_gimple_reg (def))
              prop_set_simulate_again (phi, false);
  	  else
              prop_set_simulate_again (phi, true);
--- 992,1004 ----
            tree def;
  
  	  def = gimple_phi_result (phi);
! 	  if (!is_gimple_reg (def)
! 	      /* In loop-closed SSA form do not copy-propagate through
! 	         PHI nodes.  Technically this is only needed for loop
! 		 exit PHIs, but this is difficult to query.  */
! 	      || (current_loops
! 		  && gimple_phi_num_args (phi) == 1
! 		  && loops_state_satisfies_p (LOOP_CLOSED_SSA)))
              prop_set_simulate_again (phi, false);
  	  else
              prop_set_simulate_again (phi, true);
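
For reference, the effect of stride versioning on the inner loop of the
example can be pictured at the source level roughly as follows.  This is
only a hand-written sketch, not output of the patch: the function name
dot_strided and its parameters are made up for illustration, and it
assumes the versioned stride value is 1.  The runtime check on the
loop-invariant strides selects either a unit-stride copy of the loop,
which the vectorizer can handle, or the original generic loop.

```c
#include <stddef.h>

/* Hypothetical illustration, not part of the patch: a reduction over two
   arrays whose strides are only known at run time, as with Fortran
   assumed-shape array descriptors.  */
double
dot_strided (const double *a, ptrdiff_t sa,
             const double *b, ptrdiff_t sb, size_t n)
{
  double sum = 0.0;
  size_t j;

  /* Versioning condition: every variable stride equals the chosen
     constant (assumed to be 1 here), checks chained with logical AND.  */
  if (sa == 1 && sb == 1)
    {
      /* Unit-stride version: contiguous accesses, vectorizable.  */
      for (j = 0; j < n; j++)
        sum += a[j] * b[j];
    }
  else
    {
      /* Generic version: strides stay symbolic, loop remains scalar.  */
      for (j = 0; j < n; j++)
        sum += a[(ptrdiff_t) j * sa] * b[(ptrdiff_t) j * sb];
    }

  return sum;
}
```

Both branches compute the same result; only the first one exposes the
constant stride to the compiler.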

