This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Improve tree-ssa-forwprop


optimization/14753 shows a trivial case that the forward propagator ought
to be handling.    Specifically if we have a single use variable which is
a boolean set from the cast of an integral type or an integral set from
the cast of a boolean, then we can forward propagate.

void bar (void);

void
foo (unsigned int a)
{
  if ((a >> 5) & 1)
    bar ();
}


Which results in:

foo (a)
{
  _Bool T.2;
  unsigned int T.1;
  unsigned int T.0;

<bb 0>:
  T.0_2 = a_1 >> 5;
  T.1_3 = T.0_2 & 1;
  T.2_4 = (_Bool)T.1_3;
  if (T.2_4) goto <L0>; else goto <L1>;

<L0>:;
  bar () [tail call];

<L1>:;
  return;

}


We can instead forward propagate T.1_3 for T.2_4 in the conditional
resulting in:

foo (a)
{
  _Bool T.2;
  unsigned int T.1;
  unsigned int T.0;

<bb 0>:
  T.0_2 = a_1 >> 5;
  T.1_3 = T.0_2 & 1;
  if (T.1_3) goto <L0>; else goto <L1>;

<L0>:;
  bar () [tail call];

<L1>:;
  return;

}

Handling this is a trivial extension to how we handle TRUTH_NOT_EXPR.  It's
also a necessary step to address ....


optimization/14731 shows another (more interesting) case where the forward
propagator could be improved.  Basically the forward propagator didn't
handle cascading propagation opportunities.  By not handling cascades we
also don't get the opportunity to thread the jump.

#ifdef __cplusplus
#define _Bool bool
#endif

int link_error(void);
int s(void);

int t(int i)
{
  _Bool g = i == 4;
 int h = g;
 _Bool j = h;
 int k = j;
 _Bool l = k == 0;
 _Bool o = !l;
 int m = o;

 if (m)
  if (i != 4)
   return link_error();
 return 0;
}

To fix this I've introduced a new worklist of conditionals to examine and
a loop to drain that worklist.  When we actually perform a forward propagation
we place the modified conditional on the worklist so that it gets reexamined
the next iteration.

Bootstrapped and regression tested on i686-pc-linux-gnu.

	* tree-ssa-forwprop.c (record_single_argument_cond_exprs): Accept
	new parameters for the statement and variable worklist as well
	as a bitmap of interesting SSA_NAMEs.  Walk over the statement
	worklist recording interesting variables in the variable worklist
	and bitmap.  Handle casts between integral and boolean types.
	(substitute_single_use_vars): Accept new parameters for the statement
	and variable worklist.  When a substitution is made add a new
	entry to the statement worklist.  Handle casts between integral
	and boolean types.
	(tree_ssa_forward_propagate_single_use_vars): Rework to pass
	worklists to children.  Iterate until the statement worklist
	is empty.

	* gcc.dg/tree-ssa/20040513-1.c: New test.
	* gcc.dg/tree-ssa/20040513-2.c: New test.

Index: tree-ssa-forwprop.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-forwprop.c,v
retrieving revision 2.1
diff -c -p -r2.1 tree-ssa-forwprop.c
*** tree-ssa-forwprop.c	13 May 2004 06:39:49 -0000	2.1
--- tree-ssa-forwprop.c	13 May 2004 20:42:01 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 80,89 ****
--- 80,112 ----
     For these cases, we propagate A into all, possibly more than one,
     COND_EXPRs that use X.
  
+    Or
+ 
+      bb0:
+        x = (typecast) a
+        if (x) goto ... else goto ...
+ 
+    Will be transformed into:
+ 
+      bb0:
+         if (a != 0) goto ... else goto ...
+ 
+    (Assuming a is an integral type and x is a boolean or x is an
+     integral and a is a boolean.)
+ 
+    Similarly for the tests (x == 0), (x != 0), (x == 1) and (x != 1).
+    For these cases, we propagate A into all, possibly more than one,
+    COND_EXPRs that use X.
+ 
     In addition to eliminating the variable and the statement which assigns
     a value to the variable, we may be able to later thread the jump without
     adding insane complexity in the dominator optimizer. 
  
+    Also note these transformations can cascade.  We handle this by having
+    a worklist of COND_EXPR statements to examine.  As we make a change to
+    a statement, we put it back on the worklist to examine on the next
+    iteration of the main loop.
+ 
     This will (of course) be extended as other needs arise.  */
  
  /* Bitmap of variables for which we want immediate uses.  This is set
*************** static bitmap vars;
*** 92,99 ****
  
  static bool need_imm_uses_for (tree);
  static void tree_ssa_forward_propagate_single_use_vars (void);
! static varray_type record_single_argument_cond_exprs (void);
! static void substitute_single_use_vars (varray_type);
  
  /* Function indicating whether we ought to include information for 'var'
     when calculating immediate uses.  */
--- 115,124 ----
  
  static bool need_imm_uses_for (tree);
  static void tree_ssa_forward_propagate_single_use_vars (void);
! static void record_single_argument_cond_exprs (varray_type,
! 					       varray_type *,
! 					       bitmap);
! static void substitute_single_use_vars (varray_type *, varray_type);
  
  /* Function indicating whether we ought to include information for 'var'
     when calculating immediate uses.  */
*************** need_imm_uses_for (tree var)
*** 115,138 ****
     filter out here, the faster this pass will run since its runtime is
     dominated by the time to build immediate uses.  */
  
! static varray_type
! record_single_argument_cond_exprs (void)
! {
!   basic_block bb;
!   varray_type retval;
! 
!   vars = BITMAP_XMALLOC ();
! 
!   VARRAY_TREE_INIT (retval, 10, "forward propagation vars");
  
    /* The first pass over the blocks gathers the set of variables we need
       immediate uses for as well as the set of interesting COND_EXPRs.
  
       A simpler implementation may be appropriate if/when we have a lower
       overhead means of getting immediate use information.  */
!   FOR_EACH_BB (bb)
      {
!       tree last = last_stmt (bb);
  
        /* See if this block ends in a COND_EXPR.  */
        if (last && TREE_CODE (last) == COND_EXPR)
--- 140,161 ----
     filter out here, the faster this pass will run since its runtime is
     dominated by the time to build immediate uses.  */
  
! static void
! record_single_argument_cond_exprs (varray_type cond_worklist,
! 				   varray_type *vars_worklist,
! 				   bitmap vars)
  
+ {
    /* The first pass over the blocks gathers the set of variables we need
       immediate uses for as well as the set of interesting COND_EXPRs.
  
       A simpler implementation may be appropriate if/when we have a lower
       overhead means of getting immediate use information.  */
!   while (VARRAY_ACTIVE_SIZE (cond_worklist) > 0)
      {
!       tree last = VARRAY_TOP_TREE (cond_worklist);
! 
!       VARRAY_POP (cond_worklist);
  
        /* See if this block ends in a COND_EXPR.  */
        if (last && TREE_CODE (last) == COND_EXPR)
*************** record_single_argument_cond_exprs (void)
*** 225,230 ****
--- 248,274 ----
  			      && !is_gimple_min_invariant (def_rhs))
  			    continue;
  			}
+ 
+ 		      /* If TEST_VAR was set from a cast of an integer type
+ 			 to a boolean type or a cast of a boolean to an
+ 			 integral, then it is interesting.  */
+ 		      else if (TREE_CODE (def_rhs) == NOP_EXPR
+ 			       || TREE_CODE (def_rhs) == CONVERT_EXPR)
+ 			{
+ 			  tree outer_type;
+ 			  tree inner_type;
+ 
+ 			  outer_type = TREE_TYPE (def_rhs);
+ 			  inner_type = TREE_TYPE (TREE_OPERAND (def_rhs, 0));
+ 
+ 			  if ((TREE_CODE (outer_type) == BOOLEAN_TYPE
+ 			       && INTEGRAL_TYPE_P (inner_type))
+ 			      || (TREE_CODE (inner_type) == BOOLEAN_TYPE
+ 				  && INTEGRAL_TYPE_P (outer_type)))
+ 			    ;
+ 			  else
+ 			    continue;
+ 			}
  		      else
  			continue;
  		    }
*************** record_single_argument_cond_exprs (void)
*** 232,244 ****
  		    continue;
  
  		  /* All the tests passed, record TEST_VAR as interesting.  */
! 		  VARRAY_PUSH_TREE (retval, test_var);
  		  bitmap_set_bit (vars, SSA_NAME_VERSION (test_var));
  		}
  	    }
  	}
      }
-   return retval;
  }
  
  /* Given FORWPROP_DATA containing SSA_NAMEs which are used in COND_EXPRs
--- 276,287 ----
  		    continue;
  
  		  /* All the tests passed, record TEST_VAR as interesting.  */
! 		  VARRAY_PUSH_TREE (*vars_worklist, test_var);
  		  bitmap_set_bit (vars, SSA_NAME_VERSION (test_var));
  		}
  	    }
  	}
      }
  }
  
  /* Given FORWPROP_DATA containing SSA_NAMEs which are used in COND_EXPRs
*************** record_single_argument_cond_exprs (void)
*** 247,264 ****
     SSA_NAME used in the COND_EXPR.  */
    
  static void
! substitute_single_use_vars (varray_type forwprop_data)
  {
!   unsigned int i;
! 
!   for (i = 0; i < VARRAY_ACTIVE_SIZE (forwprop_data); i++)
      {
!       tree test_var = VARRAY_TREE (forwprop_data, i);
        tree def = SSA_NAME_DEF_STMT (test_var);
        dataflow_t df;
        int j, num_uses, propagated_uses;
        block_stmt_iterator bsi;
  
        /* Now compute the immediate uses of TEST_VAR.  */
        df = get_immediate_uses (def);
        num_uses = num_immediate_uses (df);
--- 290,308 ----
     SSA_NAME used in the COND_EXPR.  */
    
  static void
! substitute_single_use_vars (varray_type *cond_worklist,
! 			    varray_type vars_worklist)
  {
!   while (VARRAY_ACTIVE_SIZE (vars_worklist) > 0)
      {
!       tree test_var = VARRAY_TOP_TREE (vars_worklist);
        tree def = SSA_NAME_DEF_STMT (test_var);
        dataflow_t df;
        int j, num_uses, propagated_uses;
        block_stmt_iterator bsi;
  
+       VARRAY_POP (vars_worklist);
+ 
        /* Now compute the immediate uses of TEST_VAR.  */
        df = get_immediate_uses (def);
        num_uses = num_immediate_uses (df);
*************** substitute_single_use_vars (varray_type 
*** 345,365 ****
  	    }
  	  else
  	    {
! 	      /* TEST_VAR was set from a TRUTH_NOT_EXPR.  */
  	      if (cond_code == SSA_NAME
  		  || (cond_code == NE_EXPR
  		      && integer_zerop (TREE_OPERAND (cond, 1)))
  		  || (cond_code == EQ_EXPR
  		      && integer_onep (TREE_OPERAND (cond, 1))))
! 		new_cond = build (EQ_EXPR, boolean_type_node,
! 				  TREE_OPERAND (def_rhs, 0),
! 				  convert (TREE_TYPE (def_rhs),
! 					   integer_zero_node));
  	      else
! 		new_cond = build (NE_EXPR, boolean_type_node,
! 				  TREE_OPERAND (def_rhs, 0),
! 				  convert (TREE_TYPE (def_rhs),
! 					   integer_zero_node));
  	    }
  
  	  /* Dump details.  */
--- 389,418 ----
  	    }
  	  else
  	    {
! 	      bool invert = false;
! 	      enum tree_code new_code;
! 
! 	      /* TEST_VAR was set from a TRUTH_NOT_EXPR or a NOP_EXPR.  */
! 	      if (def_rhs_code == TRUTH_NOT_EXPR)
! 		invert = true;
! 		
  	      if (cond_code == SSA_NAME
  		  || (cond_code == NE_EXPR
  		      && integer_zerop (TREE_OPERAND (cond, 1)))
  		  || (cond_code == EQ_EXPR
  		      && integer_onep (TREE_OPERAND (cond, 1))))
! 		new_code = NE_EXPR;
  	      else
! 		new_code = EQ_EXPR;
! 
! 	      if (invert)
! 		new_code = (new_code == EQ_EXPR ? NE_EXPR  : EQ_EXPR);
! 
! 	      new_cond = build (new_code, 
! 				boolean_type_node,
! 				TREE_OPERAND (def_rhs, 0),
! 				convert (TREE_TYPE (def_rhs),
! 					 integer_zero_node));
  	    }
  
  	  /* Dump details.  */
*************** substitute_single_use_vars (varray_type 
*** 376,381 ****
--- 429,435 ----
  	  COND_EXPR_COND (cond_stmt) = new_cond;
  	  modify_stmt (cond_stmt);
  	  propagated_uses++;
+ 	  VARRAY_PUSH_TREE (*cond_worklist, cond_stmt);
  	}
  
        /* If we propagated into all the uses, then we can delete DEF.
*************** substitute_single_use_vars (varray_type 
*** 400,424 ****
  static void
  tree_ssa_forward_propagate_single_use_vars (void)
  {
!   varray_type forwprop_data;
  
!   /* First get a list of all the interesting COND_EXPRs and potential single
!      use variables which feed those COND_EXPRs.  */
!   forwprop_data = record_single_argument_cond_exprs ();
  
!   /* Now compute immediate uses for all the variables we care about.  */
!   compute_immediate_uses (TDFA_USE_OPS, need_imm_uses_for);
  
!   /* We are done with the VARS bitmap and other dataflow information.  */
!   BITMAP_XFREE (vars);
!   vars = NULL;
  
!   /* And optimize.  */
!   substitute_single_use_vars (forwprop_data);
  
    /* All done.  Clean up.  */
!   free_df ();
!   VARRAY_CLEAR (forwprop_data);
  }
  
  
--- 454,505 ----
  static void
  tree_ssa_forward_propagate_single_use_vars (void)
  {
!   basic_block bb;
!   varray_type vars_worklist, cond_worklist;
  
!   vars = BITMAP_XMALLOC ();
!   VARRAY_TREE_INIT (vars_worklist, 10, "VARS worklist");
!   VARRAY_TREE_INIT (cond_worklist, 10, "COND worklist");
  
!   /* Prime the COND_EXPR worklist by placing all the COND_EXPRs on the
!      worklist.  */
!   FOR_EACH_BB (bb)
!     {
!       tree last = last_stmt (bb);
!       if (last && TREE_CODE (last) == COND_EXPR)
! 	VARRAY_PUSH_TREE (cond_worklist, last);
!     }
  
!   while (VARRAY_ACTIVE_SIZE (cond_worklist) > 0)
!     {
!       /* First get a list of all the interesting COND_EXPRs and potential
! 	 single use variables which feed those COND_EXPRs.  This will drain
! 	 COND_WORKLIST and initialize VARS_WORKLIST.  */
!       record_single_argument_cond_exprs (cond_worklist, &vars_worklist, 
vars);
  
!       if (VARRAY_ACTIVE_SIZE (vars_worklist) > 0)
! 	{
! 	  /* Now compute immediate uses for all the variables we care about.  */
! 	  compute_immediate_uses (TDFA_USE_OPS, need_imm_uses_for);
! 
! 	  /* We've computed immediate uses, so we can/must clear the VARS
! 	     bitmap for the next iteration.  */
! 	  bitmap_clear (vars);
! 
! 	  /* And optimize.  This will drain VARS_WORKLIST and initialize
! 	     COND_WORKLIST for the next iteration.  */
! 	  substitute_single_use_vars (&cond_worklist, vars_worklist);
! 
! 	  /* We do not incrementally update the dataflow information
! 	     so we must free it here and recompute the necessary bits
! 	     on the next iteration.  If this turns out to be expensive,
! 	     methods for incrementally updating the dataflow are known.  */
! 	  free_df ();
! 	}
!     }
  
    /* All done.  Clean up.  */
!   BITMAP_XFREE (vars);
  }
  
  
Index: testsuite/gcc.dg/tree-ssa/20040513-1.c
===================================================================
RCS file: testsuite/gcc.dg/tree-ssa/20040513-1.c
diff -N testsuite/gcc.dg/tree-ssa/20040513-1.c
*** /dev/null	1 Jan 1970 00:00:00 -0000
--- testsuite/gcc.dg/tree-ssa/20040513-1.c	13 May 2004 20:44:44 -0000
***************
*** 0 ****
--- 1,18 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O1 -fdump-tree-forwprop1" } */
+ void bar (void);
+ 
+ void
+ foo (unsigned int a)
+ {
+   if ((a >> 5) & 1)
+     bar ();
+ }
+ 
+ 
+ 
+ /* There should be no casts to a _Bool since we can use the temporary
+    holding (a>>5)&1 directly.  */
+ /* { dg-final { scan-tree-dump-times "\\(_Bool\\)" 0 "forwprop1"} } */
+                                                                              

+ 
Index: testsuite/gcc.dg/tree-ssa/20040513-2.c
===================================================================
RCS file: testsuite/gcc.dg/tree-ssa/20040513-2.c
diff -N testsuite/gcc.dg/tree-ssa/20040513-2.c
*** /dev/null	1 Jan 1970 00:00:00 -0000
--- testsuite/gcc.dg/tree-ssa/20040513-2.c	13 May 2004 20:44:44 -0000
***************
*** 0 ****
--- 1,24 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O1 -fdump-tree-dom2" } */
+ int link_error(void);
+ int s(void);
+ 
+ int t(int i)
+ {
+   _Bool g = i == 4;
+  int h = g;
+  _Bool j = h;
+  int k = j;
+  _Bool l = k == 0;
+  _Bool o = !l;
+  int m = o;
+ 
+  if (m)
+   if (i != 4)
+    return link_error();
+  return 0;
+ }
+ 
+ /* There should be no link_error calls, if there is any, the
+    optimization has failed */
+ /* { dg-final { scan-tree-dump-times "link_error" 0 "dom2"} } */








Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]