Fix PR37916 (unnecessary spilling)

Michael Matz matz@suse.de
Tue Nov 20 13:42:00 GMT 2018


Hi,

this bug report is about cris generating worse code since tree-ssa.  The 
effect is also visible on x86-64.  The symptom is that the work horse of 
adler32.c (from zlib) needs spills in the inner loop, while gcc 3 did not, 
and those spills go away with -fno-tree-reassoc.

The underlying reason for the spills is register pressure, which could 
either be rectified by the pressure aware scheduling (which cris doesn't 
have), or by simply not generating high pressure code to begin with.  In 
this case it's TER which ultimately causes the register pressure to 
increase, and there are many plans in people minds how to fix this (make 
TER regpressure aware, do some regpressure scheduling on gimple, or even 
more ambitious things), but this patch doesn't tackle this.  Instead it 
makes reassoc not generate the situation which causes TER to run wild.

TER increasing register pressure is a long standing problem and so it has 
some heuristics to avoid that.  One wobbly heuristic is that it doesn't 
TER expressions together that have the same base variable as their LHSs.
But reassoc generates only anonymous SSA names for its temporary 
subexpressions, so that TER heuristic doesn't apply.  In this testcase 
it's even the case that reassoc doesn't really change much code (one 
addition moves from the end to the beginning of the inner loop), so that 
whole rewriting is even pointless.

In any case, let's use copy_ssa_name instead of make_ssa_name, when we 
have an obvious LHS; that leads to TER making the same decisions with and 
without -fno-tree-reassoc, leading to the same register pressure and no
spills.

On x86-64 the effect is:
  before patch: 48 bytes stackframe, 24 stack 
    accesses (most of them in the loops), 576 bytes codesize;
  after patch: no stack frame, no stack accesses, 438 bytes codesize

On cris:
  before patch: 64 bytes stack frame, 27 stack access in loops, size of .s 
    145 lines
  after patch: 20 bytes stack frame (as it uses callee saved regs, which 
    is also complained about in the bug report), but no stack accesses
    in loops, size of .s: 125 lines

I'm wondering about testcase: should I add an x86-64 specific that tests 
for no stack accesses, or would that be too constraining in the future?

Regstrapped on x86-64-linux, no regressions.  Okay for trunk?


Ciao,
Michael.

	PR tree-optimization/37916
	* tree-ssa-reassoc.c (make_new_ssa_for_def): Use copy_ssa_name.
	(rewrite_expr_tree, linearize_expr, negate_value,
	repropagate_negates, attempt_builtin_copysign,
	reassociate_bb): Likewise.

diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 971d926e7895..339c3d4e447f 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -1182,7 +1182,7 @@ make_new_ssa_for_def (gimple *stmt, enum tree_code opcode, tree op)
   tree new_lhs, new_debug_lhs = NULL_TREE;
   tree lhs = gimple_get_lhs (stmt);
 
-  new_lhs = make_ssa_name (TREE_TYPE (lhs));
+  new_lhs = copy_ssa_name (lhs);
   gimple_set_lhs (stmt, new_lhs);
 
   /* Also need to update GIMPLE_DEBUGs.  */
@@ -4512,7 +4512,7 @@ rewrite_expr_tree (gimple *stmt, unsigned int opindex,
 	    {
 	      gimple *insert_point
 		= find_insert_point (stmt, oe1->op, oe2->op);
-	      lhs = make_ssa_name (TREE_TYPE (lhs));
+	      lhs = copy_ssa_name (lhs);
 	      stmt
 		= gimple_build_assign (lhs, gimple_assign_rhs_code (stmt),
 				       oe1->op, oe2->op);
@@ -4583,7 +4583,7 @@ rewrite_expr_tree (gimple *stmt, unsigned int opindex,
 	  unsigned int uid = gimple_uid (stmt);
 	  gimple *insert_point = find_insert_point (stmt, new_rhs1, oe->op);
 
-	  lhs = make_ssa_name (TREE_TYPE (lhs));
+	  lhs = copy_ssa_name (lhs);
 	  stmt = gimple_build_assign (lhs, gimple_assign_rhs_code (stmt),
 				      new_rhs1, oe->op);
 	  gimple_set_uid (stmt, uid);
@@ -4820,7 +4820,7 @@ linearize_expr (gimple *stmt)
   gsi = gsi_for_stmt (stmt);
 
   gimple_assign_set_rhs2 (stmt, gimple_assign_rhs1 (binrhs));
-  binrhs = gimple_build_assign (make_ssa_name (TREE_TYPE (lhs)),
+  binrhs = gimple_build_assign (copy_ssa_name (lhs),
 				gimple_assign_rhs_code (binrhs),
 				gimple_assign_lhs (binlhs),
 				gimple_assign_rhs2 (binrhs));
@@ -4909,7 +4909,7 @@ negate_value (tree tonegate, gimple_stmt_iterator *gsip)
       rhs2 = negate_value (rhs2, &gsi);
 
       gsi = gsi_for_stmt (negatedefstmt);
-      lhs = make_ssa_name (TREE_TYPE (lhs));
+      lhs = copy_ssa_name (lhs);
       gimple_set_visited (negatedefstmt, true);
       g = gimple_build_assign (lhs, PLUS_EXPR, rhs1, rhs2);
       gimple_set_uid (g, gimple_uid (negatedefstmt));
@@ -5245,7 +5245,7 @@ repropagate_negates (void)
 	      tree b = gimple_assign_rhs2 (user);
 	      gimple_stmt_iterator gsi = gsi_for_stmt (feed);
 	      gimple_stmt_iterator gsi2 = gsi_for_stmt (user);
-	      tree x = make_ssa_name (TREE_TYPE (gimple_assign_lhs (feed)));
+	      tree x = copy_ssa_name (gimple_assign_lhs (feed));
 	      gimple *g = gimple_build_assign (x, PLUS_EXPR, a, b);
 	      gsi_insert_before (&gsi2, g, GSI_SAME_STMT);
 	      gimple_assign_set_rhs_with_ops (&gsi2, NEGATE_EXPR, x);
@@ -5737,7 +5737,7 @@ attempt_builtin_copysign (vec<operand_entry *> *ops)
 		      else
 			new_call = gimple_build_call
 			  (gimple_call_fndecl (old_call), 2, mul, arg1);
-		      tree lhs = make_ssa_name (type);
+		      tree lhs = copy_ssa_name (oe->op);
 		      gimple_call_set_lhs (new_call, lhs);
 		      gimple_set_location (new_call,
 					   gimple_location (old_call));
@@ -5748,7 +5748,7 @@ attempt_builtin_copysign (vec<operand_entry *> *ops)
 		      /* Handle the CST1 < 0 case by negating the result.  */
 		      if (cst1_neg)
 			{
-			  tree negrhs = make_ssa_name (TREE_TYPE (lhs));
+			  tree negrhs = copy_ssa_name (lhs);
 			  gimple *negate_stmt
 			    = gimple_build_assign (negrhs, NEGATE_EXPR, lhs);
 			  insert_stmt_after (negate_stmt, new_call);
@@ -6052,7 +6052,7 @@ reassociate_bb (basic_block bb)
 	      if (negate_result)
 		{
 		  stmt = SSA_NAME_DEF_STMT (lhs);
-		  tree tmp = make_ssa_name (TREE_TYPE (lhs));
+		  tree tmp = copy_ssa_name (lhs);
 		  gimple_set_lhs (stmt, tmp);
 		  if (lhs != new_lhs)
 		    tmp = new_lhs;



More information about the Gcc-patches mailing list