[PATCH] Fix unroll-and-jam (PR tree-optimization/87360)

Jakub Jelinek jakub@redhat.com
Tue Dec 4 23:38:00 GMT 2018


Hi!

The following testcases ICE, because tree_loop_unroll_and_jam optimizes one
loop and on another one after it fails to analyze data dependencies and
returns.  The end effect of that is that neither the code at the end of the
function to do final cleanups, nor TODO_cleanup_cfg is done, and the latter
means that we don't reset some number of initialization stuff that keeps
referencing already removed stmts.

The following patch fixes that by replacing return false; with continue;,
so that it will just ignore the loop which couldn't be optimized and will
try to optimize other further loops and at the end if at least one loop is
optimized, return TODO_cleanup_cfg.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-12-04  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/87360
	* gimple-loop-jam.c (tree_loop_unroll_and_jam): On failure to analyze
	data dependencies, don't return false, just continue.  Formatting
	fixes.
	(merge_loop_tree, bb_prevents_fusion_p, unroll_jam_possible_p,
	fuse_loops): Formatting fixes.

	* g++.dg/opt/pr87360.C: New test.
	* gfortran.dg/pr87360.f90: New test.

--- gcc/gimple-loop-jam.c.jj	2018-09-04 19:48:19.238613844 +0200
+++ gcc/gimple-loop-jam.c	2018-12-04 18:53:45.983850575 +0100
@@ -118,7 +118,7 @@ merge_loop_tree (struct loop *loop, stru
   for (i = 0; i < n; i++)
     {
       /* If the block was direct child of OLD loop it's now part
-         of LOOP.  If it was outside OLD, then it moved into LOOP
+	 of LOOP.  If it was outside OLD, then it moved into LOOP
 	 as well.  This avoids changing the loop father for BBs
 	 in inner loops of OLD.  */
       if (bbs[i]->loop_father == old
@@ -167,7 +167,7 @@ bb_prevents_fusion_p (basic_block bb)
        * stores or unknown side-effects prevent fusion
        * loads don't
        * computations into SSA names: these aren't problematic.  Their
-         result will be unused on the exit edges of the first N-1 copies
+	 result will be unused on the exit edges of the first N-1 copies
 	 (those aren't taken after unrolling).  If they are used on the
 	 other edge (the one leading to the outer latch block) they are
 	 loop-carried (on the outer loop) and the Nth copy of BB will
@@ -282,12 +282,12 @@ unroll_jam_possible_p (struct loop *oute
       if (!simple_iv (loop, loop, op, &iv, true))
 	return false;
       /* The inductions must be regular, loop invariant step and initial
-         value.  */
+	 value.  */
       if (!expr_invariant_in_loop_p (outer, iv.step)
 	  || !expr_invariant_in_loop_p (outer, iv.base))
 	return false;
       /* XXX With more effort we could also be able to deal with inductions
-         where the initial value is loop variant but a simple IV in the
+	 where the initial value is loop variant but a simple IV in the
 	 outer loop.  The initial value for the second body would be
 	 the original initial value plus iv.base.step.  The next value
 	 for the fused loop would be the original next value of the first
@@ -322,7 +322,7 @@ fuse_loops (struct loop *loop)
       gcc_assert (EDGE_COUNT (next->header->preds) == 1);
 
       /* The PHI nodes of the second body (single-argument now)
-         need adjustments to use the right values: either directly
+	 need adjustments to use the right values: either directly
 	 the value of the corresponding PHI in the first copy or
 	 the one leaving the first body which unrolling did for us.
 
@@ -449,13 +449,13 @@ tree_loop_unroll_and_jam (void)
       dependences.create (10);
       datarefs.create (10);
       if (!compute_data_dependences_for_loop (outer, true, &loop_nest,
-					       &datarefs, &dependences))
+					      &datarefs, &dependences))
 	{
 	  if (dump_file && (dump_flags & TDF_DETAILS))
 	    fprintf (dump_file, "Cannot analyze data dependencies\n");
 	  free_data_refs (datarefs);
 	  free_dependence_relations (dependences);
-	  return false;
+	  continue;
 	}
       if (!datarefs.length ())
 	continue;
@@ -490,7 +490,7 @@ tree_loop_unroll_and_jam (void)
 				     &removed))
 	    {
 	      /* Couldn't get the distance vector.  For two reads that's
-	         harmless (we assume we should unroll).  For at least
+		 harmless (we assume we should unroll).  For at least
 		 one write this means we can't check the dependence direction
 		 and hence can't determine safety.  */
 
@@ -503,7 +503,7 @@ tree_loop_unroll_and_jam (void)
 	}
 
       /* We regard a user-specified minimum percentage of zero as a request
-         to ignore all profitability concerns and apply the transformation
+	 to ignore all profitability concerns and apply the transformation
 	 always.  */
       if (!PARAM_VALUE (PARAM_UNROLL_JAM_MIN_PERCENT))
 	profit_unroll = 2;
--- gcc/testsuite/g++.dg/opt/pr87360.C.jj	2018-12-04 18:50:57.236590929 +0100
+++ gcc/testsuite/g++.dg/opt/pr87360.C	2018-12-04 18:50:20.236191816 +0100
@@ -0,0 +1,27 @@
+// PR tree-optimization/87360
+// { dg-do compile { target size32plus } }
+// { dg-options "-O3 -fno-tree-dce --param unroll-jam-min-percent=0" }
+
+void abort (void);
+
+void foo (int N)
+{
+  int i, j;
+  int x[1000][1000];
+
+  for (i = 0; i < N; i++)
+    for (j = 0; j < N; j++)
+      x[i][j] = i + j + 3;
+
+  for (i = 0; i < N; i++)
+    for (j = 0; j < N; j++)
+      if (x[i][j] != i + j + 3)
+	abort ();
+}
+
+int main(void)
+{
+  foo (1000);
+
+  return 0;
+}
--- gcc/testsuite/gfortran.dg/pr87360.f90.jj	2018-12-04 18:52:34.321014295 +0100
+++ gcc/testsuite/gfortran.dg/pr87360.f90	2018-12-04 18:52:16.700300433 +0100
@@ -0,0 +1,5 @@
+! PR tree-optimization/87360
+! { dg-do compile }
+! { dg-options "-fno-tree-dce -O3 --param max-completely-peeled-insns=0" }
+
+include 'function_optimize_2.f90'

	Jakub



More information about the Gcc-patches mailing list