optimization/7130: miscompiled code for gcc-3.1 in powerpc linux with -funroll-all-loops

Alan Modra amodra@bigpond.net.au
Thu Jul 11 01:10:00 GMT 2002


doloop.c:doloop_modify_runtime says:

     If the loop has been unrolled, then the loop body has been
     preconditioned to iterate a multiple of unroll_number times.  If
     abs_inc is != 1, the full calculation is

       t1 = abs_inc * unroll_number;
       n = abs (final - initial) / t1;
       n += (abs (final - initial) % t1) > t1 - abs_inc;

This is wrong.  Taking the example in the PR, we have

abs_inc = 1
unroll_number = 4
abs (final - initial) = 10

=> t1 == 4
   abs (final - initial) / t1 == 2
   abs (final - initial) % t1 == 2
   t1 - abs_inc == 3, so the (2 > 3) test adds nothing
=> n == 2

We want n == 3, to go around the loop fully twice, and once partially.
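
To make the arithmetic above easy to replay, here is a small standalone
sketch.  It is not GCC code: old_count and trips_needed are hypothetical
helper names, dist stands for abs (final - initial), and the reference
count assumes dist is a multiple of abs_inc and that the unrolled body
can exit part-way through.

```c
/* Hypothetical helper, not GCC code: the count given by the formula
   quoted from the old comment.  dist stands for abs (final - initial).  */
long
old_count (long dist, long abs_inc, long unroll_number)
{
  long t1 = abs_inc * unroll_number;
  long n = dist / t1;
  n += (dist % t1) > t1 - abs_inc;
  return n;
}

/* Hypothetical reference: trips through the unrolled body, assuming
   dist is a multiple of abs_inc and a partial trip counts as one.  */
long
trips_needed (long dist, long abs_inc, long unroll_number)
{
  long iterations = dist / abs_inc;	/* iterations of the original loop */
  return (iterations + unroll_number - 1) / unroll_number;	/* round up */
}
```

With the PR's values, old_count (10, 1, 4) and trips_needed (10, 1, 4)
disagree, which is the miscount described above.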

A little thought shows the correct calculation is

	The amount we increment per (partially) unrolled loop.
        t1 = abs_inc * unroll_number;

	The number of times we'll go fully round the loop.
        n = abs (final - initial) / t1;

	Plus any partial loops.
        n += (abs (final - initial) % t1) >= abs_inc;
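
As a sanity check, the corrected calculation can be written out as a
standalone sketch next to the add-and-shift form the patch emits.  This
is not GCC code: new_count and new_count_shift are hypothetical helper
names, dist stands for abs (final - initial), and the shift version
assumes abs_inc * unroll_number == 1 << shift_count.

```c
/* Hypothetical helper, not GCC code: the corrected formula.
   dist stands for abs (final - initial).  */
long
new_count (long dist, long abs_inc, long unroll_number)
{
  long t1 = abs_inc * unroll_number;	/* increment per unrolled trip */
  long n = dist / t1;			/* full trips */
  n += (dist % t1) >= abs_inc;		/* plus one partial trip, if any */
  return n;
}

/* Hypothetical helper: the same value from one add and one shift,
   assuming abs_inc * unroll_number == 1 << shift_count and dist >= 0.  */
long
new_count_shift (long dist, long abs_inc, long unroll_number,
		 int shift_count)
{
  long t1 = abs_inc * unroll_number;
  return (dist + t1 - abs_inc) >> shift_count;
}
```

The two agree because adding t1 - abs_inc before dividing by t1 carries
into the quotient exactly when the remainder is at least abs_inc.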


	PR optimization/7130
	* doloop.c (doloop_modify_runtime): Correct count for unrolled loops.

This needs to go on the 3.1 branch too.  OK, assuming my powerpc-linux
bootstrap and regression tests pass?

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

Index: gcc/doloop.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doloop.c,v
retrieving revision 1.20
diff -u -p -r1.20 doloop.c
--- gcc/doloop.c	24 Jun 2002 02:16:42 -0000	1.20
+++ gcc/doloop.c	11 Jul 2002 07:49:43 -0000
@@ -552,6 +552,7 @@ doloop_modify_runtime (loop, iterations_
 {
   const struct loop_info *loop_info = LOOP_INFO (loop);
   HOST_WIDE_INT abs_inc;
+  HOST_WIDE_INT abs_loop_inc;
   int neg_inc;
   rtx diff;
   rtx sequence;
@@ -601,7 +602,7 @@ doloop_modify_runtime (loop, iterations_
 
        t1 = abs_inc * unroll_number;
        n = abs (final - initial) / t1;
-       n += (abs (final - initial) % t1) > t1 - abs_inc;
+       n += (abs (final - initial) % t1) >= abs_inc;
 
      The division and modulo operations can be avoided by requiring
      that the increment is a power of 2 (precondition_loop_p enforces
@@ -667,20 +668,21 @@ doloop_modify_runtime (loop, iterations_
 	}
     }
 
-  if (abs_inc * loop_info->unroll_number != 1)
+  abs_loop_inc = abs_inc * loop_info->unroll_number;
+  if (abs_loop_inc != 1)
     {
       int shift_count;
 
-      shift_count = exact_log2 (abs_inc * loop_info->unroll_number);
+      shift_count = exact_log2 (abs_loop_inc);
       if (shift_count < 0)
 	abort ();
 
-      if (abs_inc != 1)
-	diff = expand_simple_binop (GET_MODE (diff), PLUS,
-				    diff, GEN_INT (abs_inc - 1),
-				    diff, 1, OPTAB_LIB_WIDEN);
+      diff = expand_simple_binop (GET_MODE (diff), PLUS,
+				  diff, GEN_INT (abs_loop_inc - abs_inc),
+				  diff, 1, OPTAB_LIB_WIDEN);
 
-      /* (abs (final - initial) + abs_inc - 1) / (abs_inc * unroll_number)  */
+      /* (abs (final - initial) + abs_inc * unroll_number - abs_inc)
+	 / (abs_inc * unroll_number)  */
       diff = expand_simple_binop (GET_MODE (diff), LSHIFTRT,
 				  diff, GEN_INT (shift_count),
 				  diff, 1, OPTAB_LIB_WIDEN);


