This is the mail archive of the gcc-prs@gcc.gnu.org mailing list for the GCC project.
Re: optimization/7130: miscompiled code for gcc-3.1 in powerpc linux with -funroll-all-loops
- From: Alan Modra <amodra at bigpond dot net dot au>
- To: nobody at gcc dot gnu dot org
- Cc: gcc-prs at gcc dot gnu dot org
- Date: 11 Jul 2002 07:56:01 -0000
- Subject: Re: optimization/7130: miscompiled code for gcc-3.1 in powerpc linux with -funroll-all-loops
- Reply-to: Alan Modra <amodra at bigpond dot net dot au>
The following reply was made to PR optimization/7130; it has been noted by GNATS.
From: Alan Modra <amodra@bigpond.net.au>
To: yozo@cs.berkeley.edu
Cc: gcc-gnats@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: optimization/7130: miscompiled code for gcc-3.1 in powerpc linux with -funroll-all-loops
Date: Thu, 11 Jul 2002 17:21:57 +0930
doloop.c:doloop_modify_runtime says:
If the loop has been unrolled, then the loop body has been
preconditioned to iterate a multiple of unroll_number times. If
abs_inc is != 1, the full calculation is
t1 = abs_inc * unroll_number;
n = abs (final - initial) / t1;
n += (abs (final - initial) % t1) > t1 - abs_inc;
This is wrong. Taking the example in the PR, we have
abs_inc = 1
unroll_number = 4
abs (final - initial) = 10
=> t1 == 4
abs (final - initial) % t1 == 2
=> n == 10 / 4 + (2 > 4 - 1) == 2 + 0 == 2
We want n == 3, to go around the loop fully twice, and once partially.
A little thought shows the correct calculation is
The amount we increment per (partially) unrolled loop
t1 = abs_inc * unroll_number;
The number of times we'll go fully round the loop.
n = abs (final - initial) / t1;
Plus any partial loops.
n += (abs (final - initial) % t1) >= abs_inc;
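
To make the difference concrete, here is a minimal sketch of the two calculations in plain C. The function names count_old and count_new are hypothetical, and long stands in for HOST_WIDE_INT; only the formulas themselves are taken from the comment and the correction above. The parameter diff plays the role of abs (final - initial).

```c
#include <assert.h>

/* Hypothetical helpers for illustration; not part of doloop.c. */

/* Iteration count as described by the old comment. */
long count_old(long diff, long abs_inc, long unroll_number)
{
    assert(diff >= 0 && abs_inc > 0 && unroll_number > 0);
    long t1 = abs_inc * unroll_number;   /* increment per unrolled loop */
    long n = diff / t1;                  /* full trips round the loop */
    n += (diff % t1) > t1 - abs_inc;     /* old test for a partial trip */
    return n;
}

/* Iteration count with the corrected partial-trip test. */
long count_new(long diff, long abs_inc, long unroll_number)
{
    assert(diff >= 0 && abs_inc > 0 && unroll_number > 0);
    long t1 = abs_inc * unroll_number;   /* increment per unrolled loop */
    long n = diff / t1;                  /* full trips round the loop */
    n += (diff % t1) >= abs_inc;         /* any remainder of at least one step */
    return n;
}
```

With the values from the PR, count_old (10, 1, 4) gives 2 and count_new (10, 1, 4) gives 3. When the range is an exact multiple of t1, for example count_new (8, 1, 4), no partial trip is added and the result is 2.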
PR optimization/7130
* doloop.c (doloop_modify_runtime): Correct count for unrolled loops.
This needs to go on the 3.1 branch too. OK, assuming my powerpc-linux
bootstrap and regression test passes?
--
Alan Modra
IBM OzLabs - Linux Technology Centre
Index: gcc/doloop.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doloop.c,v
retrieving revision 1.20
diff -u -p -r1.20 doloop.c
--- gcc/doloop.c 24 Jun 2002 02:16:42 -0000 1.20
+++ gcc/doloop.c 11 Jul 2002 07:49:43 -0000
@@ -552,6 +552,7 @@ doloop_modify_runtime (loop, iterations_
{
const struct loop_info *loop_info = LOOP_INFO (loop);
HOST_WIDE_INT abs_inc;
+ HOST_WIDE_INT abs_loop_inc;
int neg_inc;
rtx diff;
rtx sequence;
@@ -601,7 +602,7 @@ doloop_modify_runtime (loop, iterations_
t1 = abs_inc * unroll_number;
n = abs (final - initial) / t1;
- n += (abs (final - initial) % t1) > t1 - abs_inc;
+ n += (abs (final - initial) % t1) >= abs_inc;
The division and modulo operations can be avoided by requiring
that the increment is a power of 2 (precondition_loop_p enforces
@@ -667,20 +668,21 @@ doloop_modify_runtime (loop, iterations_
}
}
- if (abs_inc * loop_info->unroll_number != 1)
+ abs_loop_inc = abs_inc * loop_info->unroll_number;
+ if (abs_loop_inc != 1)
{
int shift_count;
- shift_count = exact_log2 (abs_inc * loop_info->unroll_number);
+ shift_count = exact_log2 (abs_loop_inc);
if (shift_count < 0)
abort ();
- if (abs_inc != 1)
- diff = expand_simple_binop (GET_MODE (diff), PLUS,
- diff, GEN_INT (abs_inc - 1),
- diff, 1, OPTAB_LIB_WIDEN);
+ diff = expand_simple_binop (GET_MODE (diff), PLUS,
+ diff, GEN_INT (abs_loop_inc - abs_inc),
+ diff, 1, OPTAB_LIB_WIDEN);
- /* (abs (final - initial) + abs_inc - 1) / (abs_inc * unroll_number) */
+ /* (abs (final - initial) + abs_inc * unroll_number - abs_inc)
+ / (abs_inc * unroll_number) */
diff = expand_simple_binop (GET_MODE (diff), LSHIFTRT,
diff, GEN_INT (shift_count),
diff, 1, OPTAB_LIB_WIDEN);
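
The patched code reaches the count with an add and a logical right shift instead of a division and modulo. As a sanity check, the sketch below compares the two forms in plain C. The names count_divmod and count_shift are hypothetical, long stands in for HOST_WIDE_INT, and ordinary integer arithmetic stands in for the RTL generated by expand_simple_binop; it assumes abs_inc * unroll_number is a power of 2, as the patch's exact_log2 check requires.

```c
#include <assert.h>

/* Hypothetical helpers for illustration; not part of doloop.c. */

/* Division/modulo form: full trips plus one partial trip if needed. */
long count_divmod(long diff, long abs_inc, long unroll_number)
{
    assert(diff >= 0 && abs_inc > 0 && unroll_number > 0);
    long t1 = abs_inc * unroll_number;
    return diff / t1 + ((diff % t1) >= abs_inc);
}

/* Shift form: (diff + t1 - abs_inc) >> log2 (t1). */
long count_shift(long diff, long abs_inc, long unroll_number)
{
    assert(diff >= 0 && abs_inc > 0 && unroll_number > 0);
    long t1 = abs_inc * unroll_number;

    /* t1 must be a power of 2 for the shift to replace the division. */
    assert((t1 & (t1 - 1)) == 0);

    int shift_count = 0;
    while ((1L << shift_count) < t1)
        shift_count++;

    /* Unsigned shift mirrors the logical right shift (LSHIFTRT). */
    return (long)((unsigned long)(diff + t1 - abs_inc) >> shift_count);
}
```

The two agree because, writing diff as q * t1 + r with 0 <= r < t1, adding t1 - abs_inc carries into the next multiple of t1 exactly when r >= abs_inc. For the PR's values, count_shift (10, 1, 4) is (10 + 3) >> 2, which is 3.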