This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[3.4 PATCH] Fix incorrect loop iteration count
- From: Ulrich Weigand <uweigand at de dot ibm dot com>
- To: gcc-patches at gcc dot gnu dot org
- Date: Wed, 11 May 2005 03:22:29 +0200 (CEST)
- Subject: [3.4 PATCH] Fix incorrect loop iteration count
Hello,
this fixes a bug in GCC 3.4 loop optimization. The number of
iterations for the following loop is computed wrongly:
for (bottom = data, top = data + 36;
top > bottom;
bottom++, top--)
i++;
The RTL for this before the loop optimizer is:
(note 15 59 29 NOTE_INSN_LOOP_BEG)
(code_label 29 15 51 1 5 "" [1 uses])
(note 51 29 23 1 [bb 1] NOTE_INSN_BASIC_BLOCK)
(insn 23 51 25 1 (parallel [
(set (reg/v:SI 44 [ i ])
(plus:SI (reg/v:SI 44 [ i ])
(const_int 1 [0x1])))
(clobber (reg:CC 33 %cc))
]) 146 {addsi3} (nil)
(nil))
(note 25 23 27 1 NOTE_INSN_LOOP_CONT)
(insn 27 25 28 1 (parallel [
(set (reg/v/f:SI 43 [ bottom ])
(plus:SI (reg/v/f:SI 43 [ bottom ])
(const_int 1 [0x1])))
(clobber (reg:CC 33 %cc))
]) 146 {addsi3} (nil)
(nil))
(insn 28 27 64 1 (parallel [
(set (reg/v/f:SI 42 [ top ])
(plus:SI (reg/v/f:SI 42 [ top ])
(const_int -1 [0xffffffffffffffff])))
(clobber (reg:CC 33 %cc))
]) 146 {addsi3} (nil)
(nil))
(note 64 28 17 1 NOTE_INSN_LOOP_VTOP)
(insn 17 64 18 1 (set (reg:CCU 33 %cc)
(compare:CCU (reg/v/f:SI 42 [ top ])
(reg/v/f:SI 43 [ bottom ]))) 33 {*cmpsi_ccu} (nil)
(nil))
(jump_insn 18 17 34 1 (set (pc)
(if_then_else (gtu (reg:CCU 33 %cc)
(const_int 0 [0x0]))
(label_ref 29)
(pc))) 289 {cjump} (nil)
(nil))
The loop_iterations routine (unroll.c) determines the following
loop properties:
initial_value: (reg/v/f:SI 42 [ top ])
comparison_value: (reg/v/f:SI 43 [ bottom ])
final_value: (reg/v/f:SI 43 [ bottom ])
iteration_var: (reg/v/f:SI 42 [ top ])
The code section preceded by this comment:
/* Try to determine the iteration count for loops such
as (for i = init; i < init + const; i++). When running the
loop optimization twice, the first pass often converts simple
loops into this form. */
then calls 'loop_find_equiv_value' on the initial value and
finds out that immediately before the loop, 'top' was equal
to 'bottom + 36', which is certainly true. It then calls
'find_common_reg_term' which determines that 'bottom + 36'
is related to 'bottom', and thus replaces the initial value
by 'bottom + 36'.
This is arguably still correct; for related discussion see:
http://gcc.gnu.org/ml/gcc-patches/2003-07/msg01926.html
However, subsequent code preceded by this comment:
/* If have initial_value = reg + const1 and final_value = reg +
const2, then replace initial_value with const1 and final_value
with const2. This should be safe since we are protected by the
initial comparison before entering the loop if we have a vtop.
For example, a + b < a + c is not equivalent to b < c for all a
when using modulo arithmetic.
now noticed that 'bottom' is a common term between the initial
and final values, and subtracts it from both. This is certainly
incorrect as 'bottom' is not loop invariant.
The following patch fixes this by performing this transformation
only if the common term is loop invariant.
Bootstrapped/regtested on s390-ibm-linux and s390x-ibm-linux
on the 3.4 branch; OK to apply to the branch?
The patch is no longer applicable to 4.0 or mainline since the
offending code was completely removed by:
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg01846.html
Thanks to Oliver Paukstadt for the test case.
Bye,
Ulrich
ChangeLog:
* unroll.c (loop_iterations): Remove common term from initial
and final value only if it is loop invariant.
Index: gcc/unroll.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Attic/unroll.c,v
retrieving revision 1.205
diff -c -p -r1.205 unroll.c
*** gcc/unroll.c 1 Dec 2003 21:57:07 -0000 1.205
--- gcc/unroll.c 10 May 2005 20:05:16 -0000
*************** loop_iterations (struct loop *loop)
*** 3691,3697 ****
??? Without a vtop we could still perform the optimization if we check
the initial and final values carefully. */
if (loop->vtop
! && (reg_term = find_common_reg_term (initial_value, final_value)))
{
initial_value = subtract_reg_term (initial_value, reg_term);
final_value = subtract_reg_term (final_value, reg_term);
--- 3691,3698 ----
??? Without a vtop we could still perform the optimization if we check
the initial and final values carefully. */
if (loop->vtop
! && (reg_term = find_common_reg_term (initial_value, final_value))
! && loop_invariant_p (loop, reg_term))
{
initial_value = subtract_reg_term (initial_value, reg_term);
final_value = subtract_reg_term (final_value, reg_term);
*** /dev/null Tue Oct 26 21:02:43 2004
--- gcc/testsuite/gcc.dg/20050510-1.c Tue May 10 22:04:03 2005
***************
*** 0 ****
--- 1,31 ----
+ /* This used to abort due to incorrect loop iteration count computation. */
+
+ /* { dg-do run } */
+ /* { dg-options "-O2" } */
+
+ int test (unsigned char *data)
+ {
+ unsigned char *top;
+ unsigned char *bottom;
+ unsigned int i = 0;
+
+ for (bottom = data, top = data + 36;
+ top > bottom;
+ bottom++, top--)
+ {
+ i++;
+ }
+
+ return i;
+ }
+
+ int main (void)
+ {
+ unsigned char buffer[36];
+
+ if (test (buffer) != 18)
+ abort ();
+
+ exit (0);
+ }
+
--
Dr. Ulrich Weigand
Linux on zSeries Development
Ulrich.Weigand@de.ibm.com