Loop unroll fixes
David Edelsohn
dje@watson.ibm.com
Fri Sep 14 14:21:00 GMT 2001
Now the unroll.c changes, starting with the bottom hunk which is
easier to explain.
emit_cmp_and_jump_insns() is incorrectly called with unsignedp
always set to zero, regardless of whether the comparison is signed
or unsigned.
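To illustrate why the unsignedp argument matters, here is a minimal
sketch in plain C (not RTL). The helper names ge_signed and ge_unsigned
are made up for this example and do not appear in unroll.c. The same
bit pattern gives opposite answers depending on signedness:

```c
/* Hypothetical helpers, for illustration only. */

/* Signed "a >= b". */
int ge_signed(int a, int b)
{
    return a >= b;
}

/* Unsigned "a >= b" on the same bit patterns. */
int ge_unsigned(unsigned a, unsigned b)
{
    return a >= b;
}
```

With a = -1 and b = 6, the signed test is false, while the unsigned
test on the same bits (all ones) is true. A test emitted as signed
for an unsigned loop comparison can therefore take the wrong branch.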
Just above that, when the loop terminates because of overflow/underflow,
the test for whether the loop can be skipped depends on the original
comparison, not on the increment direction. The comparison is reversed
for the pathological case.
Consider
for (i = 0; --i < 6;)
The condition is true from the beginning. The loop terminates when the
iterator underflows and becomes INT_MAX. The comparison between the
initial and final values is (0 >= 6) because we are testing the
pathological condition.
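As a rough illustration of that loop, the sketch below simulates it
with an 8-bit signed iterator so the iteration count stays small. The
function name count_iterations_i8 and the 8-bit width are assumptions
made for this example. Signed overflow is undefined in ISO C, so the
wraparound is modelled explicitly with unsigned arithmetic:

```c
#include <stdint.h>

/* Hypothetical simulation of: for (i = init; --i < limit;)
   with an 8-bit two's-complement iterator that wraps on underflow.
   Returns the number of times the loop body runs and stores the
   value of i that ended the loop in *final_i. */
unsigned count_iterations_i8(int init, int limit, int *final_i)
{
    uint8_t bits = (uint8_t) init;   /* iterator as a raw bit pattern */
    unsigned n = 0;
    int i;

    for (;;) {
        bits = (uint8_t) (bits - 1); /* --i, wrapping modulo 256 */
        /* Reinterpret the bit pattern as a signed 8-bit value. */
        i = bits >= 128 ? (int) bits - 256 : (int) bits;
        if (!(i < limit))
            break;
        n++;
    }
    *final_i = i;
    return n;
}
```

Starting from 0 with a limit of 6, the body runs for i = -1 down to
-128, and the loop exits only when i wraps around to 127, the 8-bit
counterpart of INT_MAX.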
Finally, the bulk of the patch, the first hunk, adds support for
the pathological case, which is currently handled incorrectly. If the
loop terminates by overflow/underflow, the difference is the distance
from the overflow/underflow value (depending on the increment direction).
Because precondition_loop_p() has already ensured that the increment is
a power of 2, we don't need to worry about the exact value of the
pathological difference: the next computation calculates the difference
modulo the increment.
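The modular-arithmetic argument can be checked with a small sketch.
The names mod_pow2 and standins_agree are hypothetical, and the code
assumes that UINT_MAX == 2 * INT_MAX + 1, as on common targets. It
compares the distance to the exact signed wrap points (INT_MAX + 1 and
INT_MAX) with the 0 and ~0 stand-ins, all reduced modulo a power of 2:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: value modulo a power-of-2 modulus, using an
   AND as the preconditioning code does. */
unsigned mod_pow2(unsigned value, unsigned modulus)
{
    assert(modulus != 0 && (modulus & (modulus - 1)) == 0);
    return value & (modulus - 1);
}

/* Returns 1 if the exact distances and the 0/~0 stand-ins agree
   modulo `modulus` for the given initial value. */
int standins_agree(unsigned init, unsigned modulus)
{
    unsigned exact_up    = (unsigned) INT_MAX + 1u - init; /* to INT_MAX+1 */
    unsigned approx_up   = 0u - init;                      /* negation */
    unsigned exact_down  = (unsigned) INT_MAX - init;      /* to INT_MAX */
    unsigned approx_down = ~init;                          /* ~0 - init */

    return mod_pow2(exact_up, modulus) == mod_pow2(approx_up, modulus)
        && mod_pow2(exact_down, modulus) == mod_pow2(approx_down, modulus);
}
```

The results agree because INT_MAX + 1 is itself a power of 2, so it is
congruent to 0, and INT_MAX is congruent to ~0, modulo any smaller
power of 2.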
David
Index: unroll.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/unroll.c,v
retrieving revision 1.125.4.1
diff -u -r1.125.4.1 unroll.c
--- unroll.c 2001/04/21 18:43:40 1.125.4.1
+++ unroll.c 2001/07/16 21:09:42
@@ -900,6 +900,9 @@
register rtx diff;
rtx *labels;
int abs_inc, neg_inc;
+ enum rtx_code cc = loop_info->comparison_code;
+ int less_p = (cc == LE || cc == LEU || cc == LT || cc == LTU);
+ int unsigned_p = (cc == LEU || cc == GEU || cc == LTU || cc == GTU);
map->reg_map = (rtx *) xmalloc (maxregnum * sizeof (rtx));
@@ -934,9 +937,23 @@
We must copy the final and initial values here to avoid
improperly shared rtl. */
- diff = expand_binop (mode, sub_optab, copy_rtx (final_value),
- copy_rtx (initial_value), NULL_RTX, 0,
- OPTAB_LIB_WIDEN);
+ /* We have to deal with for (i = 0; --i < 6;) type loops.
+ For such loops the real final value is the first time the
+ loop variable overflows, so the diff we calculate is the
+ distance from the overflow value. This is 0 or ~0 for
+ unsigned loops depending on the direction, or INT_MAX,
+ INT_MAX+1 for signed loops. We really do not need the
+ exact value, since we are only interested in the diff
+ modulo the increment, and the increment is a power of 2,
+ so we can pretend that the overflow value is 0/~0. */
+
+ if (cc == NE || less_p != neg_inc)
+ diff = expand_binop (mode, sub_optab, copy_rtx (final_value),
+ copy_rtx (initial_value), NULL_RTX, 0,
+ OPTAB_LIB_WIDEN);
+ else
+ diff = expand_unop (mode, neg_inc ? one_cmpl_optab : neg_optab,
+ copy_rtx (initial_value), NULL_RTX, 0);
/* Now calculate (diff % (unroll * abs (increment))) by using an
and instruction. */
@@ -957,11 +974,11 @@
case. This check does not apply if the loop has a NE
comparison at the end. */
- if (loop_info->comparison_code != NE)
+ if (cc != NE)
{
emit_cmp_and_jump_insns (initial_value, final_value,
- neg_inc ? LE : GE,
- NULL_RTX, mode, 0, 0, labels[1]);
+ less_p ? GE : LE, NULL_RTX,
+ mode, unsigned_p, 0, labels[1]);
JUMP_LABEL (get_last_insn ()) = labels[1];
LABEL_NUSES (labels[1])++;
}