Loop unroll fixes

Fri Sep 14 14:21:00 GMT 2001

	Now the unroll.c changes, starting with the bottom hunk which is
easier to explain.

	emit_cmp_and_jump_insns() is incorrectly called with unsignedp
uniformly set to zero, regardless of whether the loop comparison is signed
or unsigned.
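
To illustrate why the unsignedp argument matters, here is a minimal
standalone sketch (not GCC code; the helper names ge_signed and ge_unsigned
are made up for this example) showing that the same two bit patterns can
compare differently under a signed and an unsigned comparison:

```c
#include <stdint.h>

/* Hypothetical helper: compare the operands as signed values,
   which is what passing unsignedp == 0 asks for.  */
static int ge_signed(int32_t a, int32_t b)
{
    return a >= b;
}

/* Hypothetical helper: compare the same bit patterns as unsigned
   values, which is what a GEU/LEU/GTU/LTU loop needs.  The cast
   from int32_t to uint32_t is well defined (modulo 2^32).  */
static int ge_unsigned(int32_t a, int32_t b)
{
    return (uint32_t)a >= (uint32_t)b;
}
```

With a = -1 and b = 1, ge_signed() is false but ge_unsigned() is true,
because -1 has the bit pattern of the largest unsigned value.  A guard
emitted with the wrong signedness can therefore take the wrong branch.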

	Just above that call, when the loop terminates because of
overflow/underflow, the test for whether the loop can be skipped must
depend on the original comparison, not on the increment direction.  For
the pathological case the comparison is therefore the reverse of what the
increment direction alone would suggest.

Consider

for (i = 0; --i < 6;)

The condition is true from the first iteration.  The loop terminates only
when the iterator underflows and becomes INT_MAX.  The comparison between
the initial and final values is (0 >= 6), because we are testing the
pathological condition.
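
As a rough illustration, the loop above can be simulated with an 8-bit
counter and explicit wraparound, which avoids relying on signed overflow
in C.  This is a sketch, not GCC code: all helper names are invented, and
the two skip tests only mirror my reading of the old (neg_inc ? LE : GE)
and new (less_p ? GE : LE) guards for this one loop.

```c
#include <stdint.h>

/* Hypothetical helper: decrement with explicit 8-bit wraparound,
   so INT8_MIN - 1 becomes INT8_MAX without undefined behaviour.  */
static int8_t dec_wrap_i8(int8_t i)
{
    return (i == INT8_MIN) ? INT8_MAX : (int8_t)(i - 1);
}

/* Count body executions of: for (i = initial; --i < limit;)  */
static int count_iterations_i8(int8_t initial, int8_t limit)
{
    int n = 0;
    int8_t i = initial;

    for (;;) {
        i = dec_wrap_i8(i);
        if (!(i < limit))
            break;              /* exit only once i has wrapped past limit */
        n++;
    }
    return n;
}

/* Old guard for a decrementing loop (neg_inc set): initial <= final.  */
static int skip_by_direction(int8_t initial, int8_t final)
{
    return initial <= final;
}

/* New guard for a "less than" loop (less_p set): initial >= final.  */
static int skip_by_comparison(int8_t initial, int8_t final)
{
    return initial >= final;
}
```

For initial = 0 and limit = 6, count_iterations_i8() returns 128: the body
runs for i = -1 down to -128 and stops when i wraps to 127.  The
direction-based guard (0 <= 6) would claim the loop can be skipped, while
the comparison-based guard (0 >= 6) correctly says it cannot.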

	Finally, the bulk of the patch, the first hunk, adds support for
the pathological case, which is currently handled improperly.  If the loop
terminates by overflow/underflow, the difference is the distance from the
initial value to the overflow/underflow point (which one depends on the
increment direction).  Because precondition_loop_p() has already ensured
that the increment is a power of 2, we don't need to worry about the exact
value of the pathological difference: the next computation reduces the
difference modulo the increment.
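
The modular argument can be checked with a small standalone sketch (again
not GCC code; the function names are invented, and mask stands for a
power of 2 minus one, smaller than the sign bit).  Unsigned arithmetic is
used throughout so that wraparound is well defined:

```c
#include <limits.h>

/* Distance from initial to a given wrap value, reduced modulo
   (mask + 1), where mask + 1 is assumed to be a power of 2.  */
static unsigned residue(unsigned wrap_value, int initial, unsigned mask)
{
    return (wrap_value - (unsigned)initial) & mask;
}

/* What the patch computes when counting up: -initial,
   i.e. it pretends the wrap value is 0.  */
static unsigned residue_neg(int initial, unsigned mask)
{
    return (0u - (unsigned)initial) & mask;
}

/* What the patch computes when counting down: ~initial,
   i.e. it pretends the wrap value is ~0.  */
static unsigned residue_cmpl(int initial, unsigned mask)
{
    return ~(unsigned)initial & mask;
}
```

For example, with mask = 7, residue((unsigned)INT_MAX, i, 7) equals
residue_cmpl(i, 7) and residue((unsigned)INT_MIN, i, 7) equals
residue_neg(i, 7) for any i, because INT_MAX and ~0 agree in their low
bits, as do INT_MIN and 0.  This is why the signed and unsigned cases can
share the same two unary operations.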

David

 Index: unroll.c
 ===================================================================
 RCS file: /cvs/gcc/gcc/gcc/unroll.c,v
 retrieving revision 1.125.4.1
 diff -u -r1.125.4.1 unroll.c
 --- unroll.c   2001/04/21 18:43:40     1.125.4.1
 +++ unroll.c   2001/07/16 21:09:42
 @@ -900,6 +900,9 @@
           register rtx diff;
           rtx *labels;
           int abs_inc, neg_inc;
 +        enum rtx_code cc = loop_info->comparison_code;
 +        int less_p     = (cc == LE  || cc == LEU || cc == LT  || cc == LTU);
 +        int unsigned_p = (cc == LEU || cc == GEU || cc == LTU || cc == GTU);

           map->reg_map = (rtx *) xmalloc (maxregnum * sizeof (rtx));

 @@ -934,9 +937,23 @@
              We must copy the final and initial values here to avoid
              improperly shared rtl.  */

 -        diff = expand_binop (mode, sub_optab, copy_rtx (final_value),
 -                             copy_rtx (initial_value), NULL_RTX, 0,
 -                             OPTAB_LIB_WIDEN);
 +        /* We have to deal with for (i = 0; --i < 6;) type loops.
 +           For such loops the real final value is the first time the
 +           loop variable overflows, so the diff we calculate is the
 +           distance from the overflow value.  This is 0 or ~0 for
 +           unsigned loops depending on the direction, or INT_MAX,
 +           INT_MAX+1 for signed loops.  We really do not need the
 +           exact value, since we are only interested in the diff
 +           modulo the increment, and the increment is a power of 2,
 +           so we can pretend that the overflow value is 0/~0 */
 +
 +        if (cc == NE || less_p != neg_inc)
 +          diff = expand_binop (mode, sub_optab, copy_rtx (final_value),
 +                               copy_rtx (initial_value), NULL_RTX, 0,
 +                               OPTAB_LIB_WIDEN);
 +        else
 +          diff = expand_unop (mode, neg_inc ? one_cmpl_optab : neg_optab,
 +                              copy_rtx (initial_value), NULL_RTX, 0);

           /* Now calculate (diff % (unroll * abs (increment))) by using an
              and instruction.  */
 @@ -957,11 +974,11 @@
              case.  This check does not apply if the loop has a NE
              comparison at the end.  */

 -        if (loop_info->comparison_code != NE)
 +        if (cc != NE)
             {
               emit_cmp_and_jump_insns (initial_value, final_value,
 -                                     neg_inc ? LE : GE,
 -                                     NULL_RTX, mode, 0, 0, labels[1]);
 +                                     less_p ? GE : LE, NULL_RTX,
 +                                     mode, unsigned_p, 0, labels[1]);
               JUMP_LABEL (get_last_insn ()) = labels[1];
               LABEL_NUSES (labels[1])++;
             }