[patch] Fix loop bound comparison in the vectorizer

Ira Rosen IRAR@il.ibm.com
Tue Sep 11 12:06:00 GMT 2007


This part of the x86 vect cost model patch

2007-09-10  Harsha Jagasia <harsha.jagasia@amd.com>
            Jan Sjodin <jan.sjodin@amd.com>

        * tree-vect-analyze.c (vect_analyze_operations): Change
        comparison of loop iterations with threshold to less than
        or equal to instead of less than. Reduce
        min_scalar_loop_bound by one.

makes the threshold negative in the default case where
PARAM_MIN_VECT_LOOP_BOUND is 0.

  min_scalar_loop_bound = ((PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND)
                            * vectorization_factor) - 1);
...
  th = (unsigned) min_scalar_loop_bound;
...

and since the conversion of -1 to unsigned wraps around to the largest
unsigned value, this makes the following condition always true (at least
on x86_64-linux):

  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
      && LOOP_VINFO_INT_NITERS (loop_vinfo) <= th)
    {
      if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS))
        fprintf (vect_dump, "not vectorized: vectorization not "
                 "profitable.");
...
      return false;
    }

and no loop can get vectorized now.
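
To illustrate the wraparound, here is a minimal standalone sketch. It is not
the vectorizer code: the helper names threshold and rejected_as_unprofitable
are hypothetical, and the iteration count is modelled as a plain unsigned int,
which is an assumption about the types involved.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the threshold computation quoted above:
   (param * vectorization_factor) - 1, then converted to unsigned.  */
static unsigned int
threshold (int min_vect_loop_bound, int vectorization_factor)
{
  assert (vectorization_factor > 0);
  int min_scalar_loop_bound
    = (min_vect_loop_bound * vectorization_factor) - 1;
  /* With min_vect_loop_bound == 0 this is -1, and converting -1 to
     unsigned int yields UINT_MAX.  */
  return (unsigned int) min_scalar_loop_bound;
}

/* Hypothetical stand-in for the profitability test: a loop with a
   known iteration count is rejected when niters <= th.  */
static bool
rejected_as_unprofitable (unsigned int niters, unsigned int th)
{
  return niters <= th;
}
```

With the parameter at 0, threshold (0, 4) is UINT_MAX, so
rejected_as_unprofitable returns true for every iteration count. With the
parameter at 1, threshold (1, 4) is 3, so a loop with 4 iterations passes
the check and one with 3 does not.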

I suggest changing the default and the minimum value of
PARAM_MIN_VECT_LOOP_BOUND to 1. In addition to fixing the above problem, it
seems reasonable to vectorize loops with at least one iteration.

Bootstrapping with vectorization enabled and testing on x86_64-linux. O.K.
for mainline once the testing completes?

Thanks,
Ira

ChangeLog:

      * params.def (PARAM_MIN_VECT_LOOP_BOUND): Change default and minimum
      to 1.

Index: params.def
===================================================================
--- params.def  (revision 128363)
+++ params.def  (working copy)
@@ -148,7 +148,7 @@ DEFPARAM (PARAM_MAX_VARIABLE_EXPANSIONS,
 DEFPARAM (PARAM_MIN_VECT_LOOP_BOUND,
          "min-vect-loop-bound",
          "If -ftree-vectorize is used, the minimal loop bound of a loop to be considered for vectorization",
-         0, 0, 0)
+         1, 1, 0)

 /* The maximum number of instructions to consider when looking for an
    instruction to fill a delay slot.  If more than this arbitrary


