This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/66558] Missed vectorization of loop with control flow


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66558

--- Comment #1 from alalaw01 at gcc dot gnu.org ---
The strategy could be similar to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013
except finding the last set bit rather than the first (and with no jump out of
the loop).

That is, in the loop body:

  v_pred = (a[i] > threshold) for each element
  if (any element of v_pred set)
    v_save_pred = v_pred
    v_save_i = {i, i+1, i+2, i+3}
    v_last = v_save_i // or a different expression, as is assigned to 'last'

and in the epilogue,

  last = v_last[ rightmost set element in v_save_pred ]

where the rightmost set element could be found via narrow/trunc and 'bsr' (on
x86), or more generally,

  idx = reduc_max_expr (v_save_pred ? v_save_i : 0)
  // any reduction will do here, as only one element will be non-zero:
  last = reduc_max_expr (v_save_i == idx ? v_last : 0)
  // or alternatively:
  last = v_last[ idx & (vec_num_elts - 1) ]
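For illustration, here is a minimal scalar C sketch that emulates the scheme above with 4-element blocks standing in for vector registers. It is not what the vectorizer would emit. The loop shape (recording the index of the last element above a threshold), the function names, and VEC_NUM_ELTS are assumptions made for this example; the epilogue uses the max reduction followed by the 'idx & (vec_num_elts - 1)' lane lookup.

```c
#include <assert.h>
#include <stddef.h>

#define VEC_NUM_ELTS 4  /* assumed vector width; must be a power of two */

/* Scalar reference loop (assumed shape of the loop being vectorized):
   index of the last element greater than threshold, or -1 if none.  */
int last_above_scalar(const int *a, size_t n, int threshold)
{
    int last = -1;
    for (size_t i = 0; i < n; i++)
        if (a[i] > threshold)
            last = (int)i;
    return last;
}

/* Blocked emulation: each inner loop over lanes stands in for one
   vector operation.  Hypothetical helper, for illustration only.  */
int last_above_blocked(const int *a, size_t n, int threshold)
{
    int v_save_pred[VEC_NUM_ELTS] = {0};
    int v_save_i[VEC_NUM_ELTS] = {0};
    int v_last[VEC_NUM_ELTS] = {0};
    int found = 0;
    size_t i = 0;

    /* The lane lookup below relies on a power-of-two width.  */
    assert((VEC_NUM_ELTS & (VEC_NUM_ELTS - 1)) == 0);

    /* Loop body: no jump out of the loop, only a guarded save.  */
    for (; i + VEC_NUM_ELTS <= n; i += VEC_NUM_ELTS) {
        int v_pred[VEC_NUM_ELTS];
        int any = 0;
        for (int l = 0; l < VEC_NUM_ELTS; l++) {
            v_pred[l] = a[i + l] > threshold;
            any |= v_pred[l];
        }
        if (any) {
            for (int l = 0; l < VEC_NUM_ELTS; l++) {
                v_save_pred[l] = v_pred[l];
                v_save_i[l] = (int)(i + l);
                v_last[l] = v_save_i[l];  /* value assigned to 'last' */
            }
            found = 1;
        }
    }

    /* Epilogue: idx = reduc_max (v_save_pred ? v_save_i : 0), then
       pick the lane.  Blocks start at multiples of VEC_NUM_ELTS, so
       idx & (VEC_NUM_ELTS - 1) is the lane number.  */
    int last = -1;
    if (found) {
        int idx = 0;
        for (int l = 0; l < VEC_NUM_ELTS; l++) {
            int cand = v_save_pred[l] ? v_save_i[l] : 0;
            if (cand > idx)
                idx = cand;
        }
        last = v_last[idx & (VEC_NUM_ELTS - 1)];
    }

    /* Scalar tail for the elements that do not fill a whole block.  */
    for (; i < n; i++)
        if (a[i] > threshold)
            last = (int)i;

    return last;
}
```

Both functions should agree on any input; the 'found' flag is an addition of this sketch so that the epilogue is skipped when no block ever matched.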

