This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/66558] Missed vectorization of loop with control flow
- From: "alalaw01 at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 16 Jun 2015 15:49:38 +0000
- Subject: [Bug tree-optimization/66558] Missed vectorization of loop with control flow
- Auto-submitted: auto-generated
- References: <bug-66558-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66558
--- Comment #1 from alalaw01 at gcc dot gnu.org ---
The strategy could be similar to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013
except that it finds the last set bit rather than the first (and needs no jump
out of the loop).
That is, in the loop body:
  v_pred = (a[i] > threshold) for each element
  if (any element of v_pred set)
    v_save_pred = v_pred
    v_save_i = {i, i+1, i+2, i+3}
    v_last = v_save_i // or a different expression, as is assigned to 'last'
and in the epilogue,
  last = v_last[ rightmost set element in v_save_pred ]
where the rightmost set element could be found via narrow/trunc and 'bsr' (on
x86), or more generally,
  idx = reduc_max_expr (v_save_pred ? v_save_i : 0)
  // any reduction will do here, as only one element will be non-zero:
  last = reduc_max_expr (v_save_i == idx ? v_last : 0)
  // or alternatively:
  last = v_last[ idx & (vec_num_elts - 1) ]
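
For illustration, the scheme above can be emulated lane by lane in plain C.
This is only a sketch under assumptions not stated in the comment: the loop
being vectorized is taken to be "if (a[i] > threshold) last = i;" with 'last'
starting at 0, the vectorization factor is 4, and the names VF,
last_above_scalar and last_above_emulated are hypothetical. It uses the
"idx & (vec_num_elts - 1)" variant of the epilogue, which relies on the vector
loop starting at a multiple of VF and on VF being a power of two.

```c
#include <stddef.h>

#define VF 4 /* assumed number of lanes; must be a power of two */

/* Assumed scalar form of the loop: index of the last element above
   threshold, or 0 if there is none. */
unsigned last_above_scalar(const unsigned *a, size_t n, unsigned threshold)
{
    unsigned last = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] > threshold)
            last = (unsigned)i;
    return last;
}

/* Lane-by-lane emulation of the proposed vector strategy. */
unsigned last_above_emulated(const unsigned *a, size_t n, unsigned threshold)
{
    int v_save_pred[VF] = {0};
    unsigned v_save_i[VF] = {0};
    unsigned v_last[VF] = {0}; /* every lane holds the initial 'last' (0) */
    size_t i = 0;

    /* Loop body: save predicate, indices and values whenever any lane hits. */
    for (; i + VF <= n; i += VF) {
        int v_pred[VF];
        int any = 0;
        for (int l = 0; l < VF; l++) {
            v_pred[l] = a[i + l] > threshold;
            any |= v_pred[l];
        }
        if (any) {
            for (int l = 0; l < VF; l++) {
                v_save_pred[l] = v_pred[l];
                v_save_i[l] = (unsigned)(i + l);
                v_last[l] = v_save_i[l];
            }
        }
    }

    /* Epilogue: idx = reduc_max_expr (v_save_pred ? v_save_i : 0) */
    unsigned idx = 0;
    for (int l = 0; l < VF; l++) {
        unsigned candidate = v_save_pred[l] ? v_save_i[l] : 0;
        if (candidate > idx)
            idx = candidate;
    }
    /* last = v_last[ idx & (vec_num_elts - 1) ] */
    unsigned last = v_last[idx & (VF - 1)];

    /* Scalar tail for the elements that do not fill a whole vector. */
    for (; i < n; i++)
        if (a[i] > threshold)
            last = (unsigned)i;
    return last;
}
```

In this sketch an idx of 0 cannot distinguish "no match" from "match at
element 0". That is harmless here only because both cases yield last == 0; a
loop assigning a different expression to 'last' would presumably need the
no-match case handled separately.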