Bug 66558 - Missed vectorization of loop with control flow
Summary: Missed vectorization of loop with control flow
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 6.0
Importance: P3 normal
Target Milestone: ---
Assignee: Alan Hayward
URL:
Keywords:
Depends on:
Blocks: vectorizer
Reported: 2015-06-16 15:39 UTC by Alan Lawrence
Modified: 2015-12-01 17:13 UTC
See Also:
Host:
Target: x86_64
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Alan Lawrence 2015-06-16 15:39:12 UTC
ICC manages to vectorize the following loop, variants of which appear in several benchmarks:

#define N 256
int a[N];

int
find_last (int threshold)
{
  int last = -1;

  for (int i = 0; i < N; i++)
    if (a[i] > threshold)
      last = i;

  return last;
}
Comment 1 Alan Lawrence 2015-06-16 15:49:38 UTC
Strategy could be similar to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013 except finding the last bit rather than the first (and no jump out of the loop).

That is, in the loop body:

  v_pred = (a[i] > threshold) for each element
  if (any element of v_pred set)
    v_save_pred = v_pred
    v_save_i = {i, i+1, i+2, i+3}
    v_last = v_save_i // or a different expression, as is assigned to 'last'

and in the epilogue,

  last = v_last[ rightmost set element in v_save_pred ]

where the rightmost set element could be done via narrow/trunc and 'bsr' (on x86), or more generally,

  idx = reduc_max_expr (v_save_pred ? v_save_i : 0)
  // any reduction will do here, as only one element will be non-zero:
  last = reduc_max_expr (v_save_i == idx ? v_last : 0)
  // or alternatively:
  last = v_last[ idx & (vec_num_elts - 1) ]
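
The strategy above can be checked in plain C by emulating each vector as a small array. This is only a sketch under stated assumptions: VF (4 lanes), find_last_scalar and find_last_blocked are hypothetical names, N is assumed to be a multiple of VF, and the lane loops stand in for vector instructions rather than showing what GCC would emit.

```c
#include <assert.h>
#include <stdbool.h>

#define N 256
#define VF 4   /* assumed number of lanes per vector (hypothetical) */

int a[N];

/* Scalar reference: the loop from the description.  */
int
find_last_scalar (int threshold)
{
  int last = -1;

  for (int i = 0; i < N; i++)
    if (a[i] > threshold)
      last = i;

  return last;
}

/* Lane-by-lane emulation: remember the predicate and index vector of the
   most recent vector iteration that had any match, then select the
   rightmost matching lane in the epilogue.  */
int
find_last_blocked (int threshold)
{
  bool v_save_pred[VF] = { false };
  int v_last[VF] = { 0 };

  assert (N % VF == 0);   /* assumption: no scalar tail loop needed */

  for (int i = 0; i < N; i += VF)
    {
      bool v_pred[VF];
      bool any = false;

      /* v_pred = (a[i] > threshold) for each element.  */
      for (int l = 0; l < VF; l++)
        {
          v_pred[l] = a[i + l] > threshold;
          any = any || v_pred[l];
        }

      /* if (any element of v_pred set): save predicate and indices.  */
      if (any)
        for (int l = 0; l < VF; l++)
          {
            v_save_pred[l] = v_pred[l];
            v_last[l] = i + l;   /* v_last = v_save_i = {i, i+1, ...} */
          }
    }

  /* Epilogue: last = v_last[rightmost set element in v_save_pred].  */
  for (int l = VF - 1; l >= 0; l--)
    if (v_save_pred[l])
      return v_last[l];

  /* Not spelled out in the pseudocode: nothing matched, so return the
     initial value of 'last'.  */
  return -1;
}
```

Because every saved lane starts out false, a run with no match falls through to -1, which is the initial value of 'last' in the scalar loop.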
Comment 2 Alan Lawrence 2015-06-16 15:53:07 UTC
This generalizes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65947, but vectorizing the predicate as a reduction is not sufficient here.
Comment 3 alahay01 2015-11-13 10:52:06 UTC
Author: alahay01
Date: Fri Nov 13 10:51:34 2015
New Revision: 230297

URL: https://gcc.gnu.org/viewcvs?rev=230297&root=gcc&view=rev
Log:
Optimize condition reductions where the result is an integer induction variable

2015-11-13  Alan Hayward <alan.hayward@arm.com>

gcc/
	PR tree-optimization/66558
	* tree-vect-loop.c (is_integer_induction): Add.
	(vectorizable_reduction): Add integer induction checks.

gcc/testsuite/
	PR tree-optimization/66558
	* gcc.dg/vect/pr65947-1.c: Add checks.
	* gcc.dg/vect/pr65947-2.c: Add checks.
	* gcc.dg/vect/pr65947-3.c: Add checks.
	* gcc.dg/vect/pr65947-4.c: Add checks.
	* gcc.dg/vect/pr65947-5.c: Add checks.
	* gcc.dg/vect/pr65947-6.c: Add checks.
	* gcc.dg/vect/pr65947-10.c: Add checks.
	* gcc.dg/vect/pr65947-12.c: New test.
	* gcc.dg/vect/pr65947-13.c: New test.


Added:
    trunk/gcc/testsuite/gcc.dg/vect/pr65947-12.c
    trunk/gcc/testsuite/gcc.dg/vect/pr65947-13.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/pr65947-1.c
    trunk/gcc/testsuite/gcc.dg/vect/pr65947-10.c
    trunk/gcc/testsuite/gcc.dg/vect/pr65947-2.c
    trunk/gcc/testsuite/gcc.dg/vect/pr65947-3.c
    trunk/gcc/testsuite/gcc.dg/vect/pr65947-4.c
    trunk/gcc/testsuite/gcc.dg/vect/pr65947-5.c
    trunk/gcc/testsuite/gcc.dg/vect/pr65947-6.c
    trunk/gcc/tree-vect-loop.c
    trunk/gcc/tree-vectorizer.h
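
For the find_last loop in the description, the value assigned to 'last' is the induction variable i itself, which is the case the log line names. One way to see why that case is simpler: since i only increases, the last matching index is also the largest one, so a per-lane maximum is enough. The sketch below illustrates that idea only; it is not the code from revision 230297, and VF and find_last_max are hypothetical names.

```c
#include <assert.h>

#define N 256
#define VF 4   /* assumed number of lanes per vector (hypothetical) */

int a[N];

/* Condition reduction whose result is the induction variable, written as
   a per-lane MAX followed by a final reduction across lanes.  */
int
find_last_max (int threshold)
{
  int v_max[VF];

  assert (N % VF == 0);   /* assumption: no scalar tail loop needed */

  /* -1 is below every valid index, so it doubles as "no match".  */
  for (int l = 0; l < VF; l++)
    v_max[l] = -1;

  for (int i = 0; i < N; i += VF)
    for (int l = 0; l < VF; l++)
      {
        /* Select the index where the condition holds, else -1.  */
        int cand = a[i + l] > threshold ? i + l : -1;

        if (cand > v_max[l])
          v_max[l] = cand;
      }

  /* Epilogue: reduce the lanes to a single maximum.  */
  int last = v_max[0];

  for (int l = 1; l < VF; l++)
    if (v_max[l] > last)
      last = v_max[l];

  return last;
}
```

This shortcut depends on the assigned value growing with the loop counter; for an arbitrary expression assigned to 'last', the saved-predicate approach from comment 1 would still be needed.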
Comment 4 rsandifo@gcc.gnu.org 2015-12-01 17:13:26 UTC
Patch applied.