This testcase: int a[32]; int main(int argc, char **argv) { int res = 3; for (int i = 0; i < 32; i++) if (a[i]) res = 7; return res; } does not vectorize at -O3 on x86_64 or aarch64. tree-if-conversion succeeds, giving a loop of form: <bb 3>: # res_10 = PHI <res_1(4), 3(2)> # i_11 = PHI <i_6(4), 0(2)> # ivtmp_9 = PHI <ivtmp_2(4), 32(2)> _5 = a[i_11]; res_1 = _5 != 0 ? 7 : res_10; i_6 = i_11 + 1; ivtmp_2 = ivtmp_9 - 1; if (ivtmp_2 != 0) goto <bb 4>; else goto <bb 5>; <bb 4>: goto <bb 3>; but -fdump-tree-vect-details shows: test.c:9:3: note: Analyze phi: res_10 = PHI <res_1(4), 3(2)> test.c:9:3: note: reduction: not commutative/associative: res_1 = _5 != 0 ? 7 : res_10; test.c:9:3: note: Unknown def-use cycle pattern. ... test.c:9:3: note: vect_is_simple_use: operand res_10 test.c:9:3: note: def_stmt: res_10 = PHI <res_1(4), 3(2)> test.c:9:3: note: Unsupported pattern. test.c:9:3: note: not vectorized: unsupported use in stmt. test.c:9:3: note: unexpected pattern.
Of course, the conditional assignment _is_ commutative and associative (wrt reordering iterations).
Can you please explain how the conditional is commutative. That will be very helpful. Thanks,
Yeah, you're right, it's not commutative, but then, it doesn't need to be. If f(x,y) is "(a[x] ? 7 : y)", then f(0, f(1, ...)) = f(1, f(0, ...)) (associative but not commutative), which is all we need to reorder the iterations of the loop? So if at the end of the loop we have a vector v_tmp_result = { f(8, f(4, f(0, <init>))), f(9, f(5, f(1, <init>))), f(10, f(6, f(2, <init>))), f(11, f(7, f(3, <init>))) } obtained by standard technique for reductions, we then need to reduce the vector to a scalar, which could be (a) if any of the vector elements are equal to the constant 7, then return the constant 7, else the initial value: cond_expr (vec_reduc_or (vec_equals (v_tmp_result, 7)), 7, <init>) indeed you might just vectorize to get the predicates v_tmp2 = { a[8] | a[4] | a[0], a[9] | a[5] | a[1], a[10] | a[6] | a[2], a[11] | a[7] | a[3] } and then reduce to scalar with cond_expr (vec_reduc_or (v_tmp2), 7, 3) (b) alternatively one could exploit the initial value (3) also being a constant and choose an appropriate operator from {max, min, or, and}, e.g. for 3 and 7 either reduc_max_expr(3,7) or reduc_or_expr(3,7) would work.
You definitely need special support for COND_EXPR a reduction operator. And yes, if it's in that simple form then reducing the condition is the thing to do. But then you have more complex reduction operators (generally an issue - we don't support those very well), like if (a[i]) res += x; which AFAICR is quite common as well.
Author: alahay01 Date: Fri Oct 23 12:40:33 2015 New Revision: 229245 URL: https://gcc.gnu.org/viewcvs?rev=229245&root=gcc&view=rev Log: Support for vectorizing conditional expressions 2015-10-23 Alan Hayward <alan.hayward@arm.com> gcc/ PR tree-optimization/65947 * tree-vect-loop.c (vect_is_simple_reduction_1): Find condition reductions. (vect_model_reduction_cost): Add condition reduction costs. (get_initial_def_for_reduction): Add condition reduction initial var. (vect_create_epilog_for_reduction): Add condition reduction epilog. (vectorizable_reduction): Condition reduction support. * tree-vect-stmts.c (vectorizable_condition): Add vect reduction arg * doc/sourcebuild.texi (Vector-specific attributes): Document vect_max_reduc gcc/testsuite PR tree-optimization/65947 * lib/target-supports.exp (check_effective_target_vect_max_reduc): Add. * gcc.dg/vect/pr65947-1.c: New test. * gcc.dg/vect/pr65947-2.c: New test. * gcc.dg/vect/pr65947-3.c: New test. * gcc.dg/vect/pr65947-4.c: New test. * gcc.dg/vect/pr65947-5.c: New test. * gcc.dg/vect/pr65947-6.c: New test. * gcc.dg/vect/pr65947-7.c: New test. * gcc.dg/vect/pr65947-8.c: New test. * gcc.dg/vect/pr65947-9.c: New test. * gcc.dg/vect/pr65947-10.c: New test. * gcc.dg/vect/pr65947-11.c: New test. Added: trunk/gcc/testsuite/gcc.dg/vect/pr65947-1.c trunk/gcc/testsuite/gcc.dg/vect/pr65947-10.c trunk/gcc/testsuite/gcc.dg/vect/pr65947-11.c trunk/gcc/testsuite/gcc.dg/vect/pr65947-2.c trunk/gcc/testsuite/gcc.dg/vect/pr65947-3.c trunk/gcc/testsuite/gcc.dg/vect/pr65947-4.c trunk/gcc/testsuite/gcc.dg/vect/pr65947-5.c trunk/gcc/testsuite/gcc.dg/vect/pr65947-6.c trunk/gcc/testsuite/gcc.dg/vect/pr65947-7.c trunk/gcc/testsuite/gcc.dg/vect/pr65947-8.c trunk/gcc/testsuite/gcc.dg/vect/pr65947-9.c Modified: trunk/gcc/ChangeLog trunk/gcc/doc/sourcebuild.texi trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/lib/target-supports.exp trunk/gcc/tree-vect-loop.c trunk/gcc/tree-vect-stmts.c trunk/gcc/tree-vectorizer.h
FAIL: gcc.dg/vect/pr65947-1.c (internal compiler error) FAIL: gcc.dg/vect/pr65947-1.c (test for excess errors) Excess errors: /opt/gcc/gcc-20151024/gcc/testsuite/gcc.dg/vect/pr65947-1.c:10:1: error: bogus comparison result type vector(4) signed int _35 = _44 == _34; /opt/gcc/gcc-20151024/gcc/testsuite/gcc.dg/vect/pr65947-1.c:10:1: error: the first argument of a VEC_COND_EXPR must be of a boolean vector type of the same number of elements as the result vector(4) int vector(4) signed int _36 = VEC_COND_EXPR <_35, vect_last_1.9_31, _32>; /opt/gcc/gcc-20151024/gcc/testsuite/gcc.dg/vect/pr65947-1.c:10:1: internal compiler error: verify_gimple failed 0xb27adf verify_gimple_in_cfg(function*, bool) ../../gcc/tree-cfg.c:5093 0xa1d47f execute_function_todo ../../gcc/passes.c:1968 0xa1df1f execute_todo ../../gcc/passes.c:2025
(In reply to Andreas Schwab from comment #6) > FAIL: gcc.dg/vect/pr65947-1.c (internal compiler error) > FAIL: gcc.dg/vect/pr65947-1.c (test for excess errors) > Excess errors: > /opt/gcc/gcc-20151024/gcc/testsuite/gcc.dg/vect/pr65947-1.c:10:1: error: > bogus comparison result type > vector(4) signed int > _35 = _44 == _34; > /opt/gcc/gcc-20151024/gcc/testsuite/gcc.dg/vect/pr65947-1.c:10:1: error: the > first argument of a VEC_COND_EXPR must be of a boolean vector type of the > same number of elements as the result > vector(4) int > vector(4) signed int > _36 = VEC_COND_EXPR <_35, vect_last_1.9_31, _32>; > /opt/gcc/gcc-20151024/gcc/testsuite/gcc.dg/vect/pr65947-1.c:10:1: internal > compiler error: verify_gimple failed > 0xb27adf verify_gimple_in_cfg(function*, bool) > ../../gcc/tree-cfg.c:5093 > 0xa1d47f execute_function_todo > ../../gcc/passes.c:1968 > 0xa1df1f execute_todo > ../../gcc/passes.c:2025 VEC_COND_EXPR's have been changed to use boolean vectors for arg1. I'll submit a small patch ASAP,
Work with patch below plus the additional fix up patch for the compiler error.