__attribute__((noinline, noclone)) void
foo (float x[3][32], float y1, float y2, float y3,
     float *z1, float *z2, float *z3)
{
  int i;
  for (i = 0; i < 32; i++)
    {
      z1[i] = -y1 * x[0][i];
      z2[i] = -y2 * x[1][i];
      z3[i] = -y3 * x[2][i];
    }
}

float x[6][32] __attribute__((aligned (32)));

int
main ()
{
  int i;
  for (i = 0; i < 32; i++)
    {
      x[0][i] = i;
      x[1][i] = 7 * i;
      x[2][i] = -5.5 * i;
    }
  for (i = 0; i < 100000000; i++)
    foo (&x[0], 12.5, 0.5, -1.5, &x[3][0], &x[4][0], &x[5][0]);
  return 0;
}

isn't vectorized on x86_64-linux with -O3 -mavx, because there are too many versioning checks for alias. We vectorize it only with --param vect-max-version-for-alias-checks=12.

But I don't see why we'd need to emit that many checks for versioning. Instead of the 12 alias checks we emit now, we could emit just 6: keep the 3 overlap checks between z1, z2 and z3, and for each zN merge the zN vs. &x[0][0], zN vs. &x[1][0] and zN vs. &x[2][0] tests into one test comparing the zN[0] through zN[31] range with the &x[0][0] through &x[2][31] range.

Similarly, if we wanted to do a runtime check for alignment (apparently not the case on x86_64), we could test only the alignment of &x[0][0], because it is provably the same as the alignment of &x[1][0] and &x[2][0].
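To make the 12-versus-6 count concrete, here is a hand-written sketch of what the merged runtime test could look like for foo. It is only an illustration, not the code the vectorizer emits: the helper names ranges_overlap and foo_no_alias are made up for this example, and the check is deliberately conservative (it also rejects harmless cases such as z1 == &x[0][0]).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the byte ranges [a, a + na) and
   [b, b + nb) intersect.  Addresses are compared as integers so that
   pointers into unrelated objects can be compared.  */
static bool
ranges_overlap (const void *a, size_t na, const void *b, size_t nb)
{
  uintptr_t a0 = (uintptr_t) a, b0 = (uintptr_t) b;
  return a0 < b0 + nb && b0 < a0 + na;
}

/* Hypothetical merged check for the loop in foo: returns true if none
   of the six range tests finds an overlap.  */
static bool
foo_no_alias (float x[3][32], float *z1, float *z2, float *z3)
{
  const size_t zlen = 32 * sizeof (float);  /* zN[0] .. zN[31] */
  const size_t xlen = 3 * zlen;             /* x[0][0] .. x[2][31] */

  assert (x && z1 && z2 && z3);

  /* Three store-vs-store checks, kept as they are.  */
  if (ranges_overlap (z1, zlen, z2, zlen)
      || ranges_overlap (z1, zlen, z3, zlen)
      || ranges_overlap (z2, zlen, z3, zlen))
    return false;

  /* Three store-vs-load checks, each replacing three per-row checks
     because x[0], x[1] and x[2] are contiguous.  */
  if (ranges_overlap (z1, zlen, x, xlen)
      || ranges_overlap (z2, zlen, x, xlen)
      || ranges_overlap (z3, zlen, x, xlen))
    return false;

  return true;
}
```

With the main () from the testcase, foo_no_alias (&x[0], &x[3][0], &x[4][0], &x[5][0]) passes all six tests, so the vectorized version of the loop could be taken.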
There is a dup for this bug. The whole alias test construction machinery needs to be re-written to support merging tests for adjacent DRs.
I have made a patch on this issue. However, I don't think the example here is proper. Say z1 == &(x[0][4]) (assume VF=4). Then after unrolling the loop 4 times, there is still no data dependence that prevents vectorization. I think a better example is like the one shown below:

__attribute__((noinline, noclone)) void
foo (float x[3][32], float y1, float y2, float y3,
     float *z1, float *z2, float *z3)
{
  int i;
  for (i = 0; i < 16; i++)
    {
      z1[i] = -y1 * x[0][i*2];
      z2[i] = -y2 * x[1][i*2];
      z3[i] = -y3 * x[2][i*2];
    }
}

Here we have to make sure that z1, z2 and z3 do not alias with x across the whole range being traversed. Then we could merge the alias checks between z1 and &x[0][0:32]/&x[1][0:32]/&x[2][0:32] into one.
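The merging step described above can be sketched as an interval merge over segments that share a base address. This is a minimal illustration and not the code in the patch: struct segment and merge_segments are hypothetical names, and the segment length is assumed to be step times iteration count (2 floats * 16 iterations = 128 bytes for the strided example), which makes the three rows of x exactly adjacent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: bytes [offset, offset + len) measured from
   a common base address.  */
struct segment
{
  size_t offset;
  size_t len;
};

/* Merge adjacent or overlapping segments in place.  SEGS must be
   sorted by offset.  Returns the number of segments left, which is
   the number of range checks still needed against each store.  */
static size_t
merge_segments (struct segment *segs, size_t n)
{
  if (n == 0)
    return 0;

  size_t out = 0;
  for (size_t i = 1; i < n; i++)
    {
      assert (segs[i - 1].offset <= segs[i].offset);  /* sorted input */
      size_t end = segs[out].offset + segs[out].len;
      if (segs[i].offset <= end)
        {
          /* Touching or overlapping: extend the current segment.  */
          size_t new_end = segs[i].offset + segs[i].len;
          if (new_end > end)
            segs[out].len = new_end - segs[out].offset;
        }
      else
        segs[++out] = segs[i];  /* Gap: start a new segment.  */
    }
  return out + 1;
}
```

For the loads from x[0], x[1] and x[2], the input { {0, 128}, {128, 128}, {256, 128} } collapses to the single segment {0, 384}, so each of z1, z2 and z3 needs one check against x instead of three.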
Author: congh
Date: Thu Nov 7 19:29:45 2013
New Revision: 204538

URL: http://gcc.gnu.org/viewcvs?rev=204538&root=gcc&view=rev
Log:
2013-11-07  Cong Hou  <congh@google.com>

    PR tree-optimization/56764
    * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks):
    Combine alias checks if it is possible to amortize the runtime
    overhead.  Return the number of alias checks after merging.
    * tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
    Use the function vect_create_cond_for_alias_checks () to check
    the number of alias checks.

2013-11-07  Cong Hou  <congh@google.com>

    * gcc.dg/vect/vect-alias-check.c: New.

Added:
    trunk/gcc/testsuite/gcc.dg/vect/vect-alias-check.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-data-refs.c
    trunk/gcc/tree-vect-loop-manip.c
    trunk/gcc/tree-vectorizer.h
Author: congh
Date: Fri Nov 8 02:08:05 2013
New Revision: 204557

URL: http://gcc.gnu.org/viewcvs?rev=204557&root=gcc&view=rev
Log:
2013-11-07  Cong Hou  <congh@google.com>

    Backport from mainline
    2013-11-07  Cong Hou  <congh@google.com>

    PR tree-optimization/56764
    * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks):
    Combine alias checks if it is possible to amortize the runtime
    overhead.  Return the number of alias checks after merging.
    * tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
    Use the function vect_create_cond_for_alias_checks () to check
    the number of alias checks.

2013-11-07  Cong Hou  <congh@google.com>

    Backport from mainline
    2013-11-07  Cong Hou  <congh@google.com>

    * gcc.dg/vect/vect-alias-check.c: New.

Added:
    branches/google/gcc-4_8/gcc/testsuite/gcc.dg/vect/vect-alias-check.c
Modified:
    branches/google/gcc-4_8/gcc/ChangeLog
    branches/google/gcc-4_8/gcc/testsuite/ChangeLog
    branches/google/gcc-4_8/gcc/tree-vect-data-refs.c
    branches/google/gcc-4_8/gcc/tree-vect-loop-manip.c
    branches/google/gcc-4_8/gcc/tree-vectorizer.h
Author: congh
Date: Thu Nov 14 21:51:07 2013
New Revision: 204825

URL: http://gcc.gnu.org/viewcvs?rev=204825&root=gcc&view=rev
Log:
2013-11-14  Cong Hou  <congh@google.com>

    Backport from mainline
    2013-11-14  Cong Hou  <congh@google.com>

    * tree-vectorizer.h (struct dr_with_seg_len): Remove the base
    address field as it can be obtained from dr.  Rename the struct.
    * tree-vect-data-refs.c (comp_dr_with_seg_len_pair): Consider
    steps of data references during sort.
    (vect_prune_runtime_alias_test_list): Adjust with the change to
    struct dr_with_seg_len.
    * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks):
    Adjust with the change to struct dr_with_seg_len.

2013-11-07  Cong Hou  <congh@google.com>

    PR tree-optimization/56764
    * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks):
    Combine alias checks if it is possible to amortize the runtime
    overhead.  Return the number of alias checks after merging.
    * tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
    Use the function vect_create_cond_for_alias_checks () to check
    the number of alias checks.

Modified:
    branches/google/gcc-4_8/gcc/ChangeLog
    branches/google/gcc-4_8/gcc/tree-vect-data-refs.c
    branches/google/gcc-4_8/gcc/tree-vect-loop-manip.c
    branches/google/gcc-4_8/gcc/tree-vectorizer.h
This vectorizes now on x86_64 and aarch64 at -O3.