On x86_64-apple-darwin10 the test gcc.dg/vect/bb-slp-25.c fails (see http://gcc.gnu.org/ml/gcc-testresults/2011-09/msg01560.html ). Looking for SLP in bb-slp-25.c.115t.slp I get

189: Failed to SLP the basic block.
189: not vectorized: failed to find SLP opportunities in basic block.
18: Failed to SLP the basic block.
18: not vectorized: failed to find SLP opportunities in basic block.
41: Failed to SLP the basic block.
41: not vectorized: failed to find SLP opportunities in basic block.
48: Failed to SLP the basic block.
48: not vectorized: failed to find SLP opportunities in basic block.

indeed no "basic block vectorized using SLP". However compiling the test with -ftree-vectorizer-verbose=2 returns

...
Vectorizing loop at /opt/gcc/work/gcc/testsuite/gcc.dg/vect/bb-slp-25.c:16

16: created 2 versioning for alias checks.

16: vectorizing stmts using SLP.
16: LOOP VECTORIZED.
...

I have applied r178880 on top of r178869 on powerpc-apple-darwin9 and the tests pass while I get the above results when I run them manually.
(In reply to comment #0)
>
> indeed no "basic block vectorized using SLP". However compiling the test with
> -ftree-vectorizer-verbose=2 returns
>
> ...
> Vectorizing loop at /opt/gcc/work/gcc/testsuite/gcc.dg/vect/bb-slp-25.c:16
>
> 16: created 2 versioning for alias checks.
>
> 16: vectorizing stmts using SLP.
> 16: LOOP VECTORIZED.
> ...

I understand that the loop vectorization somehow worked, so could you please try the following patch to avoid it:

Index: bb-slp-25.c
===================================================================
--- bb-slp-25.c (revision 178880)
+++ bb-slp-25.c (working copy)
@@ -9,7 +9,7 @@

 short src[N], dst[N];

-void foo (short * __restrict dst, short * __restrict src, int h, int stride)
+void foo (short * __restrict dst, short * __restrict src, int h, int stride, int dummy)
 {
   int i;
   h /= 16;
@@ -25,6 +25,8 @@ void foo (short * __restrict dst, short
       dst[7] += A*src[7] + src[7+stride];
       dst += 8;
       src += 8;
+      if (dummy == 32)
+        abort ();
     }
 }

@@ -41,7 +43,7 @@ int main (void)
       src[i] = i;
     }

-  foo (dst, src, N, 8);
+  foo (dst, src, N, 8, 0);

   for (i = 0; i < N/2; i++)
     {

>
> I have applied r178880 on top of r178869 on powerpc-apple-darwin9 and the tests
> pass while I get the above results when I run them manually.

For PowerPC vect_element_align is false, while

/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_element_align } } } */

so we don't expect the basic block to get vectorized.

Thanks,
Ira
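For readers without the test source at hand, the shape of the patched kernel can be sketched roughly as follows. This is a reduced, hypothetical version and not the testsuite file: it uses four statements per iteration instead of eight, a made-up constant `K` in place of the test's `A`, and the invented name `foo_sketch`. Only the extra `dummy` parameter and the never-taken `abort ()` branch are taken from the patch above; the stated intent is to keep the loop vectorizer away from the loop so that basic-block SLP is what gets exercised.

```c
#include <stdlib.h>

#define K 3  /* hypothetical multiplier, stands in for the test's A */

/* Reduced sketch of the patched kernel: a group of adjacent stores per
   iteration, followed by a branch that is never taken at run time.  */
void
foo_sketch (short *__restrict__ dst, short *__restrict__ src,
            int h, int stride, int dummy)
{
  int i;
  h /= 4;
  for (i = 0; i < h; i++)
    {
      dst[0] += K * src[0] + src[stride];
      dst[1] += K * src[1] + src[1 + stride];
      dst[2] += K * src[2] + src[2 + stride];
      dst[3] += K * src[3] + src[3 + stride];
      dst += 4;
      src += 4;
      /* Guard from the patch: extra control flow inside the loop body.  */
      if (dummy == 32)
        abort ();
    }
}
```

A caller passes any value other than 32 for `dummy` (the patch uses 0), so the run-time result is the same as without the guard.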
> I understand that the loop vectorization somehow worked, so could you please
> try the following patch to avoid it:

Sorry, but after the patch I still have

Running /opt/gcc/work/gcc/testsuite/gcc.dg/vect/vect.exp ...
FAIL: gcc.dg/vect/bb-slp-25.c scan-tree-dump-times slp "basic block vectorized using SLP" 1
FAIL: gcc.dg/vect/bb-slp-25.c -flto scan-tree-dump-times slp "basic block vectorized using SLP" 1

=== gcc Summary for unix/-m32 ===

# of expected passes 4
# of unexpected failures 2
Running target unix/-m64 ...
Running /opt/gcc/work/gcc/testsuite/gcc.dg/vect/vect.exp ...
FAIL: gcc.dg/vect/bb-slp-25.c scan-tree-dump-times slp "basic block vectorized using SLP" 1
FAIL: gcc.dg/vect/bb-slp-25.c -flto scan-tree-dump-times slp "basic block vectorized using SLP" 1

=== gcc Summary for unix/-m64 ===

# of expected passes 4
# of unexpected failures 2

=== gcc Summary ===

# of expected passes 8
# of unexpected failures 4
/opt/gcc/build_w/gcc/xgcc version 4.7.0 20110916 (experimental) [trunk revision 178905] (GCC)

The loop is not vectorized:

/opt/gcc/work/gcc/testsuite/gcc.dg/vect/bb-slp-25.c:12: note: vectorized 0 loops in function.

and looking for SLP yields

189: Failed to SLP the basic block.
189: not vectorized: failed to find SLP opportunities in basic block.
43: Failed to SLP the basic block.
43: not vectorized: failed to find SLP opportunities in basic block.
50: Failed to SLP the basic block.
50: not vectorized: failed to find SLP opportunities in basic block.
Well, at least the loop is not vectorized now :). Could you please attach the slp dump (-fdump-tree-slp-details)? Thanks, Ira
Created attachment 25307
slp dump attached
Thanks. Data dependence analysis can't determine dependence between src and dst although they have _restrict_, and it works fine on x86_64-suse-linux for example... Does darwin have a known problem with restrict? Thanks, Ira
> Does darwin have a known problem with restrict?

None I am aware of. BTW what is the difference between '*__restrict__' and '* __restrict' (or '* __restrict__')?
Note that the test succeeds if I replace '* __restrict' with '*__restrict__'
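For reference, GCC accepts both `__restrict` and `__restrict__` in C as alternate spellings of the C99 `restrict` qualifier, so the two declarations below would normally be expected to behave the same; the observation in this thread is that, on this darwin setup, they apparently did not for bb-slp-25.c. The sketch only places the two spellings side by side. The function names and the loop body are hypothetical and are not taken from the test.

```c
/* Spelling used in the original test: '* __restrict'.  */
void
scale_add_a (short * __restrict dst, short * __restrict src, int n)
{
  int i;
  for (i = 0; i < n; i++)
    dst[i] += 2 * src[i];
}

/* Spelling used in the fixed test: '*__restrict__'.  */
void
scale_add_b (short *__restrict__ dst, short *__restrict__ src, int n)
{
  int i;
  for (i = 0; i < n; i++)
    dst[i] += 2 * src[i];
}
```

Compiling both with -fdump-tree-slp-details (or the vectorizer dump flags) and comparing the dumps is one way to check whether a given target treats the two spellings differently.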
Looks like there is a difference ;) I guess it succeeds with the patch to avoid loop vectorization and the fix of restrict together?
> Looks like there is a difference ;)
> I guess it succeeds with the patch to avoid loop vectorization and the fix of restrict together?

Here is the patched test that gives no failure (i.e., yours and the change to restrict):

--- ../_clean/gcc/testsuite/gcc.dg/vect/bb-slp-25.c 2011-09-15 13:34:18.000000000 +0200
+++ gcc/testsuite/gcc.dg/vect/bb-slp-25.c 2011-09-18 12:42:21.000000000 +0200
@@ -9,7 +9,7 @@

 short src[N], dst[N];

-void foo (short * __restrict dst, short * __restrict src, int h, int stride)
+void foo (short *__restrict__ dst, short *__restrict__ src, int h, int stride, int dummy)
 {
   int i;
   h /= 16;
@@ -25,6 +25,8 @@ void foo (short * __restrict dst, short
       dst[7] += A*src[7] + src[7+stride];
       dst += 8;
       src += 8;
+      if (dummy == 32)
+        abort ();
     }
 }

@@ -41,7 +43,7 @@ int main (void)
       src[i] = i;
     }

-  foo (dst, src, N, 8);
+  foo (dst, src, N, 8, 0);

   for (i = 0; i < N/2; i++)
     {
Thanks, I'll commit it.
Author: irar
Date: Sun Sep 18 11:41:43 2011
New Revision: 178942

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=178942
Log:
    PR testsuite/50435
    * gcc.dg/vect/bb-slp-25.c: Add an if to avoid loop vectorization.
    Fix underscores around restrict.

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/bb-slp-25.c
> Thanks, I'll commit it.

Thanks for the quick fix. I'd like to leave this PR open until someone figures out what's wrong with darwin and __restrict. Note that I have replaced all the occurrences of __restrict with __restrict__ that I found in gcc.dg/vect/*, and bb-slp-25.c is the only test for which it mattered.
(In reply to comment #12)
> Note that I have replaced all the occurrences of __restrict with __restrict__
> I have found in gcc.dg/vect/* and bb-slp-25.c is the only test for which it
> mattered.

It probably just doesn't matter in the other tests: in loop vectorization we can use versioning for alias, so the loop can be vectorized even without the restrict information.
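As a rough illustration of what "versioning for alias" means, the loop vectorizer can guard a vectorized copy of a loop with a run-time overlap check and keep the original loop as a fallback. The hand-written analogue below is only a sketch of that idea under stated assumptions: the helper `ranges_disjoint`, the function `add_versioned`, and the exact form of the check are hypothetical and are not what GCC emits. Basic-block SLP has no such fallback in this thread's scenario, which is consistent with restrict mattering only for bb-slp-25.c.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the n-element ranges starting at
   a and b do not overlap, 0 otherwise.  */
static int
ranges_disjoint (const short *a, const short *b, int n)
{
  uintptr_t pa = (uintptr_t) a;
  uintptr_t pb = (uintptr_t) b;
  uintptr_t bytes = (uintptr_t) n * sizeof (short);
  return pa + bytes <= pb || pb + bytes <= pa;
}

/* Two versions of the same loop, selected by a run-time alias check.  */
void
add_versioned (short *dst, const short *src, int n)
{
  int i;
  if (ranges_disjoint (dst, src, n))
    {
      /* No overlap: restrict-qualified copies express that fact, so
         this version is a candidate for vectorization.  */
      short *__restrict__ d = dst;
      const short *__restrict__ s = src;
      for (i = 0; i < n; i++)
        d[i] += s[i];
    }
  else
    {
      /* Possible overlap: keep the element-by-element order.  */
      for (i = 0; i < n; i++)
        dst[i] += src[i];
    }
}
```

Both branches compute the same result whenever the ranges are disjoint; the fallback branch preserves the source-order semantics when they overlap.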
Closing as fixed.