Bug 50435 - FAIL: gcc.dg/vect/bb-slp-25.c (-flto)? scan-tree-dump-times slp "basic block vectorized using SLP" 1
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: testsuite
Version: 4.7.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-16 14:31 UTC by Dominique d'Humieres
Modified: 2012-01-12 17:06 UTC

See Also:
Host: x86_64-apple-darwin10
Target: x86_64-apple-darwin10
Build: x86_64-apple-darwin10
Known to work:
Known to fail:
Last reconfirmed:


Attachments
slp dump attached (5.18 KB, text/plain)
2011-09-16 17:37 UTC, Dominique d'Humieres

Description Dominique d'Humieres 2011-09-16 14:31:17 UTC
On x86_64-apple-darwin10 the test gcc.dg/vect/bb-slp-25.c fails (see http://gcc.gnu.org/ml/gcc-testresults/2011-09/msg01560.html ). Searching for "SLP" in the dump bb-slp-25.c.115t.slp, I get:

189: Failed to SLP the basic block.
189: not vectorized: failed to find SLP opportunities in basic block.
18: Failed to SLP the basic block.
18: not vectorized: failed to find SLP opportunities in basic block.
41: Failed to SLP the basic block.
41: not vectorized: failed to find SLP opportunities in basic block.
48: Failed to SLP the basic block.
48: not vectorized: failed to find SLP opportunities in basic block.

so there is indeed no "basic block vectorized using SLP" line. However, compiling the test with -ftree-vectorizer-verbose=2 reports:

...
Vectorizing loop at /opt/gcc/work/gcc/testsuite/gcc.dg/vect/bb-slp-25.c:16

16: created 2 versioning for alias checks.

16: vectorizing stmts using SLP.
16: LOOP VECTORIZED.
...

I have also applied r178880 on top of r178869 on powerpc-apple-darwin9: there the tests pass, although I get the above results when I run them manually.
Comment 1 Ira Rosen 2011-09-16 15:18:11 UTC
(In reply to comment #0)
> 
> indeed no "basic block vectorized using SLP". However compiling the test with
> -ftree-vectorizer-verbose=2 returns
> 
> ...
> Vectorizing loop at /opt/gcc/work/gcc/testsuite/gcc.dg/vect/bb-slp-25.c:16
> 
> 16: created 2 versioning for alias checks.
> 
> 16: vectorizing stmts using SLP.
> 16: LOOP VECTORIZED.
> ...

I understand that the loop vectorization somehow worked, so could you please try the following patch to avoid it:

Index: bb-slp-25.c
===================================================================
--- bb-slp-25.c (revision 178880)
+++ bb-slp-25.c (working copy)
@@ -9,7 +9,7 @@

 short src[N], dst[N];

-void foo (short * __restrict dst, short * __restrict src, int h, int stride)
+void foo (short * __restrict dst, short * __restrict src, int h, int stride, int dummy)
 {
   int i;
   h /= 16;
@@ -25,6 +25,8 @@ void foo (short * __restrict dst, short
       dst[7] += A*src[7] + src[7+stride];
       dst += 8;
       src += 8;
+      if (dummy == 32)
+        abort ();
    }
 }

@@ -41,7 +43,7 @@ int main (void)
        src[i] = i;
     }

-  foo (dst, src, N, 8);
+  foo (dst, src, N, 8, 0);

   for (i = 0; i < N/2; i++)
     {


> 
> I have applied r178880 on top of r178869 on powerpc-apple-darwin9 and the tests
> pass while I get the above results when I run them manually.

On PowerPC vect_element_align is false, and the scan is restricted to targets where it is true:

/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_element_align } } } */

so we don't expect the basic block to get vectorized there.

Thanks,
Ira
Comment 2 Dominique d'Humieres 2011-09-16 15:53:07 UTC
> I understand that the loop vectorization somehow worked, so could you please
> try the following patch to avoid it:

Sorry, but after the patch I still have

Running /opt/gcc/work/gcc/testsuite/gcc.dg/vect/vect.exp ...
FAIL: gcc.dg/vect/bb-slp-25.c scan-tree-dump-times slp "basic block vectorized using SLP" 1
FAIL: gcc.dg/vect/bb-slp-25.c -flto scan-tree-dump-times slp "basic block vectorized using SLP" 1

		=== gcc Summary for unix/-m32 ===

# of expected passes		4
# of unexpected failures	2
Running target unix/-m64
...
Running /opt/gcc/work/gcc/testsuite/gcc.dg/vect/vect.exp ...
FAIL: gcc.dg/vect/bb-slp-25.c scan-tree-dump-times slp "basic block vectorized using SLP" 1
FAIL: gcc.dg/vect/bb-slp-25.c -flto scan-tree-dump-times slp "basic block vectorized using SLP" 1

		=== gcc Summary for unix/-m64 ===

# of expected passes		4
# of unexpected failures	2

		=== gcc Summary ===

# of expected passes		8
# of unexpected failures	4
/opt/gcc/build_w/gcc/xgcc  version 4.7.0 20110916 (experimental) [trunk revision 178905] (GCC) 

The loop is not vectorized:

/opt/gcc/work/gcc/testsuite/gcc.dg/vect/bb-slp-25.c:12: note: vectorized 0 loops in function.

and looking for SLP yields

189: Failed to SLP the basic block.
189: not vectorized: failed to find SLP opportunities in basic block.
43: Failed to SLP the basic block.
43: not vectorized: failed to find SLP opportunities in basic block.
50: Failed to SLP the basic block.
50: not vectorized: failed to find SLP opportunities in basic block.
Comment 3 Ira Rosen 2011-09-16 16:18:37 UTC
Well, at least the loop is not vectorized now :).
Could you please attach the slp dump (-fdump-tree-slp-details)?

Thanks,
Ira
Comment 4 Dominique d'Humieres 2011-09-16 17:37:17 UTC
Created attachment 25307
slp dump attached
Comment 5 Ira Rosen 2011-09-18 08:52:56 UTC
Thanks.

Data dependence analysis can't determine the dependence between src and dst even though both are declared __restrict, while the same test works fine on x86_64-suse-linux, for example. Does darwin have a known problem with restrict?

Thanks,
Ira
Comment 6 Dominique d'Humieres 2011-09-18 10:41:09 UTC
> Does darwin have a known problem with restrict?

None that I am aware of. BTW, what is the difference between '*__restrict__' and '* __restrict' (or '* __restrict__')?
Comment 7 Dominique d'Humieres 2011-09-18 10:45:08 UTC
Note that the test succeeds if I replace '* __restrict' with '*__restrict__'
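
GCC normally accepts __restrict and __restrict__ as two spellings of the same qualifier, so a difference in behaviour suggests that something rewrites one of them before the compiler proper sees it. One possible explanation, which is an assumption and is not confirmed anywhere in this report, is that a system header defines __restrict as a macro. A minimal sketch for checking that on a given target (the helper names are hypothetical):

```c
#include <stdlib.h>   /* any system header will do; it may define __restrict */
#include <string.h>

/* Two-level stringification so the argument is macro-expanded first. */
#define STR2(x) #x
#define STR(x) STR2(x)

/* Returns 1 if __restrict is a preprocessor macro at this point. */
int restrict_is_macro(void)
{
#ifdef __restrict
    return 1;
#else
    return 0;
#endif
}

/* Text that __restrict turns into after preprocessing.
   If it is not a macro, this is simply "__restrict". */
const char *restrict_spelling(void)
{
    return STR(__restrict);
}

/* Returns 1 if the preprocessor leaves __restrict untouched. */
int restrict_is_untouched(void)
{
    return strcmp(restrict_spelling(), "__restrict") == 0;
}
```

If restrict_is_macro() returns 1, printing restrict_spelling() shows what the qualifier is replaced with; an empty string would mean the qualifier is dropped entirely. Running the preprocessor alone (gcc -E) on the test gives the same information.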
Comment 8 Ira Rosen 2011-09-18 10:48:43 UTC
Looks like there is a difference ;)
I guess it succeeds with the patch to avoid loop vectorization and the fix of restrict together?
Comment 9 Dominique d'Humieres 2011-09-18 10:54:14 UTC
> Looks like there is a difference ;)
> I guess it succeeds with the patch to avoid loop vectorization and the fix of restrict together?

Here is the patched test that gives no failure (i.e., yours and the change to restrict):

--- ../_clean/gcc/testsuite/gcc.dg/vect/bb-slp-25.c	2011-09-15 13:34:18.000000000 +0200
+++ gcc/testsuite/gcc.dg/vect/bb-slp-25.c	2011-09-18 12:42:21.000000000 +0200
@@ -9,7 +9,7 @@
 
 short src[N], dst[N];
 
-void foo (short * __restrict dst, short * __restrict src, int h, int stride)
+void foo (short *__restrict__ dst, short *__restrict__ src, int h, int stride, int dummy)
 {
   int i;
   h /= 16;
@@ -25,6 +25,8 @@ void foo (short * __restrict dst, short 
       dst[7] += A*src[7] + src[7+stride];
       dst += 8;
       src += 8;
+      if (dummy == 32)
+        abort ();
    }
 }
 
@@ -41,7 +43,7 @@ int main (void)
        src[i] = i;
     }
 
-  foo (dst, src, N, 8);
+  foo (dst, src, N, 8, 0);
 
   for (i = 0; i < N/2; i++)
     {
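
For readers without the testsuite at hand, here is a reduced sketch of the shape of the patched kernel. It is not the actual bb-slp-25.c: the name foo_sketch, the values of N and A, and the four-element unrolling (the diff context suggests the real test uses eight) are assumptions made for illustration.

```c
#include <stdlib.h>

#define N 64   /* assumed array size, not taken from the test */
#define A 3    /* assumed constant, not taken from the test */

/* The unrolled statements form one straight-line group of adjacent
   loads and stores, which is what basic-block SLP looks for.  The
   restrict qualifiers promise that dst and src do not overlap. */
void foo_sketch(short *__restrict__ dst, short *__restrict__ src,
                int h, int stride, int dummy)
{
    int i;
    h /= 16;
    for (i = 0; i < h; i++) {
        dst[0] += A * src[0] + src[0 + stride];
        dst[1] += A * src[1] + src[1 + stride];
        dst[2] += A * src[2] + src[2 + stride];
        dst[3] += A * src[3] + src[3 + stride];
        dst += 4;
        src += 4;
        /* Never true when the caller passes dummy == 0.  The call to
           abort() inside the body is there to stop the loop
           vectorizer from taking the loop, as in the patch above. */
        if (dummy == 32)
            abort();
    }
}
```

A call such as foo_sketch (dst, src, N, 4, 0) on two distinct arrays of N shorts never reaches abort(), so the extra parameter changes only what the vectorizer is allowed to do, not the result.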
Comment 10 Ira Rosen 2011-09-18 10:55:19 UTC
Thanks, I'll commit it.
Comment 11 irar 2011-09-18 11:41:48 UTC
Author: irar
Date: Sun Sep 18 11:41:43 2011
New Revision: 178942

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=178942
Log:

	PR testsuite/50435
	* gcc.dg/vect/bb-slp-25.c: Add an if to avoid loop vectorization.
	Fix underscores around restrict.


Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/bb-slp-25.c
Comment 12 Dominique d'Humieres 2011-09-18 13:11:59 UTC
> Thanks, I'll commit it.

Thanks for the quick fix. I'd like to leave this PR open until someone figures out what's wrong with darwin and __restrict.

Note that I have replaced every occurrence of __restrict with __restrict__ that I found in gcc.dg/vect/*, and bb-slp-25.c is the only test for which it mattered.
Comment 13 Ira Rosen 2011-09-19 08:59:44 UTC
(In reply to comment #12)
 
> Note that I have replaced all the occurrences of __restrict with __restrict__ 
> I have found in gcc.dg/vect/* and bb-slp-25.c is the only test for which it
> mattered.

It probably just doesn't matter in the other tests: loop vectorization can fall back on versioning for alias.
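
As a hand-written illustration of what versioning for alias means (this is not the code GCC generates, and the function names are hypothetical): the loop is duplicated behind a runtime overlap check, so the fast copy is safe even when restrict information is missing. Basic-block SLP has no such fallback in this thread's scenario, which would explain why only bb-slp-25.c was affected.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the n-element ranges starting at a and b are disjoint. */
static int ranges_disjoint(const short *a, const short *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    size_t bytes = n * sizeof(short);
    return pa + bytes <= pb || pb + bytes <= pa;
}

void add_versioned(short *dst, const short *src, size_t n)
{
    size_t i;
    if (ranges_disjoint(dst, src, n)) {
        /* Fast version: no overlap, so iterations are independent
           and this copy of the loop may be vectorized. */
        short *__restrict__ d = dst;
        const short *__restrict__ s = src;
        for (i = 0; i < n; i++)
            d[i] += s[i];
    } else {
        /* Fallback version: possible overlap, keep scalar order. */
        for (i = 0; i < n; i++)
            dst[i] += src[i];
    }
}
```

The "created 2 versioning for alias checks" line in the verbose output quoted in the description reports checks of this kind.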
Comment 14 Dominique d'Humieres 2012-01-12 17:06:53 UTC
Closing as fixed.