Bug 92177 - [10 Regression] gcc.dg/vect/bb-slp-22.c FAILs
Summary: [10 Regression] gcc.dg/vect/bb-slp-22.c FAILs
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 10.0
: P2 normal
Target Milestone: 10.2
Assignee: Richard Biener
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-10-22 12:35 UTC by Rainer Orth
Modified: 2020-05-13 07:20 UTC (History)
1 user (show)

See Also:
Host:
Target: sparc-sun-solaris2.11
Build:
Known to work: 11.0
Known to fail: 10.0
Last reconfirmed: 2019-10-22 00:00:00


Attachments
32-bit sparc-sun-solaris2.11 bb-slp-22.c.168t.slp2 (4.39 KB, text/plain)
2019-10-22 12:35 UTC, Rainer Orth
Details
32-bit sparc-sun-solaris2.11 bb-slp-22.c.171.slp2 (4.99 KB, text/plain)
2020-05-01 15:16 UTC, Rainer Orth
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Rainer Orth 2019-10-22 12:35:43 UTC
Created attachment 47085 [details]
32-bit sparc-sun-solaris2.11 bb-slp-22.c.168t.slp2

Between 20191016 (r277071) and 20191021 (r277261), gcc.dg/vect/bb-slp-22.c began
to FAIL on 32 and 64-bit SPARC:

+FAIL: gcc.dg/vect/bb-slp-22.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 "basic block vectorized" 1
+FAIL: gcc.dg/vect/bb-slp-22.c scan-tree-dump-times slp2 "basic block vectorized" 1

The message now appears twice, so this is clearly progress, but the guards need
to be adapted.

Dump attached.
Comment 1 Richard Biener 2019-10-22 13:27:55 UTC
Ah, thanks - that might be an undesided side-effect of r277241

We now vectorize

      out[0] = a0 * x;
      out[1] = a1 * y;
      out[2] = a2 * x;
      out[3] = a3 * y;

as

  _5 = a0_40 * x_44(D);
  _6 = a1_41 * y_45(D);
  _7 = a2_42 * x_44(D);
  _8 = a3_43 * y_45(D);
  _66 = {_7, _8};
  vect_cst__67 = _66;
  _68 = {_5, _6};
  vect_cst__69 = _68;
  MEM <vector(2) unsigned int> [(unsigned int *)&out] = vect_cst__69;
  _71 = &out[0] + 8;
  MEM <vector(2) unsigned int> [(unsigned int *)_71] = vect_cst__67;

since we're no longer fencing the build-from-scalar code via
&& !SLP_TREE_CHILDREN (child).is_empty () (previously we had no
SLP children nodes for the SLP node representing the multiplication).

So the test is no longer testing vectorization of multiplications.

It also shows that BB vectorizing this function at strict basic-block
boundaries is suboptimal.

I'll see what to best do here, clearly a sort-term fix would be to
change the code to make vectorization of the multiplication more
obviously profitable.
Comment 2 CVS Commits 2020-02-05 13:13:51 UTC
The master branch has been updated by Richard Guenther <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:9847df2c9573f1e4b948b5a7272c6aadf8e01c22

commit r10-6449-g9847df2c9573f1e4b948b5a7272c6aadf8e01c22
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Feb 5 14:10:50 2020 +0100

    testsuite/92177 fix for SLP build changes
    
    We're now consistently building SLP operations with only
    scalar defs from scalars which makes the testcase no longer
    testing multiplication vectorization.  The following smuggles
    in a constant making the vector variant profitable for SLP build.
    
    2020-02-05  Richard Biener  <rguenther@suse.de>
    
    	PR testsuite/92177
    	* gcc.dg/vect/bb-slp-22.c: Adjust.
Comment 3 Richard Biener 2020-02-05 13:14:44 UTC
We should test multiplication again, so hopefully fixed (unless Richards recent improvement makes such adventure moot again by pruning the SLP tree based on unsupported ops...)
Comment 4 Rainer Orth 2020-05-01 15:16:36 UTC
Created attachment 48429 [details]
32-bit sparc-sun-solaris2.11 bb-slp-22.c.171.slp2

I just noticed that the test still FAILs, both 32 and 64-bit, and apparently
never stopped.
Comment 5 Rainer Orth 2020-05-01 15:17:01 UTC
Not yet fixed.
Comment 6 Jakub Jelinek 2020-05-04 14:14:49 UTC
Doesn't look like a release blocker though, it seems this isn't a wrong-code issue.
sparc*-*-solaris* is not a vect_int_mult target, as it can't vectorize integer multiplication, but on the testcase in question it doesn't vectorize that:
   [bb-slp-22.c:29:24] _5 = x_48(D) + 1;
   [bb-slp-22.c:29:19] _6 = _5 * a0_44;
-  [bb-slp-22.c:29:14] [bb-slp-22.c:29:10] out[0] = _6;
   [bb-slp-22.c:30:24] _7 = y_49(D) + 1;
   [bb-slp-22.c:30:19] _8 = _7 * a1_45;
-  [bb-slp-22.c:30:14] [bb-slp-22.c:30:10] out[1] = _8;
   [bb-slp-22.c:31:19] _9 = _5 * a2_46;
-  [bb-slp-22.c:31:14] [bb-slp-22.c:31:10] out[2] = _9;
   [bb-slp-22.c:32:19] _10 = _7 * a3_47;
-  [bb-slp-22.c:32:14] [bb-slp-22.c:32:10] out[3] = _10;
+  _70 = {_9, _10};
+  [bb-slp-22.c:29:14] vect_cst__71 = _70;
+  _72 = {_6, _8};
+  [bb-slp-22.c:29:14] vect_cst__73 = _72;
+  [bb-slp-22.c:29:14] MEM <vector(2) unsigned int> [(unsigned int *)&out] = vect_cst__73;
+  [bb-slp-22.c:29:14] _75 = &[bb-slp-22.c:29:10] out[0] + 8;
+  [bb-slp-22.c:29:14] MEM <vector(2) unsigned int> [(unsigned int *)_75] = vect_cst__71;

It just vectorizes the stores.  So looks like a testsuite issue to me.  Either it needs to look for a different message in the dump file etc.
Comment 7 Richard Biener 2020-05-04 16:08:41 UTC
The SLP reorgs made making this testcase target agnostic really really hard.  It was supposed to test multiplication vectorization but now tests nothing.  I don't see how to reliably test this unless we scan for details in the vectorizer
dumps that are prone to change.  Currently we have for example
on x86_64:

/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: SLPing BB part
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: ------>vectorizing SLP node starting from: _5 = x_48(D) + 1;
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: vect_is_simple_use: operand x_48(D), type of def: external
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: vect_is_simple_use: operand 1, type of def: constant
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: transform binary/unary operation.
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: created new init_stmt: vect_cst__63 = _42;
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: created new init_stmt: vect_cst__12 = { 1, 1, 1, 1 };
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: add new stmt: vect__5.19_13 = vect_cst__63 + vect_cst__12;
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: ------>vectorizing SLP node starting from: _6 = _5 * a0_44;
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: vect_is_simple_use: operand x_48(D) + 1, type of def: internal
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: vect_is_simple_use: vectype vector(4) unsigned int
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: vect_is_simple_use: operand _1 + 23, type of def: external
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: transform binary/unary operation.
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: created new init_stmt: vect_cst__71 = _70;
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: add new stmt: vect__6.20_72 = vect__5.19_13 * vect_cst__71;
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: ------>vectorizing SLP node starting from: out[0] = _6;
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: vect_is_simple_use: operand _5 * a0_44, type of def: internal
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: vect_is_simple_use: vectype vector(4) unsigned int
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: vect_is_simple_use: operand _7 * a1_45, type of def: internal
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: vect_is_simple_use: operand _5 * a2_46, type of def: internal
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: vect_is_simple_use: operand _7 * a3_47, type of def: internal
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: transform store. ncopies = 1
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: create vector_type-pointer variable to type: vector(4) unsigned int  vectorizing a pointer ref: out[0]
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: created &out[0]
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: add new stmt: MEM <vector(4) unsigned int> [(unsigned int *)&out] = vect__6.20_72;
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: note: vectorizing stmts using SLP.
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/bb-slp-22.c:35:14: optimized: basic block part vectorized using 16 byte vectors

we could match 'vectorizing SLP node starting from: _6 = _5 * a0_44;

which means we are actually vectorizing a multiplication.  Like with
the following.  Rainer - can you test this?

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-22.c b/gcc/testsuite/gcc.dg/vect/bb-slp-22.c
index 6dc2375f5d1..f25a225666e 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-22.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-22.c
@@ -63,6 +63,6 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp2"  { target { ! {vect_int_mult } } } } } */
-/* { dg-final { scan-tree-dump-times "basic block vectorized" 2 "slp2"  { target vect_int_mult  } } } */
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 2 "slp2" } } */
+/* { dg-final { scan-tree-dump "vectorizing SLP node starting from: _\[0-9\]+ = _\[0-9\]+ \\\* a0" "slp2"  { target vect_int_mult } } } */
Comment 8 ro@CeBiTec.Uni-Bielefeld.DE 2020-05-05 13:09:44 UTC
> --- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
[...]
> which means we are actually vectorizing a multiplication.  Like with
> the following.  Rainer - can you test this?
[...]
Works for me: tested on sparc-sun-solaris2.11 (32 and 64-bit) and (to
verify that I didn't mess up the mangled patch) on i386-pc-solaris2.11.
Comment 9 CVS Commits 2020-05-05 13:41:00 UTC
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:3fbf43b9bc060e2904abe64e870868b9a4bfce13

commit r11-71-g3fbf43b9bc060e2904abe64e870868b9a4bfce13
Author: Richard Biener <rguenther@suse.de>
Date:   Tue May 5 15:38:24 2020 +0200

    testsuite/92177 - adjust expected patterns for gcc.dg/vect/bb-slp-22.c
    
    We now always vectorize two BBs, adjust the selector to also scan
    for integer multiplication vectorization explicitely.
    
    2020-05-05  Richard Biener  <rguenther@suse.de>
    
            PR testsuite/92177
            * gcc.dg/vect/bb-slp-22.c: Adjust.
Comment 10 Richard Biener 2020-05-05 13:41:25 UTC
Fixed on trunk (hopefully).
Comment 11 Jakub Jelinek 2020-05-07 11:56:25 UTC
GCC 10.1 has been released.
Comment 12 Richard Biener 2020-05-13 07:12:19 UTC
.
Comment 13 CVS Commits 2020-05-13 07:20:54 UTC
The releases/gcc-10 branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:ff9d4e09566e24d4dff10adeaef109823266c7bd

commit r10-8140-gff9d4e09566e24d4dff10adeaef109823266c7bd
Author: Richard Biener <rguenther@suse.de>
Date:   Tue May 5 15:38:24 2020 +0200

    testsuite/92177 - adjust expected patterns for gcc.dg/vect/bb-slp-22.c
    
    We now always vectorize two BBs, adjust the selector to also scan
    for integer multiplication vectorization explicitely.
    
    2020-05-05  Richard Biener  <rguenther@suse.de>
    
            PR testsuite/92177
            * gcc.dg/vect/bb-slp-22.c: Adjust.