Bug 88315 - SAD and DOT_PROD SLP reductions with initial value != 0 create wrong code
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 9.0
Importance: P3 normal
Target Milestone: ---
Assignee: Richard Biener
URL:
Keywords: wrong-code
Depends on: 88567
Blocks:
Reported: 2018-12-03 15:02 UTC by Richard Biener
Modified: 2019-09-04 08:32 UTC (History)
0 users

See Also:
Host:
Target:
Build:
Known to work: 8.3.1, 9.0
Known to fail: 7.4.1, 8.2.0, 8.3.0
Last reconfirmed: 2018-12-03 00:00:00


Description Richard Biener 2018-12-03 15:02:51 UTC
Index: gcc/testsuite/gcc.dg/vect/slp-reduc-sad.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-reduc-sad.c   (revision 266739)
+++ gcc/testsuite/gcc.dg/vect/slp-reduc-sad.c   (working copy)
@@ -12,7 +12,7 @@ extern void abort (void);
 int __attribute__((noinline,noclone))
 foo (uint8_t *pix1, uint8_t *pix2, int i_stride_pix2)
 {
-  int i_sum = 0;
+  int i_sum = 5;
   for( int y = 0; y < 16; y++ )
     {
       i_sum += abs ( pix1[0] - pix2[0] );
@@ -52,7 +52,7 @@ main ()
       __asm__ volatile ("");
     }
 
-  if (foo (X, Y, 16) != 32512)
+  if (foo (X, Y, 16) != 32512 + 5)
     abort ();
 
   return 0;


FAILs at runtime.  This is because

  number_of_copies = nunits * number_of_vectors / group_size;

is zero: both SAD and DOT_PROD reduce to half the number of lanes, so for
example nunits == 4 and number_of_vectors == 1 but group_size == 8, and the
integer division 4 * 1 / 8 truncates to 0.
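
The truncation can be reproduced in isolation. This is only a sketch of the quoted expression: the helper name and the unsigned types are assumptions for illustration, not the code in tree-vect-loop.c.

```c
/* Hypothetical stand-alone helper mirroring the quoted expression.
   The name and the unsigned types are assumptions, not GCC source.  */
static unsigned
number_of_copies (unsigned nunits, unsigned number_of_vectors,
                  unsigned group_size)
{
  /* Integer division truncates toward zero, so any product smaller
     than group_size yields 0.  */
  return nunits * number_of_vectors / group_size;
}
```

With the values from the description, number_of_copies (4, 1, 8) evaluates to 0, whereas a case where the vector has as many lanes as the group, such as number_of_copies (8, 1, 8), gives 1.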

Looks like GCC 7, 8 and trunk are affected.
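
For reference, the property the adjusted test relies on is that a sum-of-absolute-differences reduction started at a nonzero value must return that value plus the plain sum. A scalar sketch of this follows; the function name, its parameters and the single shared stride are assumptions for illustration, not the testsuite source.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference, not gcc.dg/vect/slp-reduc-sad.c itself:
   sum of absolute differences over `rows` rows of `width` pixels,
   accumulated on top of the initial value `init`.  */
static int
sad_ref (const uint8_t *a, const uint8_t *b, int width, int rows,
         int stride, int init)
{
  int sum = init;
  for (int y = 0; y < rows; y++)
    {
      for (int x = 0; x < width; x++)
        sum += abs (a[x] - b[x]);
      /* Advance both inputs to the next row.  */
      a += stride;
      b += stride;
    }
  return sum;
}
```

Any vectorized version has to satisfy sad_ref (a, b, w, r, s, init) == sad_ref (a, b, w, r, s, 0) + init, which is what the change from 32512 to 32512 + 5 in the test checks.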
Comment 1 Richard Biener 2018-12-03 15:45:44 UTC
I have a patch.
Comment 2 Richard Biener 2018-12-04 08:24:12 UTC
Author: rguenth
Date: Tue Dec  4 08:23:40 2018
New Revision: 266771

URL: https://gcc.gnu.org/viewcvs?rev=266771&root=gcc&view=rev
Log:
2018-12-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/88315
	* tree-vect-loop.c (get_initial_defs_for_reduction): Simplify
	and fix initialization vector for SAD and DOT_PROD SLP reductions.

	* gcc.dg/vect/slp-reduc-sad.c: Adjust to provide non-trivial
	initial value.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/slp-reduc-sad.c
    trunk/gcc/tree-vect-loop.c
Comment 3 Richard Biener 2019-01-15 08:00:40 UTC
Backports should check PR88567 for adjustments.
Comment 4 Richard Biener 2019-08-30 13:19:54 UTC
Author: rguenth
Date: Fri Aug 30 13:19:23 2019
New Revision: 275168

URL: https://gcc.gnu.org/viewcvs?rev=275168&root=gcc&view=rev
Log:
2019-08-30  Richard Biener  <rguenther@suse.de>

	Backport from mainline
	2019-01-07  Richard Sandiford  <richard.sandiford@arm.com>

	PR middle-end/88567
	* tree-vect-loop.c (get_initial_defs_for_reduction): Pass the
	output vector directly to duplicate_and_interleave instead of
	going through a temporary.  Postpone insertion of ctor_seq to
	the end of the loop.

	2018-12-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/88315
	* tree-vect-loop.c (get_initial_defs_for_reduction): Simplify
	and fix initialization vector for SAD and DOT_PROD SLP reductions.

	* gcc.dg/vect/slp-reduc-sad.c: Adjust to provide non-trivial
	initial value.

Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/testsuite/ChangeLog
    branches/gcc-8-branch/gcc/testsuite/gcc.dg/vect/slp-reduc-sad.c
    branches/gcc-8-branch/gcc/tree-vect-loop.c
Comment 5 Richard Biener 2019-09-04 08:32:11 UTC
Too much refactoring makes backporting this to GCC 7 too risky.  Should anyone
run into this there, the function to patch is vect_get_constant_vectors.

So - fixed for GCC 8+.