Bug 80815 - wrong code because of broken runtime alias check in vectorizer
Summary: wrong code because of broken runtime alias check in vectorizer
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 8.0
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2017-05-18 13:51 UTC by bin cheng
Modified: 2017-08-07 13:47 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
sparc-sun-solaris2.12 pr80815-3.c.156t.vect (4.01 KB, text/plain)
2017-06-02 11:58 UTC, Rainer Orth
Details

Note You need to log in before you can comment on or make changes to this bug.
Description bin cheng 2017-05-18 13:51:40 UTC
Hi,
I was suspecting that vect_prune_runtime_alias_check_list is broken, now I can create a test case for it:
#include <stdio.h>

int arr[2048];

int
foo (int *a, int *b)
{
  int i;
  int *a1 = a;
  int *a0 = a1 - 512;
  for (i = 0; i < 500; i++)
    {
      *b = *a0 + *a1;
      b++;
      a0--;
      a1--;
    }
  return 0;
}

int main (void)
{
  int *a = &arr[1027];
  int *b = &arr[1024];

  int i;
  for (i = 0; i < 2048; i++)
    arr[i] = i;

  foo (a, b);

  for (i = 0; i < 2048; i++)
    fprintf (stdout, "%d: %d\n", i, arr[i]);

  return 0;
}

Compiled on x86_64 with below two commands:

Binary 1)
$ ../gcc -O2 -fno-inline ../x.c -o x.exe
$ ./x.exe >/tmp/x

Binary 2)
../gcc -Ofast -fno-inline ../x.c -o y.exe
$ ./y.exe >/tmp/y

We have:
$ diff /tmp/x /tmp/y
1027,1028c1027,1028
< 1026: 2053
< 1027: 2054
---
> 1026: 1538
> 1027: 1536

I think the root cause is in below code:

	  /* Generally the new segment length is the maximum of the
	     left segment size and the right segment size plus the distance.
	     ???  We can also build tree MAX_EXPR here but it's not clear this
	     is profitable.  */
	  else if (tree_fits_uhwi_p (dr_a1->seg_len)
		   && tree_fits_uhwi_p (dr_a2->seg_len))
	    {
	      unsigned HOST_WIDE_INT seg_len_a1 = tree_to_uhwi (dr_a1->seg_len);
	      unsigned HOST_WIDE_INT seg_len_a2 = tree_to_uhwi (dr_a2->seg_len);
	      dr_a1->seg_len = size_int (MAX (seg_len_a1, diff + seg_len_a2));
	      do_remove = true;
	    }

Here dr_a1 is *a0, dr_a2 is *a1 and dr_b1/dr_b2 is *b, and their ranges is like

-------------------------------------------------------------->
    a0_en      a0        a1_end    b   a1       b_end

It should merge range_a0/range_a1 and result in range of a1, but because:
1) seg_len_a1 wraps to a large unsigned integer;
2) seg_len_a2 wraps to a large unsigned integer;
3) diff + seg_len_a2 wraps again to a small unsigned integer;
The program falsely merges ranges to range_a0.  As a result, loop is vectorized because range_a0 and range_b are considered not alias to each other.  This is wrong because *a1 and *b overlaps at the first several iterations and have circle in dependence.
Comment 1 bin cheng 2017-05-19 09:53:17 UTC
GCC uses vect_factor as minimal segment length for dr_b when merging alias pairs, I think it could be relaxed to vect_factor * abs (DR_STEP (dr_b)).  Below test shows this change can merge additional alias checks:

#include <stdio.h>

int arr[2048];

int
foo (int *a, int *b, int len)
{
  int i;
  int *a1 = a;
  int *a0 = a1 - 4;
  for (i = 0; i < len; i++)
    {
      *b = *a0 + *a1;
      b--;
      a0++;
      a1++;
    }
  return 0;
}

int main (void)
{
  int *a = &arr[1027];
  int *b = &arr[1024];

  int i;
  for (i = 0; i < 2048; i++)
    arr[i] = i;

  foo (a, b, 500);

  for (i = 0; i < 2048; i++)
    fprintf (stdout, "%d: %d\n", i, arr[i]);

  return 0;
}

$ ./gcc -Ofast -fno-inline ./z.c -o z.exe -fdump-tree-vect-details

Of course, alias check should fail and vectorized loop in foo should never be executed.
Thanks,
Comment 2 rguenther@suse.de 2017-05-19 10:17:33 UTC
On Fri, 19 May 2017, amker at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80815
> 
> --- Comment #1 from amker at gcc dot gnu.org ---
> GCC uses vect_factor as minimal segment length for dr_b when merging alias
> pairs, I think it could be relaxed to vect_factor * abs (DR_STEP (dr_b)). 
> Below test shows this change can merge additional alias checks:

But it would make cases aliasing that are not (that's always a trade off
between more precise result and less runtime checking).
Comment 3 bin cheng 2017-05-19 10:27:14 UTC
(In reply to rguenther@suse.de from comment #2)
> On Fri, 19 May 2017, amker at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80815
> > 
> > --- Comment #1 from amker at gcc dot gnu.org ---
> > GCC uses vect_factor as minimal segment length for dr_b when merging alias
> > pairs, I think it could be relaxed to vect_factor * abs (DR_STEP (dr_b)). 
> > Below test shows this change can merge additional alias checks:
> 
> But it would make cases aliasing that are not (that's always a trade off
> between more precise result and less runtime checking).

Hmm, it only does compilation time (and still conservative) check if segment dr_b resides in the gap between dr_a1 and dr_a2.  the overall effect would be equal to checking <dr_a1, dr_b>, <gap_segment, dr_b> and <dr_a2, dr_b>?
Comment 4 bin cheng 2017-05-19 14:38:02 UTC
Also the three cases:
	  /* If the left segment does not extend beyond the start of the
	     right segment the new segment length is that of the right
	     plus the segment distance.  */
	  if (tree_fits_uhwi_p (dr_a1->seg_len)
	      && compare_tree_int (dr_a1->seg_len, diff) <= 0)
	    {
	      dr_a1->seg_len = size_binop (PLUS_EXPR, dr_a2->seg_len,
					   size_int (diff));
	      do_remove = true;
	    }
	  /* Generally the new segment length is the maximum of the
	     left segment size and the right segment size plus the distance.
	     ???  We can also build tree MAX_EXPR here but it's not clear this
	     is profitable.  */
	  else if (tree_fits_uhwi_p (dr_a1->seg_len)
		   && tree_fits_uhwi_p (dr_a2->seg_len))
	    {
	      unsigned HOST_WIDE_INT seg_len_a1 = tree_to_uhwi (dr_a1->seg_len);
	      unsigned HOST_WIDE_INT seg_len_a2 = tree_to_uhwi (dr_a2->seg_len);
	      dr_a1->seg_len = size_int (MAX (seg_len_a1, diff + seg_len_a2));
	      do_remove = true;
	    }
	  /* Now we check if the following condition is satisfied:

	     DIFF - SEGMENT_LENGTH_A < SEGMENT_LENGTH_B

	     where DIFF = DR_A2_INIT - DR_A1_INIT.  However,
	     SEGMENT_LENGTH_A or SEGMENT_LENGTH_B may not be constant so we
	     have to make a best estimation.  We can get the minimum value
	     of SEGMENT_LENGTH_B as a constant, represented by MIN_SEG_LEN_B,
	     then either of the following two conditions can guarantee the
	     one above:

	     1: DIFF <= MIN_SEG_LEN_B
	     2: DIFF - SEGMENT_LENGTH_A < MIN_SEG_LEN_B  */
	  else
	    {
	      unsigned HOST_WIDE_INT min_seg_len_b
		= (tree_fits_uhwi_p (dr_b1->seg_len)
		   ? tree_to_uhwi (dr_b1->seg_len)
		   : factor);

	      if (diff <= min_seg_len_b
		  || (tree_fits_uhwi_p (dr_a1->seg_len)
		      && diff - tree_to_uhwi (dr_a1->seg_len) < min_seg_len_b))
		{
		  dr_a1->seg_len = size_binop (PLUS_EXPR,
					       dr_a2->seg_len, size_int (diff));
		  do_remove = true;
		}
	    }

Can be merged together.  the merged dr_a1->seg_len is dr_a2->seg_len + diff, with an exception checking MAX (seg_len_a1, diff + seg_len_a2).

Of course, negative step still needs to be handled.
Comment 5 bin cheng 2017-05-26 14:18:58 UTC
Author: amker
Date: Fri May 26 14:18:26 2017
New Revision: 248512

URL: https://gcc.gnu.org/viewcvs?rev=248512&root=gcc&view=rev
Log:
	PR tree-optimization/80815
	* tree-data-ref.c (prune_runtime_alias_test_list): Simplify condition
	for merging runtime alias checks.  Handle negative DR_STEPs.
	gcc/testsuite
	* gcc.dg/vect/pr80815-1.c: New test.
	* gcc.dg/vect/pr80815-2.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/vect/pr80815-1.c
    trunk/gcc/testsuite/gcc.dg/vect/pr80815-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-data-ref.c
Comment 6 Rainer Orth 2017-06-02 11:58:41 UTC
Created attachment 41456 [details]
sparc-sun-solaris2.12 pr80815-3.c.156t.vect
Comment 7 bin cheng 2017-06-02 12:41:19 UTC
(In reply to Rainer Orth from comment #6)
> Created attachment 41456 [details]
> sparc-sun-solaris2.12 pr80815-3.c.156t.vect

Thanks for reporting, I will investigate it and disable on targets if it's not a general case.

Thanks.
Comment 8 bin cheng 2017-08-02 16:51:23 UTC
vect_perm added for the the test case.  It should be bypassed now on sparc-sun-solaris2.12?
Comment 9 bin cheng 2017-08-07 13:47:50 UTC
Fixed.