Bug 93734

Summary: [9 Regression] Invalid code generated with -O2 -march=haswell -ftree-vectorize
Product: gcc
Reporter: bartoldeman
Component: tree-optimization
Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED
Severity: normal
CC: amonakov, jakub, rsandifo
Priority: P2
Keywords: wrong-code
Version: 8.3.0
Target Milestone: 10.0
Host:
Target: x86_64-*-*
Build:
Known to work: 10.0, 7.5.0
Known to fail: 8.1.0, 8.3.1, 9.1.0, 9.2.1
Last reconfirmed: 2020-02-13 00:00:00
Bug Depends on: 92420    
Bug Blocks: 53947    
Attachments: Fortran code that prints 0 if correct, and -9 if miscompiled

Description bartoldeman 2020-02-13 13:46:48 UTC
Created attachment 47837
Fortran code that prints 0 if correct, and -9 if miscompiled

The attached code prints -9.0000000000000000 when compiled and run as follows:

gfortran -O2 -march=haswell -ftree-vectorize bug.f90 -o bug
./bug
  -9.0000000000000000

The compiler is:
GNU Fortran (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

This is also reproducible with GCC 9.2.0, but not with GCC 7.3.0 and earlier.

The correct answer is 1-1=0.

(I first found this issue when compiling the reference BLAS with those options and running the "zblat2" tests. The attached test is a much reduced version of ztrsv; see http://www.netlib.org/lapack/explore-html/dc/dc1/group__complex16__blas__level2_ga99cc66f0833474d6607e6ea7dbe2f9bd.html#ga99cc66f0833474d6607e6ea7dbe2f9bd)
Comment 1 Richard Biener 2020-02-13 14:02:06 UTC
It works fine with GCC 7.5.
Comment 2 Richard Biener 2020-02-13 14:06:47 UTC
Hmm, it seems to be fixed on trunk.

Here is a testcase that aborts on failure; we probably have a duplicate.

subroutine test(incx)
  implicit none
  integer i,incx,jx
  double complex a(5),x(9),temp

  a(1:4)=1
  a(5)=10
  x(1:9)=1

  jx = 9
  temp = x(9)
  do i = 4,1,-1
     jx = jx - incx
     x(jx) = x(jx) - temp*a(i)
  enddo

  if (x(5).ne.0) call abort
end subroutine test

program bug
  call test(2)
end program bug
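
For reference, the loop above can be traced by hand with a plain scalar version. The following is a minimal sketch in C, not part of the original report: `reference_x5` is a hypothetical helper name, and the arrays are padded by one element so the Fortran 1-based indices carry over unchanged. With incx = 2 the loop visits jx = 7, 5, 3, 1 and each visited element becomes 1 - 1*1 = 0, so x(5) must be 0. The reported -9 equals 1 - 10, which would be consistent with the a(5) = 10 sentinel being used in the update, though nothing in this thread confirms that.

```c
#include <assert.h>
#include <complex.h>

/* Scalar reference for the Fortran testcase above (hypothetical helper).
   a[0] and x[0] are padding so that indices match the Fortran source. */
double complex reference_x5(int incx)
{
    double complex a[6], x[10], temp;
    int i, jx;

    /* With four iterations starting from jx = 9, only incx in 0..2
       keeps every index inside 1..9. */
    assert(incx >= 0 && incx <= 2);

    for (i = 1; i <= 4; i++)
        a[i] = 1.0;
    a[5] = 10.0;              /* never read by a correct loop */
    for (i = 1; i <= 9; i++)
        x[i] = 1.0;

    jx = 9;
    temp = x[9];
    for (i = 4; i >= 1; i--) {
        jx -= incx;           /* incx = 2: jx = 7, 5, 3, 1 */
        x[jx] = x[jx] - temp * a[i];
    }
    return x[5];
}
```

Calling `reference_x5(2)` yields a value whose real and imaginary parts are both zero, which is the condition the Fortran testcase checks before calling abort.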
Comment 3 Alexander Monakov 2020-02-13 14:26:28 UTC
I tried to make an equivalent C testcase, but complex ops don't map 1:1 from Fortran, so it's a bit difficult. Nevertheless, here's a somewhat similar testcase that aborts on GCC 8/9 and works on trunk, although the IR and resulting assembly look quite different:

(needs -O2 -ftree-vectorize -mfma -fcx-limited-range)

__attribute__((noipa))
static
_Complex double
test(_Complex double * __restrict a,
     _Complex double * __restrict x,
     _Complex double t, long jx)
{
    long i, j;

    for (j = 6, i = 3; i>=0; i--, j-=jx)
        x[j] -= t*a[i];

    return x[4];
}

int main()
{
    _Complex double a[5] = {1, 1, 1, 1, 10};
    _Complex double x[9] = {1,1,1,1,1,1,1,1,1};
    if (test(a, x, 1, 2))
        __builtin_abort();
}
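
As a rough illustration of what the kernel above computes per element, here is a sketch that spells out x - t*a on real and imaginary parts with the textbook product formula. This is my reading of the -fcx-limited-range requirement (GCC documents that the flag drops the NaN + I*NaN check after complex multiplication), not something stated in the thread. `mul_sub_limited` and `reference_x4` are hypothetical names, and the data and indices are copied from the testcase.

```c
#include <assert.h>
#include <complex.h>

/* x - t*a using (tr*ar - ti*ai) + (tr*ai + ti*ar)i, with no NaN/Inf
   recovery step. Hypothetical helper for illustration only. */
double complex mul_sub_limited(double complex x, double complex t,
                               double complex a)
{
    double re = creal(x) - (creal(t) * creal(a) - cimag(t) * cimag(a));
    double im = cimag(x) - (creal(t) * cimag(a) + cimag(t) * creal(a));
    return re + im * I;
}

/* Scalar reference for the C testcase: same data, same index walk. */
double complex reference_x4(long jx)
{
    double complex a[5] = {1, 1, 1, 1, 10};
    double complex x[9] = {1, 1, 1, 1, 1, 1, 1, 1, 1};
    long i, j;

    /* Starting at j = 6 with four iterations, only jx in 0..2 stays
       inside the array. */
    assert(jx >= 0 && jx <= 2);

    for (j = 6, i = 3; i >= 0; i--, j -= jx)  /* jx = 2: j = 6, 4, 2, 0 */
        x[j] = mul_sub_limited(x[j], 1.0, a[i]);

    return x[4];
}
```

With jx = 2, x[4] is updated once with a[2] = 1, so `reference_x4(2)` is zero and a correctly compiled `test` does not reach `__builtin_abort`.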
Comment 4 Jakub Jelinek 2020-02-13 15:50:52 UTC
#c2 isn't miscompiled since r10-4543-g599bd99078439b9f11cb271aa919844318381ec5
and the miscompilation started with r8-6708-g85c5e2f576fd41e1ab5620cde3c63b3ca6673bea
Comment 5 Jakub Jelinek 2020-03-04 09:43:17 UTC
GCC 8.4.0 has been released, adjusting target milestone.
Comment 6 Jakub Jelinek 2021-05-14 09:52:53 UTC
GCC 8 branch is being closed.
Comment 7 Richard Biener 2021-06-01 08:16:26 UTC
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
Comment 8 Mikael Pettersson 2021-07-19 15:04:12 UTC
I can't reproduce the wrong code using either the fortran test case in #c2 or the C one in #c3 with gcc-9.4.0 on Kaby Lake R. If I revert the PR92420 fix both test cases do reproduce the wrong code. Thus I think this was fixed by PR92420 in gcc-8.4.0 and gcc-9.3.0.
Comment 9 Richard Biener 2022-05-27 08:44:05 UTC
Fixed for GCC 10.