Bug 70542 - [6 Regression] Wrong code with -O3 -mavx2.
Summary: [6 Regression] Wrong code with -O3 -mavx2.
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 6.0
: P1 normal
Target Milestone: 6.0
Assignee: Jakub Jelinek
URL:
Keywords: wrong-code
Depends on:
Blocks: yarpgen
  Show dependency treegraph
 
Reported: 2016-04-05 07:35 UTC by Vsevolod Livinskii
Modified: 2021-11-01 23:07 UTC (History)
1 user (show)

See Also:
Host:
Target: x86_64-*-*
Build:
Known to work: 4.9.4, 5.3.1
Known to fail:
Last reconfirmed: 2016-04-05 00:00:00


Attachments
Reproducer. (352 bytes, text/x-csrc)
2016-04-05 07:35 UTC, Vsevolod Livinskii
Details
Correct reproducer. (279 bytes, text/x-csrc)
2016-04-05 07:38 UTC, Vsevolod Livinskii
Details
gcc6-pr70542.patch (1.04 KB, patch)
2016-04-05 12:34 UTC, Jakub Jelinek
Details | Diff
gcc6-pr70542.patch (1.29 KB, patch)
2016-04-05 12:55 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Vsevolod Livinskii 2016-04-05 07:35:15 UTC
Created attachment 38187 [details]
Reproducer.

Test case produces incorrect result with -O3 -mavx2 (and -march=knl and -march=skylake-avx512. Everything works fine with gcc version 4.9.4 20160401 (prerelease) (Revision=234686) and gcc version 5.3.1 20160401 (Revision=234686).

Output:
g++ -O2 -mavx2 repr.cpp; ./a.out
1540
> g++ -O3 -mavx2 repr.cpp; ./a.out
1204

GCC version:
> g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/export/users/vlivinsk/gcc-trunk/bin/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /export/users/vlivinsk/gcc-trunk/gcc/configure --with-arch=corei7 --with-cpu=corei7 --enable-clocale=gnu --with-system-zlib --enable-shared --with-demangler-in-ld --enable-cloog-backend=isl --with-fpmath=sse --enable-checking=release --enable-languages=c,c++,lto --with-gmp=/export/users/vlivinsk/gcc-trunk/gmp-6.1.0/bin --with-mpfr=/export/users/vlivinsk/gcc-trunk/mpfr-3.1.3/bin --with-mpc=/export/users/vlivinsk/gcc-trunk/mpc-1.0.3/bin --prefix=/export/users/vlivinsk/gcc-trunk/bin
Thread model: posix
gcc version 6.0.0 20160404 (experimental) (Revision=234705)

Test:
#include <cstdlib>

int a[520];
short b[482];
short c[480];
int d[963];
short e[320];

int main() {
    for (int i = 0; i < 520; ++i)
        a[i] = -636544305;

    for (int i = 0; i < 386; ++i)
        b[i] = -31804;

    for (long i = 1; i <= 112; ++i) {
        c[i] = b[i] >> ((a[i] & 1587842570) - 1510214139);
        if (a[i])
            d[i] = i;
        e[i] = 7 << ((2312631697 - b[i]) - 2312663500);
    }

    unsigned long long g = 0;
    for (int i = 0; i < 111; ++i) {
        g = e[i] + g;
    }

    if (g != 1540)
        abort();
}
Comment 1 Vsevolod Livinskii 2016-04-05 07:38:47 UTC
Created attachment 38188 [details]
Correct reproducer.

I've mixed up reproducers.
Comment 2 Jakub Jelinek 2016-04-05 08:30:34 UTC
Looking into this.
Comment 3 Jakub Jelinek 2016-04-05 08:59:40 UTC
Started with r233068.
Slightly cleaned up testcase:
int a[113], d[113];
short b[113], c[113], e[113];

int
main ()
{
  int i;
  long j;
  unsigned long long g = 0;
  for (i = 0; i < 113; ++i)
    {
      a[i] = -636544305;
      b[i] = -31804;
    }
  for (j = 1; j <= 112; ++j)
    {
      c[j] = b[j] >> ((a[j] & 1587842570) - 1510214139);
      if (a[j])
	d[j] = j;
      e[j] = 7 << ((2312631697 - b[j]) - 2312663500);
    }
  for (i = 0; i < 111; ++i)
    g = e[i] + g;
  if (g != 1540)
    __builtin_abort ();
  return 0;
}
Comment 4 Jakub Jelinek 2016-04-05 10:08:09 UTC
This goes wrong during the REE pass.
Comment 5 Jakub Jelinek 2016-04-05 12:34:57 UTC
Created attachment 38190 [details]
gcc6-pr70542.patch

Untested fix.
Comment 6 Jakub Jelinek 2016-04-05 12:55:20 UTC
Created attachment 38191 [details]
gcc6-pr70542.patch

Perhaps better fix.  Looking at PR64286, the comment is right, if we change somehow the definition for VECTOR_MODE_P, we need to make sure all uses are modified, or none.  But there is tons of reasons why some of them could fail, e.g. for the copy_needed case there is lots of tests that can fail, but even for the !copy_needed case it could fail to get recognized etc., and we are unable to handle all the uses as a single transaction.
Comment 7 Jakub Jelinek 2016-04-05 17:05:55 UTC
Author: jakub
Date: Tue Apr  5 17:05:23 2016
New Revision: 234756

URL: https://gcc.gnu.org/viewcvs?rev=234756&root=gcc&view=rev
Log:
	PR rtl-optimization/70542
	* ree.c (add_removable_extension): For VECTOR_MODE_P punt
	if there are any uses other than insn or debug insns.

	* gcc.dg/torture/pr70542.c: New test.
	* gcc.target/i386/avx2-pr70542.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr70542.c
    trunk/gcc/testsuite/gcc.target/i386/avx2-pr70542.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ree.c
    trunk/gcc/testsuite/ChangeLog
Comment 8 Jakub Jelinek 2016-04-05 17:09:16 UTC
Fixed.