Bug 91837 - Wrong code with -ftree-loop-vectorize and -march=skylake-avx512 on some Intel machines
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 9.2.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2019-09-20 12:19 UTC by Daniel Cooke
Modified: 2019-09-23 10:51 UTC
CC List: 1 user

See Also:
Host:
Target: x86_64-*-*, i?86-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed:


Comment 1 Richard Biener 2019-09-20 12:30:27 UTC
For reference:

#include <iostream>
#include <vector>
#include <cstddef>
#include <algorithm>

struct Model
{
    int open, extend;
};

struct Cell
{
    int a, b;
};

typedef std::vector<std::vector<Cell>> DPMatrix;

void print(const DPMatrix& matrix)
{
    for (std::size_t i = 0; i < matrix.size(); ++i) {
        for (std::size_t j = 0; j < matrix[i].size(); ++j) {
            std::cout << '{' << matrix[i][j].a << ' ' << matrix[i][j].b << "} ";
        }
        std::cout << std::endl;
    }
}

DPMatrix init_dp_matrix(const std::size_t num_cols, const std::size_t num_rows, const Model& model)
{
    DPMatrix result(num_cols, DPMatrix::value_type(num_rows, Cell()));
    const int inf = model.open * std::max(num_cols, num_rows);
    for (int i = 1; i < num_cols; ++i) {
        result[i][0].b = model.open + (i - 1) * model.extend;
    }
    for (int j = 1; j < num_rows; ++j) {
        result[0][j].a = model.open + (j - 1) * model.extend;
    }
    return result;
}

int main()
{
    const Model model = {-8, -1};
    const DPMatrix matrix = init_dp_matrix(10, 2, model);
    print(matrix);
}
Comment 2 Richard Biener 2019-09-20 12:40:56 UTC
gcc-9-branch revision 275330 works fine for me, likewise the GCC 9.2 release.
Note that I have to pass -mprefer-vector-width=512 for AVX512 instructions to be used.

> g++-9 t.C -O3 -march=native -mprefer-vector-width=512 -fopt-info-vec
t.C:35:23: optimized: loop vectorized using 64 byte vectors
t.C:32:23: optimized: loop vectorized using 64 byte vectors
/usr/include/c++/9/bits/stl_algobase.h:759:13: optimized: loop vectorized using 64 byte vectors

vs.

> g++-9 t.C -O3 -march=native -fopt-info-vec
t.C:35:23: optimized: loop vectorized using 32 byte vectors
t.C:32:23: optimized: loop vectorized using 32 byte vectors
/usr/include/c++/9/bits/stl_algobase.h:759:13: optimized: loop vectorized using 32 byte vectors

so, not confirmed.
Comment 3 Daniel Cooke 2019-09-20 20:14:53 UTC
I tried replicating the issue on a CentOS machine (AWS EC2) with the exact same CPU and got the correct output. However, I just fired up a fresh AWS EC2 instance running Ubuntu and I can replicate the bug again. Is it possible that the bug is CPU *and* OS specific?
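
When comparing machines like this, a self-checking variant of the testcase can be easier to work with than comparing the printed matrices by eye. The following is only a sketch: init_dp_matrix is copied unchanged from the testcase in comment 1 (including the unused inf and the int loop counters, so the same loops are offered to the vectorizer), while count_mismatches is a hypothetical helper that is not part of the original report. It recomputes each expected value cell by cell and counts the cells that differ. It assumes the checking loop itself is not affected by whatever corrupts the initialisation, which is not guaranteed.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Model
{
    int open, extend;
};

struct Cell
{
    int a, b;
};

typedef std::vector<std::vector<Cell>> DPMatrix;

// Copied unchanged from the testcase in comment 1.
DPMatrix init_dp_matrix(const std::size_t num_cols, const std::size_t num_rows, const Model& model)
{
    DPMatrix result(num_cols, DPMatrix::value_type(num_rows, Cell()));
    const int inf = model.open * std::max(num_cols, num_rows);
    for (int i = 1; i < num_cols; ++i) {
        result[i][0].b = model.open + (i - 1) * model.extend;
    }
    for (int j = 1; j < num_rows; ++j) {
        result[0][j].a = model.open + (j - 1) * model.extend;
    }
    return result;
}

// Hypothetical helper (not in the original report): counts the cells whose
// contents differ from what the two loops above are supposed to write.
std::size_t count_mismatches(const DPMatrix& matrix, const Model& model)
{
    std::size_t bad = 0;
    for (std::size_t i = 0; i < matrix.size(); ++i) {
        for (std::size_t j = 0; j < matrix[i].size(); ++j) {
            // Only row 0 (j > 0) gets a non-zero 'a'; only column 0 (i > 0) gets a non-zero 'b'.
            const int want_a = (i == 0 && j > 0) ? model.open + (static_cast<int>(j) - 1) * model.extend : 0;
            const int want_b = (j == 0 && i > 0) ? model.open + (static_cast<int>(i) - 1) * model.extend : 0;
            if (matrix[i][j].a != want_a || matrix[i][j].b != want_b) {
                ++bad;
            }
        }
    }
    return bad;
}
```

A main that returns count_mismatches(init_dp_matrix(10, 2, model), model) != 0 for model = {-8, -1} would then give an exit status that can be compared directly between the CentOS and Ubuntu instances.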
Comment 4 H.J. Lu 2019-09-20 21:41:10 UTC
(In reply to Daniel Cooke from comment #3)
> I tried replicating the issue on a CentOS machine (AWS EC2) with the exact
> same CPU and got the correct output. However, I just fired up a fresh AWS
> EC2 instance running Ubuntu and I can replicate the bug again. Is it
> possible that the bug is CPU *and* OS specific?

You may run into

https://sourceware.org/bugzilla/show_bug.cgi?id=23465
Comment 5 Daniel Cooke 2019-09-21 11:28:30 UTC
(In reply to H.J. Lu from comment #4)
> You may run into
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=23465

It does seem like this could be the issue. I upgraded my binutils from 2.30 to 2.32 and the problem disappeared. I'll do some more testing and report back.
Comment 6 Daniel Cooke 2019-09-23 10:49:39 UTC
Can confirm I'm no longer having any issues after upgrading binutils to 2.32.
Comment 7 Richard Biener 2019-09-23 10:51:46 UTC
Not a GCC bug; per comments 5 and 6 the problem goes away with binutils 2.32.