See my question on StackOverflow: https://stackoverflow.com/questions/58026153/why-does-march-native-corrupt-my-program?noredirect=1#comment102457501_58026153
For reference:

#include <iostream>
#include <vector>
#include <cstddef>
#include <algorithm>

struct Model { int open, extend; };
struct Cell { int a, b; };

typedef std::vector<std::vector<Cell>> DPMatrix;

void print(const DPMatrix& matrix)
{
    for (std::size_t i = 0; i < matrix.size(); ++i) {
        for (std::size_t j = 0; j < matrix[i].size(); ++j) {
            std::cout << '{' << matrix[i][j].a << ' ' << matrix[i][j].b << "} ";
        }
        std::cout << std::endl;
    }
}

DPMatrix init_dp_matrix(const std::size_t num_cols, const std::size_t num_rows, const Model& model)
{
    DPMatrix result(num_cols, DPMatrix::value_type(num_rows, Cell()));
    const int inf = model.open * std::max(num_cols, num_rows);
    for (int i = 1; i < num_cols; ++i) {
        result[i][0].b = model.open + (i - 1) * model.extend;
    }
    for (int j = 1; j < num_rows; ++j) {
        result[0][j].a = model.open + (j - 1) * model.extend;
    }
    return result;
}

int main()
{
    const Model model = {-8, -1};
    const DPMatrix matrix = init_dp_matrix(10, 2, model);
    print(matrix);
}
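To make a wrong result detectable without reading the printed matrix by eye, the test case can be paired with a scalar self-check. The following is only a sketch: expected_cell and count_mismatches are hypothetical helper names that are not part of the original test case, and the unused inf variable is left out. The expected values come straight from the two loops in init_dp_matrix. A toolchain that mis-assembles the initialisation could in principle mis-assemble the check too, so a zero mismatch count is a hint rather than proof.

```cpp
#include <cstddef>
#include <vector>

struct Model { int open, extend; };
struct Cell { int a, b; };

typedef std::vector<std::vector<Cell>> DPMatrix;

// Same initialisation as in the test case, repeated so this snippet compiles on its own.
DPMatrix init_dp_matrix(const std::size_t num_cols, const std::size_t num_rows, const Model& model)
{
    DPMatrix result(num_cols, DPMatrix::value_type(num_rows, Cell()));
    for (std::size_t i = 1; i < num_cols; ++i) {
        result[i][0].b = model.open + (static_cast<int>(i) - 1) * model.extend;
    }
    for (std::size_t j = 1; j < num_rows; ++j) {
        result[0][j].a = model.open + (static_cast<int>(j) - 1) * model.extend;
    }
    return result;
}

// Hypothetical helper: the value cell (i, j) should hold after init_dp_matrix.
// Every cell starts as {0, 0}; only the first row and first column are written.
Cell expected_cell(const std::size_t i, const std::size_t j, const Model& model)
{
    Cell c = Cell();
    if (j == 0 && i >= 1) {
        c.b = model.open + (static_cast<int>(i) - 1) * model.extend;
    }
    if (i == 0 && j >= 1) {
        c.a = model.open + (static_cast<int>(j) - 1) * model.extend;
    }
    return c;
}

// Hypothetical helper: number of cells that differ from their expected value.
std::size_t count_mismatches(const DPMatrix& matrix, const Model& model)
{
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < matrix.size(); ++i) {
        for (std::size_t j = 0; j < matrix[i].size(); ++j) {
            const Cell e = expected_cell(i, j, model);
            if (matrix[i][j].a != e.a || matrix[i][j].b != e.b) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

Calling count_mismatches(init_dp_matrix(10, 2, model), model) from main and returning a non-zero exit status when the count is non-zero would let the same binary be compared across machines and toolchain versions.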
gcc-9-branch revision 275330 works fine for me, likewise the GCC 9.2 release. Note I have to use -mprefer-vector-width=512 to get AVX512 instructions used.

> g++-9 t.C -O3 -march=native -mprefer-vector-width=512 -fopt-info-vec
t.C:35:23: optimized: loop vectorized using 64 byte vectors
t.C:32:23: optimized: loop vectorized using 64 byte vectors
/usr/include/c++/9/bits/stl_algobase.h:759:13: optimized: loop vectorized using 64 byte vectors

vs.

> g++-9 t.C -O3 -march=native -fopt-info-vec
t.C:35:23: optimized: loop vectorized using 32 byte vectors
t.C:32:23: optimized: loop vectorized using 32 byte vectors
/usr/include/c++/9/bits/stl_algobase.h:759:13: optimized: loop vectorized using 32 byte vectors

so, not confirmed.
I tried replicating the issue on a CentOS machine (AWS EC2) with the exact same CPU and got the correct output. However, I just fired up a fresh AWS EC2 instance running Ubuntu and I can replicate the bug again. Is it possible that the bug is CPU *and* OS specific?
(In reply to Daniel Cooke from comment #3)
> I tried replicating the issue on a CentOS machine (AWS EC2) with the exact
> same CPU and got the correct output. However, I just fired up a fresh AWS
> EC2 instance running Ubuntu and I can replicate the bug again. Is it
> possible that the bug is CPU *and* OS specific?

You may run into

https://sourceware.org/bugzilla/show_bug.cgi?id=23465
(In reply to H.J. Lu from comment #4)
> (In reply to Daniel Cooke from comment #3)
> > I tried replicating the issue on a CentOS machine (AWS EC2) with the exact
> > same CPU and got the correct output. However, I just fired up a fresh AWS
> > EC2 instance running Ubuntu and I can replicate the bug again. Is it
> > possible that the bug is CPU *and* OS specific?
>
> You may run into
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=23465

It does seem like this could be the issue. I upgraded my binutils to 2.32 (from 2.30) and the problem disappears. I'll do some more testing and report back.
Can confirm I'm no longer having any issues after upgrading binutils to 2.32.
Not a gcc bug.