In the example below, the loop in the first method does not vectorize; the loop in the second does.

struct Foo {
  float a;
  float b;
  void compute1(float * __restrict__ x, float const * __restrict__ y, int N) const;
  void compute2(float * __restrict__ x, float const * __restrict__ y, int N) const;
};

void Foo::compute1(float * __restrict__ x, float const * __restrict__ y, int N) const {
  for (int i=0; i!=N; ++i)
    x[i] = a + b*y[i];
}

void Foo::compute2(float * __restrict__ x, float const * __restrict__ y, int N) const {
  float la=a, lb=b;
  for (int i=0; i!=N; ++i)
    x[i] = la + lb*y[i];
}

test/vectClass.cpp:11: note: not vectorized: loop contains function calls or data references that cannot be analyzed
test/vectClass.cpp:10: note: vectorized 0 loops in function.

vs

test/vectClass.cpp:17: note: Profitability threshold is 5 loop iterations.
test/vectClass.cpp:17: note: LOOP VECTORIZED.
test/vectClass.cpp:15: note: vectorized 1 loops in function.
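For anyone who wants to reproduce this, here is a minimal self-contained sketch of the test case. The helper `sameResults` and the input values are hypothetical additions for illustration and are not part of the original test/vectClass.cpp. It only checks that the two methods compute the same values; whether each loop is vectorized still has to be read from the compiler's vectorizer report (for GCC of that era something like `-O2 -ftree-vectorize -ftree-vectorizer-verbose=2`, in newer releases `-fopt-info-vec`).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Foo {
  float a;
  float b;

  // Reads the members a and b through `this` inside the loop.
  void compute1(float* __restrict__ x, float const* __restrict__ y, int N) const {
    for (int i = 0; i != N; ++i)
      x[i] = a + b * y[i];
  }

  // Workaround from the report: copy the members to locals first,
  // so the loop body no longer loads through `this`.
  void compute2(float* __restrict__ x, float const* __restrict__ y, int N) const {
    float la = a, lb = b;
    for (int i = 0; i != N; ++i)
      x[i] = la + lb * y[i];
  }
};

// Hypothetical helper (not in the original report): runs both methods on
// the same input and reports whether the outputs are element-wise equal.
bool sameResults(Foo const& foo, int N) {
  assert(N >= 0);
  std::vector<float> y(static_cast<std::size_t>(N));
  std::vector<float> x1(y.size(), 0.0f), x2(y.size(), 0.0f);
  for (int i = 0; i != N; ++i)
    y[i] = 0.5f * static_cast<float>(i);  // values exactly representable in float
  foo.compute1(x1.data(), y.data(), N);
  foo.compute2(x2.data(), y.data(), N);
  return x1 == x2;
}
```

Calling `sameResults(Foo{1.0f, 2.0f}, 1000)` should return true regardless of which of the two loops the compiler manages to vectorize.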
I just upgraded to

c++ -v
Using built-in specs.
COLLECT_GCC=c++
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-apple-darwin10.8.0/4.7.0/lto-wrapper
Target: x86_64-apple-darwin10.8.0
Configured with: ./configure --enable-languages=c,c++,fortran --enable-lto --with-build-config=bootstrap-lto CFLAGS='-O2 -ftree-vectorize -fPIC' CXXFLAGS='-O2 -fPIC -ftree-vectorize -fvisibility-inlines-hidden'
Thread model: posix
gcc version 4.7.0 20110716 (experimental) (GCC)

and it seems to vectorize fine. It was still not OK in
gcc version 4.7.0 20110702 (experimental) (GCC)

It still insists on run-time aliasing checks, though:

test/vectClass.cpp:11: note: Detected interleaving this_9(D)->a and this_9(D)->b
test/vectClass.cpp:11: note: versioning for alias required: can't determine dependence between this_9(D)->a and *D.1537_8
test/vectClass.cpp:11: note: mark for run-time aliasing test between this_9(D)->a and *D.1537_8
test/vectClass.cpp:11: note: versioning for alias required: can't determine dependence between this_9(D)->b and *D.1537_8
test/vectClass.cpp:11: note: mark for run-time aliasing test between this_9(D)->b and *D.1537_8

I hope this will be addressed in PR49774.
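To illustrate what "versioning for alias" in the notes above amounts to, here is a hand-written sketch of the idea: test at run time whether the stores through `x` could overlap the object behind `this`, then pick either a loop with `a` and `b` hoisted or a conservative loop that reloads them. This is only a conceptual model, not the code GCC actually emits; the names `disjoint` and `compute` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

struct Foo {
  float a;
  float b;
  void compute(float* __restrict__ x, float const* __restrict__ y, int N) const;
};

// Hypothetical helper: true if the byte ranges [p, p+pb) and [q, q+qb)
// do not overlap. Addresses are compared as integers.
static bool disjoint(const void* p, std::size_t pb, const void* q, std::size_t qb) {
  const std::uintptr_t ps = reinterpret_cast<std::uintptr_t>(p);
  const std::uintptr_t qs = reinterpret_cast<std::uintptr_t>(q);
  return ps + pb <= qs || qs + qb <= ps;
}

void Foo::compute(float* __restrict__ x, float const* __restrict__ y, int N) const {
  if (N <= 0) return;
  const std::size_t bytes = static_cast<std::size_t>(N) * sizeof(float);
  if (disjoint(this, sizeof(Foo), x, bytes)) {
    // Fast version: stores to x cannot change a or b, so they are
    // loop-invariant and the loop is a straightforward vectorization candidate.
    const float la = a, lb = b;
    for (int i = 0; i != N; ++i)
      x[i] = la + lb * y[i];
  } else {
    // Conservative version: a and b are reloaded on every iteration.
    for (int i = 0; i != N; ++i)
      x[i] = a + b * y[i];
  }
}
```

The run-time test costs a few comparisons per call, which is why a version of the compiler that can prove the accesses independent at compile time is preferable.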
Starting with GCC 6, the loop is vectorized without aliasing checks. Starting with GCC 8, the prologue code has improved as well. So this can be closed as fixed in GCC 6.