-O3 and -ftree-vectorize

Tim Prince n8tm@aol.com
Thu Feb 6 22:21:00 GMT 2014

On 2/6/2014 1:51 PM, Uros Bizjak wrote:
> Hello!
> 4.9 does not enable -ftree-vectorize for -O3 (and Ofast) anymore. Is
> this intentional?
> $/ssd/uros/gcc-build/gcc/xgcc -B /ssd/uros/gcc-build/gcc -O3 -Q
> --help=optimizers
> ...
> -ftree-vectorize                      [disabled]
> ...
I'm seeing vectorization but no output from -ftree-vectorizer-verbose, 
and no dot product vectorization inside omp parallel regions, with gcc, 
g++, or gfortran 4.9.  Primary targets are cygwin64 and linux x86_64.
I've been unable to use -O3 vectorization with gcc, although it works 
with gfortran and g++, so I use gcc -O2 -ftree-vectorize together with 
additional optimization flags which don't break.
I've made source code changes to take advantage of the new vectorization 
with merge() and ? operators; while it's useful for -march=core-avx2, 
it's sometimes a loss for -msse4.1.
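
For illustration only, here is a minimal sketch of the kind of loop the ? operator remark refers to: an element-wise select that the vectorizer may turn into a vector compare-and-blend. The function name select_max and its arguments are hypothetical, not taken from the code discussed in this post, and C99 is assumed for restrict.

```c
#include <stddef.h>

/* Hypothetical example: element-wise select written with the ?: operator.
 * The restrict qualifiers tell the compiler that the arrays do not overlap,
 * which removes one common obstacle to vectorizing the loop. */
void select_max(float *restrict out, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = (a[i] > b[i]) ? a[i] : b[i];
}
```

Whether this form is a gain or a loss depends on the target, as noted above for -march=core-avx2 versus -msse4.1.
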
gcc vectorization with #pragma omp parallel for simd is reasonably 
effective in my tests only on 12 or more cores.
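
As a rough sketch of the construct in question, a dot product under the combined parallel for simd directive could look like the following. The function name dot is hypothetical and this is not the test code from this post. It assumes compilation with -fopenmp; without that flag the pragma is ignored and the loop runs serially.

```c
#include <stddef.h>

/* Hypothetical example: dot product using the combined OpenMP directive.
 * The reduction clause gives each thread and SIMD lane a private partial
 * sum, which is combined into sum when the loop finishes. */
float dot(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    #pragma omp parallel for simd reduction(+:sum)
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
```
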
#pragma omp simd reduction(max: ) is giving correct results but poor 
performance in my tests.
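
A minimal sketch of a max reduction under omp simd, for readers unfamiliar with the clause. The function name max_of and the reduction variable m are hypothetical (the variable is left blank in the text above), and the sketch assumes n is at least 1.

```c
#include <stddef.h>

/* Hypothetical example: maximum of an array with an OpenMP simd reduction.
 * Assumes n >= 1. Without OpenMP enabled the pragma is ignored and the
 * loop still computes the same result serially. */
float max_of(const float *x, size_t n)
{
    float m = x[0];
    #pragma omp simd reduction(max:m)
    for (size_t i = 1; i < n; ++i)
        m = (x[i] > m) ? x[i] : m;
    return m;
}
```
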

You've probably seen my gcc testresults posts.  The one major recent 
improvement is the ability to skip cilkplus tests on targets where it's 
totally unsupported.  Without cilk_for et al., even on "supported" 
targets, cilkplus seems useless.
There are still lots of failing stabs tests on targets where those 
apparently aren't supported.

So there are some mysteries about what the developers intend.  I suppose 
this was posted on gcc list on account of such questions being ignored 
on gcc-help.

Tim Prince
