[Bug c++/80859] Performance Problems with OpenMP 4.5 support

jakub at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed May 24 16:33:00 GMT 2017


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Thorsten Kurth from comment #11)
> yes, you are right. I thought that map(tofrom:XXXX) is the default mapping
> but I might be wrong. In any case, teams is always 1. So this code is

Variables that are neither pointers nor scalars are still implicitly
map(tofrom:XXXX), scalars are implicitly firstprivate(XXXX), and pointers are
implicitly map(alloc:ptr[0:0]).
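
To make those defaults concrete, here is a minimal sketch that spells the
implicit clauses out explicitly.  The function names and the array size are
hypothetical and not taken from the reporter's code.  Because a pointer is by
default only mapped as a zero-length section, the pointer variant maps the
pointed-to data with an explicit array section.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the array is neither a pointer nor a scalar, so it
// would implicitly be map(tofrom: data); the scalar `factor` would
// implicitly be firstprivate.  Both clauses are written out here only to
// show what the defaults amount to.
void scale_array_on_target(double (&data)[8], double factor)
{
    #pragma omp target map(tofrom: data) firstprivate(factor)
    for (std::size_t i = 0; i < 8; ++i)
        data[i] *= factor;
}

// Hypothetical helper: a bare pointer would implicitly be mapped only as
// map(alloc: ptr[0:0]), which transfers no data, so the section ptr[0:n]
// is mapped explicitly.  The scalars `n` and `factor` stay implicitly
// firstprivate.
void scale_pointer_on_target(double* ptr, std::size_t n, double factor)
{
    assert(ptr != nullptr || n == 0);
    #pragma omp target map(tofrom: ptr[0:n])
    for (std::size_t i = 0; i < n; ++i)
        ptr[i] *= factor;
}
```

If no offloading device is configured, or the file is built without -fopenmp,
the loops simply run on the host and produce the same result.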

> basically just data streaming  so there is no need for a detailed
> performance analysis. When I timed the code (not profiling it) the OpenMP
> 4.5 code had a tiny bit more overhead, but not significant. 
> However, we might nevertheless learn from that. 

Which compiler options do you use?  -O2 -fopenmp, -O3 -fopenmp, -Ofast
-fopenmp, something different?  Which ISA choice: -march=native, -mavx2, ...?
The 10x slowdown could most likely be explained by the inner loop being
vectorized in one case and not in the other.  You aren't using #pragma omp
parallel for simd, with which you'd explicitly ask for vectorization even at
-O2 -fopenmp.
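
As an illustration of the explicit request, a streaming loop could be written
roughly as follows.  This is only a sketch: the kernel and its name (axpy) are
assumptions and not the code from this report.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical streaming kernel: y[i] = a * x[i] + y[i].
// The combined construct distributes the iterations across threads and
// additionally asks the compiler to vectorize each thread's chunk.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y)
{
    assert(x.size() == y.size());
    const double* xp = x.data();
    double* yp = y.data();
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(y.size());

    #pragma omp parallel for simd
    for (std::ptrdiff_t i = 0; i < n; ++i)
        yp[i] = a * xp[i] + yp[i];
}
```

Building this with e.g. g++ -O2 -fopenmp -fopt-info-vec should make GCC report
whether the loop was vectorized, which is one way to compare the two variants
of the code.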

