This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug c++/80859] Performance Problems with OpenMP 4.5 support
- From: "jakub at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 24 May 2017 16:33:40 +0000
- Subject: [Bug c++/80859] Performance Problems with OpenMP 4.5 support
- Auto-submitted: auto-generated
- References: <bug-80859-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859
--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Thorsten Kurth from comment #13)
> the compiler options are just -fopenmp. I am sure it does not have anything
> to do with vectorization, as I compare the code runtime with and without
> the target directives and thus vectorization should be the same between
> them. The remaining OpenMP sections are the same. In our work we have not
> seen 10x because of insufficient vectorization, it is usually because of
> cache locality but that is the same for OMP 4.5 and OMP 3 because the loops
> are not touched.
> I do not specify an ISA choice, but I will try specifying KNL now and will
> tell you what the compiler is going to do.
The compiler doesn't optimize by default (i.e. the default is -O0), so
measuring the performance or code size of an -O0 -fopenmp build is completely
uninteresting. At -O0 what matters most is compilation speed, not the quality
of the generated code. For runtime performance of the generated code, only
-O2, -O3 or -Ofast are optimization levels that make sense.
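
To illustrate the point, here is a minimal sketch of a timing harness that could be built once with only -fopenmp (i.e. at the default -O0) and once with -O2 -fopenmp. It is not the reporter's code: the kernel and the names scaled_sum and time_scaled_sum are hypothetical and only stand in for whatever loop is being measured.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical kernel: computes the sum of a * x[i] in parallel.
// Without -fopenmp the pragma is ignored and the loop runs serially.
double scaled_sum(const std::vector<double>& x, double a) {
    const std::size_t n = x.size();
    double sum = 0.0;
    #pragma omp parallel for reduction(+ : sum)
    for (std::size_t i = 0; i < n; ++i) {
        sum += a * x[i];
    }
    return sum;
}

// Runs scaled_sum once, stores its value in `result`, and returns
// the elapsed wall-clock time in seconds.
double time_scaled_sum(const std::vector<double>& x, double a, double& result) {
    const auto start = std::chrono::steady_clock::now();
    result = scaled_sum(x, a);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Assuming a small driver in a file such as bench.cpp (a hypothetical name) calls time_scaled_sum on a large vector, building it with "g++ -fopenmp bench.cpp" measures unoptimized code, while "g++ -O2 -fopenmp bench.cpp" measures the kind of build whose runtime is worth comparing.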