This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug c++/80859] New: Performance Problems with OpenMP 4.5 support
- From: "thorstenkurth at me dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 22 May 2017 20:33:50 +0000
- Subject: [Bug c++/80859] New: Performance Problems with OpenMP 4.5 support
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80859
Bug ID: 80859
Summary: Performance Problems with OpenMP 4.5 support
Product: gcc
Version: 6.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: thorstenkurth at me dot com
Target Milestone: ---
Dear Sir/Madam,
I am working on the Cori HPC system, a Cray XC-40 with Intel Xeon Phi 7250
processors. I believe I have found a performance "bug" when using the OpenMP 4.5
target directives. It seems to me that the GNU compiler generates unnecessary
move and push instructions when a #pragma omp target region is present but no
offloading is used.
I have attached a test case to illustrate the problem. Please build
nested_test_omp_4dot5.x in the attached directory (don't be confused by the
name, I am not using nested OpenMP here). Then go into the corresponding .cpp
file, comment out the target-related directives (target teams and distribute),
compile again, and compare the assembly code. The code with the target
directives has more pushes and moves than the one without. I believe I have
already placed the output of that process in the directory: the files ending
in .as.
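To make the comparison above concrete, here is a minimal sketch of the kind of
loop involved. It is not the attached test case: the function name scale and
the USE_TARGET macro are hypothetical and only serve to switch between the
target variant and the plain OpenMP variant of the same loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical kernel (not taken from the attached test case):
// scales every element of v by a, in place.
// Define USE_TARGET to enable the target-related directives;
// leave it undefined to get the plain OpenMP version of the same loop.
void scale(std::vector<double>& v, double a) {
    double* p = v.data();
    const std::size_t n = v.size();
#ifdef USE_TARGET
    // With no offload device available, this region runs on the host.
    #pragma omp target teams distribute parallel for map(tofrom: p[0:n])
#else
    #pragma omp parallel for
#endif
    for (std::size_t i = 0; i < n; ++i) {
        p[i] *= a;
    }
}
```

Assuming a GCC build with OpenMP support, the two variants could be compared by
compiling with something like g++ -O2 -fopenmp -S, once with and once without
-DUSE_TARGET, and then diffing the generated assembly.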
The performance overhead is marginal here, but I am currently working on a
Department of Energy performance portability project and I am exploring the
usefulness of OpenMP 4.5. The code we are retargeting is a Geometric Multigrid
solver in the BoxLib/AMReX framework, and there the overhead is significant: I
observed as much as a 10x slowdown accumulated throughout the app. That code is
bigger, so I do not want to demonstrate it here, but I could send you an
invitation to the GitHub repo if requested. In my opinion, if no offloading is
used, the compiler should ignore the target region statements and default to
plain OpenMP.
Please let me know what you think.
Best Regards
Thorsten Kurth
National Energy Research Scientific Computing Center
Lawrence Berkeley National Laboratory