[Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune

marxin at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Sep 8 15:54:18 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #10 from Martin Liška <marxin at gcc dot gnu.org> ---
All right, I understand what goes wrong. The benchmark builds 2 binaries: wrf_r
and diffwrf_521. Both of them contain pretty much the same objects that *are*
built twice:

gfortran -c -o module_mp_wsm5.fppized.o -I. -I./netcdf/include -I./inc -O2
-march=native -std=legacy -fprofile-generate -fconvert=big-endian -fno-openmp
-g0 module_mp_wsm5.fppized.f90

Then wrf_r is trained and module_mp_wsm5.fppized.gcda is properly created.
But then diffwrf_521 is invoked and the GCDA is overwritten:

$ export GCOV_ERROR_FILE=/tmp/wrf.txt
...
$ grep wsm5 /tmp/wrf.txt
libgcov profiling error:/home/marxin/Programming/cpu2017/benchspec/CPU/521.wrf_r/build/build_peak_gcc-m64.0000/module_mp_wsm5.fppized.gcda:overwriting an existing profile data with a different timestamp

That explains why we end up with a profile that has a relatively low
sum_max=4450478: as shown, the profile comes from the verification binary
diffwrf_521, not from the wrf_r training run.
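
To see which other objects are hit by the same problem, the GCOV_ERROR_FILE
log can be scanned for every such message. A minimal sketch, assuming the log
holds one libgcov message per line in the "libgcov profiling
error:<path>:overwriting ..." form; the function name list_overwritten_gcda
is hypothetical, and /tmp/wrf.txt is just the default taken from the export
above:

```shell
# Print each .gcda path that libgcov reported as overwritten, once.
# Assumption: one message per line, formatted as
#   libgcov profiling error:<path>:overwriting an existing profile data ...
list_overwritten_gcda() {
    log="${1:-/tmp/wrf.txt}"
    grep 'overwriting an existing profile data' "$log" \
        | sed -e 's/^libgcov profiling error://' -e 's/:overwriting.*$//' \
        | sort -u
}
```

Running list_overwritten_gcda with no argument reads /tmp/wrf.txt; every
path it prints is a profile that was written by more than one binary.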

I don't have an easy solution for that. Maybe we can somehow drop
-fprofile-generate for the diffwrf_521 binary. Is that possible?

