[Bug gcov-profile/90364] 521.wrf_r is 8-17% slower with PGO at -Ofast and native march/mtune
marxin at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Sep 8 15:54:18 GMT 2021
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90364
Martin Liška <marxin at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |marxin at gcc dot gnu.org
--- Comment #10 from Martin Liška <marxin at gcc dot gnu.org> ---
All right, I understand what goes wrong. The benchmark builds 2 binaries: wrf_r
and diffwrf_521. Both of them contain pretty much the same objects that *are*
built twice:
gfortran -c -o module_mp_wsm5.fppized.o -I. -I./netcdf/include -I./inc -O2
-march=native -std=legacy -fprofile-generate -fconvert=big-endian -fno-openmp
-g0 module_mp_wsm5.fppized.f90
Then wrf_r is trained and module_mp_wsm5.fppized.gcda is properly created.
But then diffwrf_521 is invoked and the GCDA is overwritten:
$ export GCOV_ERROR_FILE=/tmp/wrf.txt
...
$ grep wsm5 /tmp/wrf.txt
libgcov profiling error:/home/marxin/Programming/cpu2017/benchspec/CPU/521.wrf_r/build/build_peak_gcc-m64.0000/module_mp_wsm5.fppized.gcda:overwriting an existing profile data with a different timestamp
That explains why we end up with a profile that has a relatively low
sum_max=4450478: as shown, the profile comes from the verification binary
diffwrf_521.
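
To see how many objects are affected, not just wsm5, the error log can be
summarized. This is only a sketch: the helper name list_overwritten_gcda is
made up for illustration, and it assumes every report is a single line of the
form "libgcov profiling error:<path>:overwriting ..." as in the output above.

```shell
# Hypothetical helper: print each .gcda path that libgcov reported as
# overwritten, once, sorted. Assumes one report per line in the log.
list_overwritten_gcda() {
    log_file="$1"
    grep 'overwriting an existing profile data' "$log_file" \
        | sed -e 's/^libgcov profiling error://' -e 's/:overwriting.*$//' \
        | sort -u
}

# Example (path taken from the GCOV_ERROR_FILE setting above):
#   list_overwritten_gcda /tmp/wrf.txt
```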
I don't have an easy solution for that. Maybe we can somehow drop
-fprofile-generate for the diffwrf_521 binary. Is it possible?
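
If changing the build flags is not possible, one untested alternative might be
to redirect where diffwrf_521 writes its profile data. libgcov honours the
GCOV_PREFIX environment variable, which prepends a directory to the absolute
.gcda paths recorded in the objects. The sketch below assumes the harness lets
us wrap the verification command; the wrapper name and the target directory
are made up for illustration.

```shell
# Hypothetical wrapper: run a command with GCOV_PREFIX set only for that
# command, so its .gcda files land under profile_dir instead of overwriting
# the ones in the build directory.
run_with_separate_profile() {
    profile_dir="$1"
    shift
    mkdir -p "$profile_dir"
    env GCOV_PREFIX="$profile_dir" "$@"
}

# Example (directory name is an assumption; diffwrf_521 arguments omitted):
#   run_with_separate_profile /tmp/diffwrf-profile ./diffwrf_521
```

With this, the profile written during wrf_r training would stay untouched in
the build directory, and the diffwrf_521 data could simply be discarded.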