I built spec 2006 on a DD2.2 power9 and ran it. I noticed that the milc benchmark was 2% slower using either the trunk or the GCC 8 branch (subversion id 262483) if I compiled the code using -mcpu=power9 compared to -mcpu=power8.
Using trunk on a dedicated DD2.2 power9, I get the following performance comparisons:

            -mcpu=power8   -mcpu=power9   delta     Percent
            28.92          28.97
            28.37          28.99
            28.13          28.26
            29.06          28.12
            28.8           28.23
            28.9           28.69
            28.37          28.48
            28.3           28.08
average     28.60625       28.4775        0.12875   0.45%
Using the GCC8 branch, svn version id 262483 (the same version tested by Michael), I'm getting the following results:

            -mcpu=power8   -mcpu=power9   delta     Percent
            28.57          28.79
            28.41          28.61
            28.54          28.21
            28.53          28.55
            29.02          28.59
            28.54          27.34
            28.25          26.63
            28.56          29.13
average     28.5525        28.23125       0.32125   1.13%

As with my trunk measurements, I'm not seeing a 2% difference. And I am seeing that targeting power9 produces slightly better performance than targeting power8.

It may be that we're running with different optimization flags. I used:

OPTIMIZE = -Ofast -mcpu=power9 (or -mcpu=power8)
LDOPT = -m64 -Wl,-q -Wl,-rpath=%{BASE_DIR}/lib64

I'm inclined to close this issue unless Michael can point me to a different set of options to explore...
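For anyone who wants to re-check the summary rows above, here is a minimal Python sketch that recomputes the averages, the delta, and the percent from the per-run numbers. The function name `summarize` and the variable names are hypothetical, and the sketch assumes the percent is the delta relative to the -mcpu=power8 average, which matches the figures reported above.

```python
from statistics import fmean


def summarize(power8, power9):
    """Return (mean8, mean9, delta, percent) for two lists of per-run numbers.

    delta is mean8 - mean9; percent is delta relative to mean8 (an assumption
    about how the summary row was derived).
    """
    mean8 = fmean(power8)
    mean9 = fmean(power9)
    delta = mean8 - mean9
    percent = delta / mean8 * 100.0
    return mean8, mean9, delta, percent


# Per-run numbers copied from the trunk measurements.
trunk_p8 = [28.92, 28.37, 28.13, 29.06, 28.8, 28.9, 28.37, 28.3]
trunk_p9 = [28.97, 28.99, 28.26, 28.12, 28.23, 28.69, 28.48, 28.08]

# Per-run numbers copied from the GCC 8 branch measurements.
gcc8_p8 = [28.57, 28.41, 28.54, 28.53, 29.02, 28.54, 28.25, 28.56]
gcc8_p9 = [28.79, 28.61, 28.21, 28.55, 28.59, 27.34, 26.63, 29.13]

for label, p8, p9 in (("trunk", trunk_p8, trunk_p9), ("gcc8", gcc8_p8, gcc8_p9)):
    m8, m9, delta, pct = summarize(p8, p9)
    print(f"{label}: power8={m8:.5f} power9={m9:.5f} delta={delta:.5f} ({pct:.2f}%)")
```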
The options I use for spec are:

    -O3 -fpeel-loops -funroll-loops -ftree-vectorize -fvect-cost-model
    -msave-toc-indirect -mno-pointers-to-nested-functions
    -fno-aggressive-loop-optimizations -ffast-math -mveclibabi=mass
    -mrecip=rsqrt -mcpu=power<x>

For C files I use: -fgnu89-inline
For C++ files I use: -std=gnu++98
For Fortran files I use: -fstack-arrays

I use -fno-strict-aliasing on milc (and perlbench) due to it playing pointer games that earlier compilers would generate the wrong code for. If memory serves, the -fno-strict-aliasing may not show the bug on power{7,8,9} systems. I know in the perlbench case, the code in spec violates the ISO C standard. I don't recall what the milc code is.

I use -fno-aggressive-loop-optimizations because some of the benchmarks as written go beyond the end of arrays, and GCC over-optimizes these.

I use version 8.1.3 of the MASS library. However, milc is not one of the benchmarks that heavily use the math library, so you can omit using MASS and -mveclibabi=mass.
There are aspects of Michael's recent comment that I may not fully understand. I checked the source for milc, and it is C, so I added -fgnu89-inline to the list of OPTIMIZE options. Then I reran my tests with gcc8 (svn version 262483) on a DD2.2 power9 machine.

OPTIMIZE = -O3 -fpeel-loops -funroll-loops -ftree-vectorize -fvect-cost-model -fno-strict-aliasing -msave-toc-indirect -mno-pointers-to-nested-functions -fno-aggressive-loop-optimizations -ffast-math -mveclibabi=mass -mrecip=rsqrt -fgnu89-inline -mcpu=power9
(vs. -mcpu=power8)
LDOPT = -m64 -Wl,-q -Wl,-rpath=%{BASE_DIR}/lib64

I'm still not seeing the performance degradation Michael saw. Here are my most recent results:

            gcc8     gcc9     delta    % delta
            28.79    28.14
            29.01    28.84
            28.51    28.5
            28.55    28.39
            29.02    29.07
            29.1     28.51
average     28.83    28.575   0.255    0.88%

Does anyone see anything I may be doing wrong?
I apologize for an error in the previous comment. The two columns should have been labeled -mcpu=power8 (left) and -mcpu=power9 (right) instead of gcc8 and gcc9.
I should also clarify, regarding all of the above comments, that the numbers I have been reporting are the spec ratios. I had mistakenly assumed that smaller values represented better performance, when in fact a larger ratio is better. So some of my "interpretation remarks" are incorrect. Still, my measurements do not show the 2% difference that Michael observed, so there remains a question of whether there is enough of a performance change to merit further exploration.
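As a rough way to weigh that question, here is a small Python sketch that treats the numbers as ratios (larger is better) and puts the power9 slowdown next to the run-to-run spread of each column. It uses the six runs from the most recent measurements. The helper names `percent_slower` and `noise_percent` are hypothetical, and treating the two columns as independent samples is an assumption; this is a sanity check, not a proper significance test.

```python
from statistics import fmean, stdev


def percent_slower(baseline, candidate):
    """Percent by which the candidate's mean ratio falls below the baseline's.

    Ratios are larger-is-better, so a positive result means the candidate
    is slower than the baseline.
    """
    base_mean = fmean(baseline)
    return (base_mean - fmean(candidate)) / base_mean * 100.0


def noise_percent(samples):
    """Sample standard deviation expressed as a percent of the mean."""
    return stdev(samples) / fmean(samples) * 100.0


# Ratios copied from the most recent results (left column is -mcpu=power8,
# right column is -mcpu=power9).
power8 = [28.79, 29.01, 28.51, 28.55, 29.02, 29.1]
power9 = [28.14, 28.84, 28.5, 28.39, 29.07, 28.51]

print(f"power9 slower than power8 by {percent_slower(power8, power9):.2f}%")
print(f"run-to-run spread, power8: {noise_percent(power8):.2f}% of mean")
print(f"run-to-run spread, power9: {noise_percent(power9):.2f}% of mean")
```

If the slowdown is about the same size as the spread of individual runs, more runs would be needed before drawing a conclusion in either direction.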
Not confirmed at this time. Let's close it until we have something more definitive to look at.