[Bug target/94406] 503.bwaves_r is 11% slower on Zen2 CPUs than GCC 9 with -Ofast -march=native

jamborm at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon Mar 30 16:01:30 GMT 2020


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94406

--- Comment #1 from Martin Jambor <jamborm at gcc dot gnu.org> ---
For the record, here are the profiles collected for both the traditional
"cycles:u" event and the (originally unintended) "ls_stlf:u" event:

-Ofast -march=native -mtune=native

# Samples: 894K of event 'cycles:u'
# Event count (approx.): 735979402525
#
# Overhead       Samples  Command          Shared Object                 Symbol
# ........  ............  ...............  ............................  .................................
#
    67.18%        599542  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] mat_times_vec_
    11.40%        102686  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] shell_
    11.37%        101388  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] bi_cgstab_block_
     6.95%         62694  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] jacobian_
     1.88%         16957  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] flux_
     1.01%          9023  bwaves_r_peak.e  libc-2.31.so                  [.] __memset_avx2_unaligned


# Samples: 769K of event 'ls_stlf:u'
# Event count (approx.): 154704730574
#
# Overhead       Samples  Command          Shared Object                 Symbol
# ........  ............  ...............  ............................  ....................................
#
    94.59%        612921  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] mat_times_vec_
     1.83%         88259  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] shell_
     1.12%         13615  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] flux_
     1.11%         43093  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] jacobian_
     1.05%          8746  bwaves_r_peak.e  libc-2.31.so                  [.] __memset_avx2_unaligned



-Ofast -march=native -mtune=native --param vect-epilogues-nomask=0

# Samples: 816K of event 'cycles:u'
# Event count (approx.): 671104061807
#
# Overhead       Samples  Command          Shared Object                 Symbol
# ........  ............  ...............  ............................  .................................
#
    64.07%        521532  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] mat_times_vec_
    12.50%        102670  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] shell_
    12.39%        100777  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] bi_cgstab_block_
     7.60%         62641  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] jacobian_
     2.06%         16925  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] flux_
     1.17%          9531  bwaves_r_peak.e  libc-2.31.so                  [.] __memset_avx2_unaligned

# Samples: 705K of event 'ls_stlf:u'
# Event count (approx.): 55009340780
#
# Overhead       Samples  Command          Shared Object                 Symbol
# ........  ............  ...............  ............................  ..............................
#
    86.26%        532930  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] mat_times_vec_
     5.15%         88270  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] shell_
     3.17%         13696  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] flux_
     3.06%         57149  bwaves_r_peak.e  bwaves_r_peak.experiment-m64  [.] jacobian_
     1.59%          9226  bwaves_r_peak.e  libc-2.31.so                  [.] __memset_avx2_unaligned
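
To put the two runs side by side, the "Event count (approx.)" totals from the
reports above can be compared directly. The following is only a minimal
sketch: the four numbers are copied from the perf headers, while the
dictionary layout, the "default"/"nomask0" labels and the ratio() helper are
made up here for illustration and are not part of perf or of the benchmark.

```python
# Event totals copied from the "Event count (approx.)" lines of the
# perf reports in this comment. "default" is the run without the extra
# --param, "nomask0" is the run with --param vect-epilogues-nomask=0.
# These labels and this layout are illustrative assumptions.
EVENT_COUNTS = {
    "cycles:u": {"default": 735979402525, "nomask0": 671104061807},
    "ls_stlf:u": {"default": 154704730574, "nomask0": 55009340780},
}


def ratio(event):
    """Return the default total divided by the nomask0 total for one event."""
    counts = EVENT_COUNTS[event]
    return counts["default"] / counts["nomask0"]


if __name__ == "__main__":
    for event in EVENT_COUNTS:
        print(f"{event}: default / vect-epilogues-nomask=0 = {ratio(event):.3f}")
```

With these totals the default build spends roughly 10% more cycles and
records about 2.8 times as many ls_stlf events as the build with
--param vect-epilogues-nomask=0.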

