[Bug target/85829] [8/9 Regression] PARTIAL_REG_DEPENDENCY and MOVX were disabled for Haswell and newer processors

hjl at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu May 31 15:03:00 GMT 2018


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85829

--- Comment #5 from Sebastian Peryt <sebastian.peryt at intel dot com> ---
I have made measurements on HSW comparing
-mtune-ctrl=movx,partial_reg_dependency -Ofast -march=haswell to -Ofast
-mtune=haswell and I see improvements on EEMBC benchmarks.

automotive
=========
  aifftr01 (default) - goodperf: Runtime improvement of   2.6% (time).
  aiifft01 (default) - goodperf: Runtime improvement of   2.2% (time).

networking
=========
  ip_pktcheckb1m (default) - goodperf: Runtime improvement of   3.8% (time).
  ip_pktcheckb2m (default) - goodperf: Runtime improvement of   5.2% (time).
  ip_pktcheckb4m (default) - goodperf: Runtime improvement of   4.4% (time).
  ip_pktcheckb512k (default) - goodperf: Runtime improvement of   4.2% (time).

telecom
=========
  fft00data_1 (default) - goodperf: Runtime improvement of   8.4% (time).
  fft00data_2 (default) - goodperf: Runtime improvement of   8.6% (time).
  fft00data_3 (default) - goodperf: Runtime improvement of   9.0% (time).

--- Comment #6 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> ---
Author: hjl
Date: Thu May 31 15:02:36 2018
New Revision: 261026

URL: https://gcc.gnu.org/viewcvs?rev=261026&root=gcc&view=rev
Log:
x86: Re-enable partial_reg_dependency and movx for Haswell

r254152 disabled partial_reg_dependency and movx for Haswell and newer
Intel processors.  r258972 restored them for skylake-avx512.  For Haswell,
movx improves performance.  But partial_reg_stall may be better than
partial_reg_dependency in theory.  We will investigate performance impact
of partial_reg_stall vs partial_reg_dependency on Haswell for GCC 9.  In
the meantime, this patch restores both partial_reg_dependency and mox for
Haswell in GCC 8.

On Haswell, improvements for EEMBC benchmarks with

-mtune-ctrl=movx,partial_reg_dependency -Ofast -march=haswell

vs

-Ofast -mtune=haswell

are

automotive
=========
  aifftr01 (default) - goodperf: Runtime improvement of   2.6% (time).
  aiifft01 (default) - goodperf: Runtime improvement of   2.2% (time).

networking
=========
  ip_pktcheckb1m (default) - goodperf: Runtime improvement of   3.8% (time).
  ip_pktcheckb2m (default) - goodperf: Runtime improvement of   5.2% (time).
  ip_pktcheckb4m (default) - goodperf: Runtime improvement of   4.4% (time).
  ip_pktcheckb512k (default) - goodperf: Runtime improvement of   4.2% (time).

telecom
=========
  fft00data_1 (default) - goodperf: Runtime improvement of   8.4% (time).
  fft00data_2 (default) - goodperf: Runtime improvement of   8.6% (time).
  fft00data_3 (default) - goodperf: Runtime improvement of   9.0% (time).

        PR target/85829
        * config/i386/x86-tune.def: Re-enable partial_reg_dependency
        and movx for Haswell.

Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/config/i386/x86-tune.def


More information about the Gcc-bugs mailing list