This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/81614] Should -mtune-ctrl=partial_reg_stall be turned by default?
- From: "hubicka at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sat, 07 Oct 2017 16:30:48 +0000
- Subject: [Bug target/81614] Should -mtune-ctrl=partial_reg_stall be turned by default?
- Auto-submitted: auto-generated
- References: <bug-81614-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81614
--- Comment #8 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
I have tried the attached change on our periodic tester for Haswell. It
switches codegen to something similar to what we do for PentiumPro (assuming
that renaming happens on register parts as opposed to full registers).
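To see the kind of code these knobs affect, one can compile a small test like the sketch below and compare the generated assembly. The function names are made up for illustration and do not come from the bug report; the point is only that both functions do 8-bit or 16-bit arithmetic, which is where the promote_qimode / himode_math decisions apply.

```c
#include <stdint.h>

/* Hypothetical test case: an 8-bit add.  The compiler may emit either a
   byte-sized add on a partial register or promote it to a 32-bit add,
   depending on the tuning flags in effect.  */
uint8_t add_bytes(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Hypothetical test case: a 16-bit multiply by a constant, relevant to
   the 16-bit tuning knobs touched by the patch.  */
uint16_t scale_half(uint16_t x)
{
    return (uint16_t)(x * 3u);
}
```

One way to compare is to build with `gcc -O2 -march=haswell -S` and then again with `-mtune-ctrl=partial_reg_stall` added (the option named in the bug title). Which instructions actually change depends on the GCC revision, so no expected assembly is shown here.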
Relevant run is Oct 6, 2017 20:00 UTC of Czerny at
https://gcc.opensuse.org/gcc-old/SPEC/CFP/sb-czerny-head-64-2006/recent.html
and
https://gcc.opensuse.org/gcc-old/SPEC/CINT/sb-czerny-head-64-2006/recent.html
It seems SPEC neutral. Because it models more closely what the hardware
actually does, perhaps changing it makes sense?
Index: x86-tune.def
===================================================================
--- x86-tune.def (revision 253509)
+++ x86-tune.def (working copy)
@@ -48,7 +48,7 @@
over partial stores. For example preffer MOVZBL or MOVQ to load 8bit
value over movb. */
DEF_TUNE (X86_TUNE_PARTIAL_REG_DEPENDENCY, "partial_reg_dependency",
- m_P4_NOCONA | m_CORE_ALL | m_BONNELL | m_SILVERMONT | m_INTEL
+ m_P4_NOCONA | m_BONNELL | m_SILVERMONT
| m_KNL | m_KNM | m_AMD_MULTIPLE | m_GENERIC)
/* X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY: This knob promotes all store
@@ -467,20 +467,20 @@
In current implementation the partial register stalls are not eliminated
very well - they can be introduced via subregs synthesized by combine
and can happen in caller/callee saving sequences. */
-DEF_TUNE (X86_TUNE_PARTIAL_REG_STALL, "partial_reg_stall", m_PPRO)
+DEF_TUNE (X86_TUNE_PARTIAL_REG_STALL, "partial_reg_stall", m_PPRO | m_CORE_ALL | m_INTEL)
/* X86_TUNE_PROMOTE_QIMODE: When it is cheap, turn 8bit arithmetic to
corresponding 32bit arithmetic. */
DEF_TUNE (X86_TUNE_PROMOTE_QIMODE, "promote_qimode",
- ~m_PPRO)
+ ~(m_PPRO | m_CORE_ALL | m_INTEL))
/* X86_TUNE_PROMOTE_HI_REGS: Same, but for 16bit artihmetic. Again we avoid
partial register stalls on PentiumPro targets. */
-DEF_TUNE (X86_TUNE_PROMOTE_HI_REGS, "promote_hi_regs", m_PPRO)
+DEF_TUNE (X86_TUNE_PROMOTE_HI_REGS, "promote_hi_regs", m_PPRO | m_CORE_ALL | m_INTEL)
/* X86_TUNE_HIMODE_MATH: Enable use of 16bit arithmetic.
On PPro this flag is meant to avoid partial register stalls. */
-DEF_TUNE (X86_TUNE_HIMODE_MATH, "himode_math", ~m_PPRO)
+DEF_TUNE (X86_TUNE_HIMODE_MATH, "himode_math", ~(m_PPRO | m_CORE_ALL | m_INTEL))
/* X86_TUNE_SPLIT_LONG_MOVES: Avoid instructions moving immediates
directly to memory. */
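
For readers unfamiliar with the DEF_TUNE tables, the following is a simplified model of how a processor mask in the third argument can decide whether a knob is on for a given -mtune target. It is a sketch under stated assumptions, not GCC's code: the three-entry CPU list and the names TUNE_TABLE, tune_selectors and tune_enabled are hypothetical, and the two rows only mirror the patched selectors restricted to that toy CPU list.

```c
#include <stdbool.h>

/* Hypothetical processor list; GCC's real list is much longer.  */
enum cpu { CPU_PPRO, CPU_HASWELL, CPU_ZEN, CPU_COUNT };

#define M_PPRO    (1u << CPU_PPRO)
#define M_HASWELL (1u << CPU_HASWELL)
#define M_ZEN     (1u << CPU_ZEN)

/* One row per knob, in the style of x86-tune.def: identifier, option
   name, and the mask of processors for which the knob is enabled.  */
#define TUNE_TABLE(X) \
    X(TUNE_PARTIAL_REG_STALL, "partial_reg_stall", M_PPRO | M_HASWELL) \
    X(TUNE_PROMOTE_QIMODE, "promote_qimode", ~(M_PPRO | M_HASWELL))

/* Expand the table once to get an enum of knob identifiers.  */
enum tune_id {
#define AS_ENUM(id, name, selector) id,
    TUNE_TABLE(AS_ENUM)
#undef AS_ENUM
    TUNE_COUNT
};

/* Expand it again to get the selector mask for each knob.  */
static const unsigned tune_selectors[TUNE_COUNT] = {
#define AS_MASK(id, name, selector) (selector),
    TUNE_TABLE(AS_MASK)
#undef AS_MASK
};

/* A knob is on when the bit of the selected processor is set.  */
static bool tune_enabled(enum tune_id id, enum cpu cpu)
{
    return (tune_selectors[id] & (1u << cpu)) != 0;
}
```

In this model, adding a processor to a positive mask and to the matching negated mask flips both knobs together for that processor, which is the shape of the change the patch makes for m_CORE_ALL and m_INTEL.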