This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch] [x86_64] znver1 enablement
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: "Kumar, Venkataramanan" <Venkataramanan dot Kumar at amd dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 30 Sep 2015 14:40:11 +0200
- Subject: Re: [Patch] [x86_64] znver1 enablement
- Authentication-results: sourceware.org; auth=none
- References: <7794A52CE4D579448B959EED7DD0A4723DD5F914 at satlexdag06 dot amd dot com>
On Wed, Sep 30, 2015 at 12:05 PM, Kumar, Venkataramanan
<Venkataramanan.Kumar@amd.com> wrote:
> Hi Maintainers,
>
> The attached patch enables -march=znver1 (AMD family 17h Zen processor).
>
> Costs and tunings are copied from bdver4, but we will be adjusting them later for znver1.
> Also a basic scheduler description for znver1 is added and we will update this as we get more information.
>
> Testing :
> GCC bootstrap and gcc regression passes on x86_64-pc-linux-gnu.
> GCC bootstrap passed with "make BOOT_CFLAGS= -O2 -g -march=znver1 -mno-adx -mno-mwaitx -mno-clzero -mno-sha -mno-clflushopt -mno-rdseed" on x86_64-pc-linux-gnu .
>
> Built SPEC2006 benchmarks with -march=znver1 and ran it on bdver4 machine.
>
> Wrf and Calculix failed to compile but looks like a general register allocation issue not restricted to -march=znver1.
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67717
>
> ChangeLog:
> * config.gcc (i[34567]86-*-linux* | ...): Add znver1.
> (case ${target}): Add znver1.
> * config/i386/cpuid.h(bit_CLZERO): Define.
> * config/i386/driver-i386.c: (host_detect_local_cpu): Let
> -march=native recognize znver1 processors.
> * config/i386/i386-c.c (ix86_target_macros_internal): Add
> znver1, clzero def_and_undef.
> * config/i386/i386.c (struct processor_costs znver1_cost): New.
> (m_znver1): New definition.
> (m_AMD_MULTIPLE): Includes m_znver1.
> (processor_target_table): Add znver1 entry.
> (ix86_target_string) : Add clzero entry.
> (static const char *const cpu_names): Add znver1 entry.
> (ix86_option_override_internal): Add znver1 instruction sets.
> (PTA_CLZERO) : New definition.
> (ix86_option_override_internal): Handle new clzerooption.
> (ix86_issue_rate): Add znver1.
> (ix86_adjust_cost): Add znver1.
> (get_builtin_code_for_version): Set priority for PROCESSOR_ZNVER1.
> (ia32_multipass_dfa_lookahead): Add znver1.
> (enum processor_model): Add M_AMDFAM17H_znver1.
> (struct _arch_names_table): Add M_AMDFAM17H_znver1.
> (has_dispatch): Add znver1.
> * config/i386/i386.h (TARGET_znver1): New definition.
> (TARGET_CLZERO): Define.
> (TARGET_CLZERO_P): Define.
> (struct ix86_size_cost): Add TARGET_ZNVER1.
> (enum processor_type): Add PROCESSOR_znver1.
> * config/i386/i386.md (define_attr "cpu"): Add znver1.
> (set_attr znver1_decode): New definitions for znver1.
> * config/i386/i386.opt (flag_dispatch_scheduler): Add znver1.
> (mclzero): New.
> * config/i386/mmx.md (set_attr znver1_decode): New definitions
> for znver1.
> * config/i386/sse.md (set_attr znver1_decode): Likewise.
> * config/i386/x86-tune.def: Add znver1 tunings.
> * config/i386/znver1.md: Introduce znver1 cpu and include new md file.
> * gcc/doc/extend.texi: Add details about znver1.
> * gcc/doc/invoke.texi: Add details about znver1.
>
> Ok for trunk?
Please remove changes to get_builtin_code_for_version and
fold_builtin_cpu. They should be committed in a future patch, together
with relevant libgcc changes to libgcc/config/i386/cpuinfo.c.
+++ b/gcc/config.gcc
@@ -592,7 +592,7 @@ pentium4 pentium4m pentiumpro prescott iamcu"
# 64-bit x86 processors supported by --with-arch=. Each processor
# MUST be separated by exactly one space.
x86_64_archs="amdfam10 athlon64 athlon64-sse3 barcelona bdver1 bdver2 \
-bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \
+bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona znver1 \
core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \
Please group the new processor with other AMD processors.
+++ b/gcc/config/i386/i386-c.c
| PTA_MOVBE | PTA_MWAITX},
- {"btver1", PROCESSOR_BTVER1, CPU_GENERIC,
+ {"znver1", PROCESSOR_ZNVER1, CPU_ZNVER1,
wrong indetation.
+++ b/gcc/config/i386/driver-i386.c
@@ -414,6 +414,7 @@ const char *host_detect_local_cpu (int argc, const
char **argv)
unsigned int has_avx512dq = 0, has_avx512bw = 0, has_avx512vl = 0;
unsigned int has_avx512vbmi = 0, has_avx512ifma = 0, has_clwb = 0;
unsigned int has_pcommit = 0, has_mwaitx = 0;
+ unsigned int has_clzero = 0;
This option should be added to the options string. Please see the end
of driver-i386.c on how other options are precessed.
+++ b/gcc/config/i386/i386.opt
@@ -569,8 +569,8 @@ the function.
mdispatch-scheduler
Target RejectNegative Var(flag_dispatch_scheduler)
-Do dispatch scheduling if processor is bdver1 or bdver2 or bdver3 or
bdver4 and Haifa scheduling
-is selected.
+Do dispatch scheduling if processor is bdver1 or bdver2 or bdver3 or bdver4
+or znver1 and Haifa scheduling is selected.
Please reword the above with commas: ... if processor is bdver1,
bdver2, bdver3, bdver4 or znver1 ...
+++ b/gcc/config/i386/x86-tune.def
@@ -59,7 +59,7 @@ DEF_TUNE (X86_TUNE_PARTIAL_REG_DEPENDENCY,
"partial_reg_dependency",
that can be partly masked by careful scheduling of moves. */
DEF_TUNE (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY, "sse_partial_reg_dependency",
m_PPRO | m_P4_NOCONA | m_CORE_ALL | m_BONNELL | m_AMDFAM10
- | m_BDVER | m_GENERIC)
+ | m_BDVER | m_GENERIC | m_ZNVER1)
[...]
Please leave m_GENERIC as the last entry. Also, please group AMD masks
together in some consistent way throughout the x86-tune.def file.
+++ b/gcc/config/i386/znver1.md
+;; SSE converts
+(define_insn_reservation "znver1_ssecvtsf_si_load" 9
"SSE conversions" ?
+++ b/gcc/doc/extend.texi
@@ -16924,6 +16924,9 @@ AMD Family 15h Bulldozer version 3.
@item bdver4
AMD Family 15h Bulldozer version 4.
+@item znver1
+AMD Family 17h Zen version 1.
+
@item btver2
AMD Family 16h CPU.
@end table
Please remove the not yet implemented change above.
Otherwise, the patch looks OK. Please repost it with the above changes
for the final approval.
Uros.