This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch] [x86_64] znver1 enablement


On Wed, Sep 30, 2015 at 12:05 PM, Kumar, Venkataramanan
<Venkataramanan.Kumar@amd.com> wrote:
> Hi Maintainers,
>
> The attached patch enables -march=znver1 (AMD family 17h Zen processor).
>
> Costs and tunings are copied from bdver4,  but we will be adjusting them later for znver1.
> Also a basic scheduler description for znver1 is added and we will update this as we get more information.
>
> Testing :
> GCC bootstrap and gcc regression passes on x86_64-pc-linux-gnu.
> GCC bootstrap passed with  "make BOOT_CFLAGS= -O2 -g -march=znver1 -mno-adx -mno-mwaitx -mno-clzero -mno-sha -mno-clflushopt -mno-rdseed" on x86_64-pc-linux-gnu .
>
> Built SPEC2006 benchmarks with -march=znver1 and ran it on bdver4 machine.
>
> Wrf and Calculix failed to compile but looks like a general register allocation issue not restricted to -march=znver1.
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67717
>
> ChangeLog:
>                 * config.gcc (i[34567]86-*-linux* | ...): Add znver1.
>                 (case ${target}): Add znver1.
>                 * config/i386/cpuid.h(bit_CLZERO):  Define.
>                 * config/i386/driver-i386.c: (host_detect_local_cpu): Let
>                 -march=native recognize znver1 processors.
>                 * config/i386/i386-c.c (ix86_target_macros_internal): Add
>                 znver1, clzero def_and_undef.
>                 * config/i386/i386.c (struct processor_costs znver1_cost): New.
>                 (m_znver1): New definition.
>                 (m_AMD_MULTIPLE): Includes m_znver1.
>                 (processor_target_table): Add znver1 entry.
>                 (ix86_target_string) : Add clzero entry.
>                 (static const char *const cpu_names): Add znver1 entry.
>                 (ix86_option_override_internal): Add znver1 instruction sets.
>                 (PTA_CLZERO) :  New definition.
>                 (ix86_option_override_internal): Handle new clzerooption.
>                 (ix86_issue_rate): Add znver1.
>                 (ix86_adjust_cost): Add znver1.
>                 (get_builtin_code_for_version): Set priority for PROCESSOR_ZNVER1.
>                 (ia32_multipass_dfa_lookahead): Add znver1.
>                 (enum processor_model): Add M_AMDFAM17H_znver1.
>                 (struct _arch_names_table): Add M_AMDFAM17H_znver1.
>                 (has_dispatch): Add znver1.
>                 * config/i386/i386.h (TARGET_znver1): New definition.
>                 (TARGET_CLZERO): Define.
>                 (TARGET_CLZERO_P): Define.
>                 (struct ix86_size_cost): Add TARGET_ZNVER1.
>                 (enum processor_type): Add PROCESSOR_znver1.
>                 * config/i386/i386.md (define_attr "cpu"): Add znver1.
>                 (set_attr znver1_decode): New definitions for znver1.
>                 * config/i386/i386.opt (flag_dispatch_scheduler): Add znver1.
>                 (mclzero): New.
>                 * config/i386/mmx.md (set_attr znver1_decode): New definitions
>                 for znver1.
>                 * config/i386/sse.md (set_attr znver1_decode): Likewise.
>                 * config/i386/x86-tune.def:  Add znver1 tunings.
>                 * config/i386/znver1.md: Introduce znver1 cpu and include new md file.
>                 * gcc/doc/extend.texi: Add details about znver1.
>                 * gcc/doc/invoke.texi: Add details about znver1.
>
> Ok for trunk?

Please remove changes to get_builtin_code_for_version and
fold_builtin_cpu. They should be committed in a future patch, together
with relevant libgcc changes to libgcc/config/i386/cpuinfo.c.

+++ b/gcc/config.gcc
@@ -592,7 +592,7 @@ pentium4 pentium4m pentiumpro prescott iamcu"
 # 64-bit x86 processors supported by --with-arch=.  Each processor
 # MUST be separated by exactly one space.
 x86_64_archs="amdfam10 athlon64 athlon64-sse3 barcelona bdver1 bdver2 \
-bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \
+bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona znver1 \
 core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \

Please group the new processor with other AMD processors.

+++ b/gcc/config/i386/i386-c.c

     | PTA_MOVBE | PTA_MWAITX},
-      {"btver1", PROCESSOR_BTVER1, CPU_GENERIC,
+     {"znver1", PROCESSOR_ZNVER1, CPU_ZNVER1,

wrong indetation.

+++ b/gcc/config/i386/driver-i386.c
@@ -414,6 +414,7 @@ const char *host_detect_local_cpu (int argc, const
char **argv)
   unsigned int has_avx512dq = 0, has_avx512bw = 0, has_avx512vl = 0;
   unsigned int has_avx512vbmi = 0, has_avx512ifma = 0, has_clwb = 0;
   unsigned int has_pcommit = 0, has_mwaitx = 0;
+  unsigned int has_clzero = 0;

This option should be added to the options string. Please see the end
of driver-i386.c on how other options are precessed.

+++ b/gcc/config/i386/i386.opt
@@ -569,8 +569,8 @@ the function.

 mdispatch-scheduler
 Target RejectNegative Var(flag_dispatch_scheduler)
-Do dispatch scheduling if processor is bdver1 or bdver2 or bdver3 or
bdver4 and Haifa scheduling
-is selected.
+Do dispatch scheduling if processor is bdver1 or bdver2 or bdver3 or bdver4
+or znver1 and Haifa scheduling is selected.

Please reword the above with commas: ... if processor is bdver1,
bdver2, bdver3, bdver4 or znver1 ...

+++ b/gcc/config/i386/x86-tune.def
@@ -59,7 +59,7 @@ DEF_TUNE (X86_TUNE_PARTIAL_REG_DEPENDENCY,
"partial_reg_dependency",
    that can be partly masked by careful scheduling of moves.  */
 DEF_TUNE (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY, "sse_partial_reg_dependency",
           m_PPRO | m_P4_NOCONA | m_CORE_ALL | m_BONNELL | m_AMDFAM10
-      | m_BDVER | m_GENERIC)
+      | m_BDVER | m_GENERIC | m_ZNVER1)

[...]

Please leave m_GENERIC as the last entry. Also, please group AMD masks
together in some consistent way throughout the x86-tune.def file.

+++ b/gcc/config/i386/znver1.md

+;; SSE converts
+(define_insn_reservation "znver1_ssecvtsf_si_load" 9

"SSE conversions" ?

+++ b/gcc/doc/extend.texi
@@ -16924,6 +16924,9 @@ AMD Family 15h Bulldozer version 3.
 @item bdver4
 AMD Family 15h Bulldozer version 4.

+@item znver1
+AMD Family 17h Zen version 1.
+
 @item btver2
 AMD Family 16h CPU.
 @end table

Please remove the not yet implemented change above.

Otherwise, the patch looks OK. Please repost it with the above changes
for the final approval.

Uros.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]