[PATCH,i386] fma4 addition for bdver2
Gopalasubramanian, Ganesh
Ganesh.Gopalasubramanian@amd.com
Mon Sep 10 05:23:00 GMT 2012
Hi,
The second change (in config/i386/driver-i386.c, host_detect_local_cpu) is not reflected in svn revision 191109.
Since FMA instruction selection is handled in i386.c and i386.md, we need not disable the flag in the driver.
Let me know your opinion.
Regards
Ganesh
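
For readers following the driver change, here is a minimal sketch of the has_fma4 decision before and after the patch. It is not the code in driver-i386.c: the function names and the vendor enum are hypothetical, and the bit masks are defined locally (FMA is bit 12 of ECX from CPUID leaf 1, FMA4 is bit 16 of ECX from leaf 0x80000001).

```c
/* Sketch only: models the has_fma4 decision, not the real driver code.  */
#define BIT_FMA  (1u << 12)  /* CPUID leaf 1, ECX */
#define BIT_FMA4 (1u << 16)  /* CPUID leaf 0x80000001, ECX */

/* Hypothetical stand-in for the driver's vendor signature check.  */
enum vendor { VENDOR_OTHER, VENDOR_AMD };

/* Before the patch: on AMD parts that report both FMA and FMA4,
   the driver cleared has_fma4.  */
int has_fma4_before(enum vendor v, unsigned ecx_leaf1, unsigned ecx_ext)
{
  int has_fma = (ecx_leaf1 & BIT_FMA) != 0;
  int has_fma4 = (ecx_ext & BIT_FMA4) != 0;
  if (v == VENDOR_AMD && has_fma4 && has_fma)
    has_fma4 = 0;
  return has_fma4;
}

/* After the patch: has_fma4 simply mirrors the CPUID bit.  */
int has_fma4_after(unsigned ecx_ext)
{
  return (ecx_ext & BIT_FMA4) != 0;
}
```

With both bits set on an AMD part, has_fma4_before returns 0 while has_fma4_after returns 1, which is the behavioural difference the second change introduces.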
-----Original Message-----
From: Gopalasubramanian, Ganesh
Sent: Wednesday, September 05, 2012 3:41 PM
To: gcc-patches@gcc.gnu.org
Cc: Uros Bizjak (ubizjak@gmail.com)
Subject: [PATCH,i386] fma4 addition for bdver2
Hello,
The bdver2 target implements both the FMA4 and FMA3 ISAs.
FMA3 is selected by default.
This patch enables the use of FMA4 intrinsics on bdver2 targets.
Is it OK for trunk?
Regards
Ganesh
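
As an illustration of the kind of FMA4 intrinsic use the message refers to, a small sketch follows. The helper name madd_scalar is hypothetical. The sketch assumes the compiler defines __FMA4__ when FMA4 is enabled (for example with -mfma4) and falls back to a plain multiply-add otherwise, so it compiles either way.

```c
#if defined(__FMA4__)
#include <x86intrin.h>  /* pulls in the FMA4 intrinsics when FMA4 is enabled */
#endif

/* Hypothetical helper: computes a * b + c on scalar floats.  */
float madd_scalar(float a, float b, float c)
{
#if defined(__FMA4__)
  /* _mm_macc_ss performs a fused multiply-add on the low element.  */
  __m128 r = _mm_macc_ss(_mm_set_ss(a), _mm_set_ss(b), _mm_set_ss(c));
  return _mm_cvtss_f32(r);
#else
  /* Fallback when FMA4 is not enabled at compile time (not fused).  */
  return a * b + c;
#endif
}
```

For example, madd_scalar(2.0f, 3.0f, 4.0f) yields 10.0f on either path, since the inputs are exactly representable.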
2012-09-05 Ganesh Gopalasubramanian <Ganesh.Gopalasubramanian@amd.com>
* config/i386/i386.md : Comments on fma4 instruction
selection reflect requirement on register pressure based
cost model.
* config/i386/driver-i386.c (host_detect_local_cpu): fma4
flag is set-reset as informed by the cpuid flag.
* config/i386/i386.c (processor_alias_table): fma4
flag is enabled for bdver2.
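
The interaction between the i386.c and i386.md entries above can be sketched as follows. This is a simplified model, not GCC source: the ISA_* bits and function names are hypothetical, and only the condition "TARGET_FMA4 && !TARGET_FMA" is taken from the patch.

```c
/* Hypothetical ISA flag bits for illustration only.  */
enum { ISA_FMA = 1u << 0, ISA_FMA4 = 1u << 1 };

/* Mirrors (eq_attr "isa" "fma") (symbol_ref "TARGET_FMA").  */
int fma_alternative_enabled(unsigned isa)
{
  return (isa & ISA_FMA) != 0;
}

/* Mirrors (eq_attr "isa" "fma4")
   (symbol_ref "TARGET_FMA4 && !TARGET_FMA").  */
int fma4_alternative_enabled(unsigned isa)
{
  return (isa & ISA_FMA4) != 0 && (isa & ISA_FMA) == 0;
}
```

Under this model, a target with both flags set, as bdver2 has after the patch, still gets the fma alternatives and not the fma4 ones, which matches the statement that FMA3 is selected by default.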
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md (revision 190830)
+++ gcc/config/i386/i386.md (working copy)
@@ -659,9 +659,11 @@
(eq_attr "isa" "noavx2") (symbol_ref "!TARGET_AVX2")
(eq_attr "isa" "bmi2") (symbol_ref "TARGET_BMI2")
(eq_attr "isa" "fma") (symbol_ref "TARGET_FMA")
- ;; Disable generation of FMA4 instructions for generic code
- ;; since FMA3 is preferred for targets that implement both
- ;; instruction sets.
+ ;; Fma instruction selection has to be done based on
+ ;; register pressure. For generating fma4, a cost model
+ ;; based on register pressure is required. Till then,
+ ;; fma4 instruction is disabled for targets that implement
+ ;; both fma and fma4 instruction sets.
(eq_attr "isa" "fma4")
(symbol_ref "TARGET_FMA4 && !TARGET_FMA")
]
Index: gcc/config/i386/driver-i386.c
===================================================================
--- gcc/config/i386/driver-i386.c (revision 190830)
+++ gcc/config/i386/driver-i386.c (working copy)
@@ -483,8 +483,6 @@
has_abm = ecx & bit_ABM;
has_lwp = ecx & bit_LWP;
has_fma4 = ecx & bit_FMA4;
- if (vendor == SIG_AMD && has_fma4 && has_fma)
- has_fma4 = 0;
has_xop = ecx & bit_XOP;
has_tbm = ecx & bit_TBM;
has_lzcnt = ecx & bit_LZCNT;
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c (revision 190830)
+++ gcc/config/i386/i386.c (working copy)
@@ -3164,7 +3164,7 @@
{"bdver2", PROCESSOR_BDVER2, CPU_BDVER2,
PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
| PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
- | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX
+ | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_FMA4
| PTA_XOP | PTA_LWP | PTA_BMI | PTA_TBM | PTA_F16C
| PTA_FMA},
{"btver1", PROCESSOR_BTVER1, CPU_GENERIC64,
Regards
Ganesh