[PATCH, GCC/ARM] Rewire -mfpu=fp-armv8 as VFPv5 + D32 + DP
Thomas Preudhomme
thomas.preudhomme@foss.arm.com
Fri Jul 14 15:29:00 GMT 2017
Hi Richard,
I've committed the requested change as a separate patch to make it easier to
backport to earlier GCC versions.
The definition of __ARM_FEATURE_NUMERIC_MAXMIN requires
TARGET_ARM_ARCH >= 8 and TARGET_NEON in addition to TARGET_VFP5.
However, the instructions covered by this macro are part of FPv5,
which is also available on the ARMv7E-M architecture. This patch fixes
the macro to check only TARGET_VFP5.
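
To make the difference concrete, here is a minimal sketch of the old and new
conditions. It is only a model: struct target_model and the function names are
hypothetical stand-ins, not GCC's real data structures, and the fields merely
mirror TARGET_ARM_ARCH, TARGET_NEON and TARGET_VFP5.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the target state; not a GCC structure.  */
struct target_model
{
  int arch;   /* models TARGET_ARM_ARCH */
  bool neon;  /* models TARGET_NEON */
  bool vfp5;  /* models TARGET_VFP5 */
};

/* Condition guarding the macro before this fix.  */
bool
maxmin_defined_old (const struct target_model *t)
{
  return t->arch >= 8 && t->neon && t->vfp5;
}

/* Condition guarding the macro after this fix.  */
bool
maxmin_defined_new (const struct target_model *t)
{
  return t->vfp5;
}

/* ARMv7E-M with FPv5 and no NEON: only the new condition holds.  */
void
check_maxmin_model (void)
{
  struct target_model v7em_fpv5 = { 7, false, true };

  assert (!maxmin_defined_old (&v7em_fpv5));
  assert (maxmin_defined_new (&v7em_fpv5));
}
```

For an ARMv8-A target with NEON and FPv5 both conditions agree, so only
targets such as ARMv7E-M with FPv5 should see a change.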
ChangeLog entry is as follows:
*** gcc/ChangeLog ***
* config/arm/arm-c.c (arm_cpu_builtins): Define
__ARM_FEATURE_NUMERIC_MAXMIN solely based on TARGET_VFP5.
Built and confirmed that the macro is now defined when building with
-march=armv7e-m+fpv5 -mfloat-abi=hard.
Best regards,
Thomas
On 14/07/17 15:43, Richard Earnshaw (lists) wrote:
> On 14/07/17 09:20, Thomas Preudhomme wrote:
>> Hi,
>>
>> fp-armv8 is currently defined as a double precision FPv5 with 32 D
>> registers *and* a special FP_ARMv8 bit. However FP for ARMv8 should only
>> bring 32 D registers on top of FPv5-D16 so this FP_ARMv8 bit is
>> spurious. As a consequence, many instruction patterns which are guarded
>> by TARGET_FPU_ARMV8 are unavailable to FPv5-D16 and FPv5-SP-D16.
>>
>> This patch gets rid of TARGET_FPU_ARMV8 and rewires all uses to
>> expressions based on TARGET_VFP5, TARGET_VFPD32 and TARGET_VFP_DOUBLE.
>> It also redefines ISA_FP_ARMv8 to include the D32 capability to
>> distinguish it from FPv5-D16. Finally, it sets the +fp.sp option for
>> ARMv8-R to enable FPv5-SP-D16 (i.e. FP for ARMv8 with single precision
>> only and 16 D registers).
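
The feature layering described above can be sketched with a plain bitmask.
This is a simplified illustration, not the GCC implementation: GCC keeps these
features in a bitmap built from enumerator lists, and the BIT_*, MODEL_* and
fpu_* names below are hypothetical. The rows assumed for fpv5-sp-d16 and
fpv5-d16 are likewise assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-bit features standing in for the isa_bit_* values.  */
enum
{
  BIT_FPv5   = 1u << 0,  /* FPv5 instructions */
  BIT_FP_DBL = 1u << 1,  /* double precision */
  BIT_FP_D32 = 1u << 2   /* 32 D registers */
};

/* Mirror the composition in the patch: D32 implies double precision,
   and FP for ARMv8 is FPv5 plus D32.  */
#define MODEL_FP_D32   (BIT_FP_DBL | BIT_FP_D32)
#define MODEL_FP_ARMv8 (BIT_FPv5 | MODEL_FP_D32)

/* Assumed feature sets for three -mfpu values.  */
static const unsigned fpu_fpv5_sp_d16 = BIT_FPv5;
static const unsigned fpu_fpv5_d16 = BIT_FPv5 | BIT_FP_DBL;
static const unsigned fpu_fp_armv8 = MODEL_FP_ARMv8;

bool has_vfp5 (unsigned isa)   { return (isa & BIT_FPv5) != 0; }
bool has_double (unsigned isa) { return (isa & BIT_FP_DBL) != 0; }
bool has_d32 (unsigned isa)    { return (isa & BIT_FP_D32) != 0; }

/* Every FPv5 variant passes the VFP5 test; only fp-armv8 has D32.  */
void
check_fpu_model (void)
{
  assert (has_vfp5 (fpu_fpv5_sp_d16) && !has_double (fpu_fpv5_sp_d16));
  assert (has_vfp5 (fpu_fpv5_d16) && has_double (fpu_fpv5_d16));
  assert (has_vfp5 (fpu_fp_armv8) && has_d32 (fpu_fp_armv8));
}
```

With no separate FP_ARMv8 bit, a pattern guarded by the FPv5 test alone
becomes available to all three variants in this model, while register-count
and precision requirements are expressed through the D32 and double bits.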
>>
>> ChangeLog entry is as follows:
>>
>> 2017-07-07 Thomas Preud'homme <thomas.preudhomme@arm.com>
>>
>> * config/arm/arm-isa.h (isa_bit_FP_ARMv8): Delete enumerator.
>> (ISA_FP_ARMv8): Define as ISA_FPv5 and ISA_FP_D32.
>> * config/arm/arm-cpus.in (armv8-r): Define fp.sp as enabling FPv5.
>> (fp-armv8): Define it as FP_ARMv8 only.
>> * config/arm/arm.h (TARGET_FPU_ARMV8): Delete.
>> (TARGET_VFP_FP16INST): Define using TARGET_VFP5 rather than
>> TARGET_FPU_ARMV8.
>> * config/arm/arm.c (arm_rtx_costs_internal): Replace checks against
>> TARGET_FPU_ARMV8 by checks against TARGET_VFP5.
>> * config/arm/arm-builtins.c (arm_builtin_vectorized_function): Define
>> first ARM_CHECK_BUILTIN_MODE definition using TARGET_VFP5 rather
>> than TARGET_FPU_ARMV8.
>> * config/arm/arm-c.c (arm_cpu_builtins): Likewise for
>> __ARM_FEATURE_NUMERIC_MAXMIN macro definition.
>> * config/arm/arm.md (cmov<mode>): Condition on TARGET_VFP5 rather than
>> TARGET_FPU_ARMV8.
>> * config/arm/neon.md (neon_vrint): Likewise.
>> (neon_vcvt): Likewise.
>> (neon_<fmaxmin_op><mode>): Likewise.
>> (<fmaxmin><mode>3): Likewise.
>> * config/arm/vfp.md (l<vrint_pattern><su_optab><mode>si2): Likewise.
>> * config/arm/predicates.md (arm_cond_move_operator): Check against
>> TARGET_VFP5 rather than TARGET_FPU_ARMV8 and fix spacing.
>>
>> Testing:
>> * Bootstrapped under ARMv8-A Thumb state and ran testsuite -> no
>> regression
>> * built Spec2000 and Spec2006 with -march=armv8-a+fp16 and compared
>> objdump -> no code generation difference
>>
>> Is this ok for trunk?
>
> OK with changes mentioned below.
>
> R.
>
>>
>> Best regards,
>>
>> Thomas
>>
>> rewire_mfpu_fparmv8.patch
>>
>>
>> diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
>> index 63ee880822c17eda55dd58438d61cbbba333b2c6..7504ed581c63a657a0dff48442633704bd252b2e 100644
>> --- a/gcc/config/arm/arm-builtins.c
>> +++ b/gcc/config/arm/arm-builtins.c
>> @@ -3098,7 +3098,7 @@ arm_builtin_vectorized_function (unsigned int fn, tree type_out, tree type_in)
>> NULL_TREE is returned if no such builtin is available. */
>> #undef ARM_CHECK_BUILTIN_MODE
>> #define ARM_CHECK_BUILTIN_MODE(C) \
>> - (TARGET_FPU_ARMV8 \
>> + (TARGET_VFP5 \
>> && flag_unsafe_math_optimizations \
>> && ARM_CHECK_BUILTIN_MODE_1 (C))
>>
>> diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
>> index a3daa3220a2bc4220dffdb7ca08ca9419bdac425..9178937b6d9e0fe5d0948701390c4cf01f4f8c7d 100644
>> --- a/gcc/config/arm/arm-c.c
>> +++ b/gcc/config/arm/arm-c.c
>> @@ -96,7 +96,7 @@ arm_cpu_builtins (struct cpp_reader* pfile)
>> || TARGET_ARM_ARCH_ISA_THUMB >=2));
>>
>> def_or_undef_macro (pfile, "__ARM_FEATURE_NUMERIC_MAXMIN",
>> - TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_FPU_ARMV8);
>> + TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_VFP5);
>
> This looks wrong (though ACLE is misleading). The MAXMIN property is
> solely defined by having an FPv5 capable FPU.
>
>>
>> def_or_undef_macro (pfile, "__ARM_FEATURE_SIMD32", TARGET_INT_SIMD);
>>
>> diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
>> index f35128acb7d68c6a0592355b9d3d56ee8f826aca..e2ff297aed7514073dbb3bf5ee86964f202e5a14 100644
>> --- a/gcc/config/arm/arm-cpus.in
>> +++ b/gcc/config/arm/arm-cpus.in
>> @@ -389,7 +389,7 @@ begin arch armv8-r
>> option crc add bit_crc32
>> # fp.sp => fp-armv8 (d16); simd => simd + fp-armv8 + d32 + double precision
> Please update comment
>> # note: no fp option for fp-armv8 (d16) + double precision at the moment
>> - option fp.sp add FP_ARMv8
>> + option fp.sp add FPv5
>> option simd add FP_ARMv8 NEON
>> option crypto add FP_ARMv8 CRYPTO
>> option nocrypto remove ALL_CRYPTO
>> @@ -1390,7 +1390,7 @@ begin fpu fpv5-d16
>> end fpu fpv5-d16
>>
>> begin fpu fp-armv8
>> - isa FP_ARMv8 FP_D32
>> + isa FP_ARMv8
>> end fpu fp-armv8
>>
>> begin fpu neon-fp-armv8
>> diff --git a/gcc/config/arm/arm-isa.h b/gcc/config/arm/arm-isa.h
>> index 0d66a0400c517668db023fc66ff43e26d43add51..dbd29eaa52f2007498c2aff6263b8b6c3a70e2c2 100644
>> --- a/gcc/config/arm/arm-isa.h
>> +++ b/gcc/config/arm/arm-isa.h
>> @@ -60,7 +60,6 @@ enum isa_feature
>> isa_bit_VFPv4, /* Vector floating point v4. */
>> isa_bit_FPv5, /* Floating point v5. */
>> isa_bit_lpae, /* ARMv7-A LPAE. */
>> - isa_bit_FP_ARMv8, /* ARMv8 floating-point extension. */
>> isa_bit_neon, /* Advanced SIMD instructions. */
>> isa_bit_fp16conv, /* Conversions to/from fp16 (VFPv3 extension). */
>> isa_bit_fp_dbl, /* Double precision operations supported. */
>> @@ -143,7 +142,7 @@ enum isa_feature
>> default. isa_bit_fp16 is deliberately missing from this list. */
>> #define ISA_ALL_FPU_INTERNAL \
>> isa_bit_VFPv2, isa_bit_VFPv3, isa_bit_VFPv4, isa_bit_FPv5, \
>> - isa_bit_FP_ARMv8, isa_bit_fp16conv, isa_bit_fp_dbl, ISA_ALL_SIMD
>> + isa_bit_fp16conv, isa_bit_fp_dbl, ISA_ALL_SIMD
>>
>> /* Similarly, but including fp16 and other extensions that aren't part of
>> -mfpu support. */
>> @@ -154,10 +153,10 @@ enum isa_feature
>> #define ISA_VFPv3 ISA_VFPv2, isa_bit_VFPv3
>> #define ISA_VFPv4 ISA_VFPv3, isa_bit_VFPv4, isa_bit_fp16conv
>> #define ISA_FPv5 ISA_VFPv4, isa_bit_FPv5
>> -#define ISA_FP_ARMv8 ISA_FPv5, isa_bit_FP_ARMv8
>>
>> #define ISA_FP_DBL isa_bit_fp_dbl
>> #define ISA_FP_D32 ISA_FP_DBL, isa_bit_fp_d32
>> +#define ISA_FP_ARMv8 ISA_FPv5, ISA_FP_D32
>> #define ISA_NEON ISA_FP_D32, isa_bit_neon
>> #define ISA_CRYPTO ISA_NEON, isa_bit_crypto
>>
>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>> index 315622212a5ce10d0c771535fe31f63c3be16444..4f53583cf0219de4329bc64a47a5a42c550ff354 100644
>> --- a/gcc/config/arm/arm.h
>> +++ b/gcc/config/arm/arm.h
>> @@ -196,10 +196,6 @@ extern tree arm_fp16_type_node;
>> /* FPU supports fused-multiply-add operations. */
>> #define TARGET_FMA (bitmap_bit_p (arm_active_target.isa, isa_bit_VFPv4))
>>
>> -/* FPU is ARMv8 compatible. */
>> -#define TARGET_FPU_ARMV8 \
>> - (bitmap_bit_p (arm_active_target.isa, isa_bit_FP_ARMv8))
>> -
>> /* FPU supports Crypto extensions. */
>> #define TARGET_CRYPTO (bitmap_bit_p (arm_active_target.isa, isa_bit_crypto))
>>
>> @@ -216,7 +212,7 @@ extern tree arm_fp16_type_node;
>>
>> /* FPU supports the floating point FP16 instructions for ARMv8.2 and later. */
>> #define TARGET_VFP_FP16INST \
>> - (TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_FPU_ARMV8 && arm_fp16_inst)
>> + (TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP5 && arm_fp16_inst)
>>
>> /* FPU supports the AdvSIMD FP16 instructions for ARMv8.2 and later. */
>> #define TARGET_NEON_FP16INST (TARGET_VFP_FP16INST && TARGET_NEON_RDMA)
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index c6101efd555996a4c6db5eaea0130b0940c4cff8..f59132c3f079d10d9e3d920b61037db2f3144eee 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -10755,7 +10755,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
>> {
>> if (speed_p)
>> *cost += extra_cost->fp[mode == DFmode].widen;
>> - if (!TARGET_FPU_ARMV8
>> + if (!TARGET_VFP5
>> && GET_MODE (XEXP (x, 0)) == HFmode)
>> {
>> /* Pre v8, widening HF->DF is a two-step process, first
>> @@ -10849,7 +10849,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
>> return true;
>> }
>> else if (GET_MODE_CLASS (mode) == MODE_FLOAT
>> - && TARGET_FPU_ARMV8)
>> + && TARGET_VFP5)
>> {
>> if (speed_p)
>> *cost += extra_cost->fp[mode == DFmode].roundint;
>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> index e6e1ac54a850c35807d683804f5294fbef1487ad..049a78edefe9f85c6f84a4ecf0158d559e1d5674 100644
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -7879,7 +7879,7 @@
>> "<F_constraint>")
>> (match_operand:SDF 4 "s_register_operand"
>> "<F_constraint>")))]
>> - "TARGET_HARD_FLOAT && TARGET_FPU_ARMV8 <vfp_double_cond>"
>> + "TARGET_HARD_FLOAT && TARGET_VFP5 <vfp_double_cond>"
>> "*
>> {
>> enum arm_cond_code code = maybe_get_arm_condition_code (operands[1]);
>> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
>> index 33b25ff3c730544b4376bf318400d703c8813a0a..235c46da1a19712e2924d748545474ed991d9f92 100644
>> --- a/gcc/config/arm/neon.md
>> +++ b/gcc/config/arm/neon.md
>> @@ -751,7 +751,7 @@
>> (unspec:VCVTF [(match_operand:VCVTF 1
>> "s_register_operand" "w")]
>> NEON_VRINT))]
>> - "TARGET_NEON && TARGET_FPU_ARMV8"
>> + "TARGET_NEON && TARGET_VFP5"
>> "vrint<nvrint_variant>.f32\\t%<V_reg>0, %<V_reg>1"
>> [(set_attr "type" "neon_fp_round_<V_elem_ch><q>")]
>> )
>> @@ -761,7 +761,7 @@
>> (FIXUORS:<V_cmp_result> (unspec:VCVTF
>> [(match_operand:VCVTF 1 "register_operand" "w")]
>> NEON_VCVT)))]
>> - "TARGET_NEON && TARGET_FPU_ARMV8"
>> + "TARGET_NEON && TARGET_VFP5"
>> "vcvt<nvrint_variant>.<su>32.f32\\t%<V_reg>0, %<V_reg>1"
>> [(set_attr "type" "neon_fp_to_int_<V_elem_ch><q>")
>> (set_attr "predicable" "no")]
>> @@ -2901,7 +2901,7 @@
>> (unspec:VCVTF [(match_operand:VCVTF 1 "s_register_operand" "w")
>> (match_operand:VCVTF 2 "s_register_operand" "w")]
>> VMAXMINFNM))]
>> - "TARGET_NEON && TARGET_FPU_ARMV8"
>> + "TARGET_NEON && TARGET_VFP5"
>> "<fmaxmin_op>.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
>> [(set_attr "type" "neon_fp_minmax_s<q>")]
>> )
>> @@ -2912,7 +2912,7 @@
>> (unspec:VCVTF [(match_operand:VCVTF 1 "s_register_operand" "w")
>> (match_operand:VCVTF 2 "s_register_operand" "w")]
>> VMAXMINFNM))]
>> - "TARGET_NEON && TARGET_FPU_ARMV8"
>> + "TARGET_NEON && TARGET_VFP5"
>> "<fmaxmin_op>.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
>> [(set_attr "type" "neon_fp_minmax_s<q>")]
>> )
>> diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
>> index afb5d6339a8af362384c93bbb46928635073b74b..3e25cd16b29231d53b4cadce3db0fbb3168cd4c5 100644
>> --- a/gcc/config/arm/predicates.md
>> +++ b/gcc/config/arm/predicates.md
>> @@ -350,9 +350,9 @@
>>
>> (define_special_predicate "arm_cond_move_operator"
>> (if_then_else (match_test "arm_restrict_it")
>> - (and (match_test "TARGET_FPU_ARMV8")
>> - (match_operand 0 "arm_vsel_comparison_operator"))
>> - (match_operand 0 "expandable_comparison_operator")))
>> + (and (match_test "TARGET_VFP5")
>> + (match_operand 0 "arm_vsel_comparison_operator"))
>> + (match_operand 0 "expandable_comparison_operator")))
>>
>> (define_special_predicate "noov_comparison_operator"
>> (match_code "lt,ge,eq,ne"))
>> diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
>> index d8f77e2ffe4fdb7c952d6a5ac947d91f89ce259d..23c1d67c9e3707e64a4e206dc62727e4c79ba89c 100644
>> --- a/gcc/config/arm/vfp.md
>> +++ b/gcc/config/arm/vfp.md
>> @@ -1997,7 +1997,7 @@
>> (FIXUORS:SI (unspec:SDF
>> [(match_operand:SDF 1
>> "register_operand" "<F_constraint>")] VCVT)))]
>> - "TARGET_HARD_FLOAT && TARGET_FPU_ARMV8 <vfp_double_cond>"
>> + "TARGET_HARD_FLOAT && TARGET_VFP5 <vfp_double_cond>"
>> "vcvt<vrint_variant>.<su>32.<V_if_elem>\\t%0, %<V_reg>1"
>> [(set_attr "predicable" "no")
>> (set_attr "conds" "unconditional")
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix__ARM_FEATURE_NUMERIC_MAXMIN.patch
Type: text/x-patch
Size: 548 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20170714/c6446e5c/attachment.bin>