[PATCH] arm: Add new vector mode macros
Richard Sandiford
richard.sandiford@arm.com
Tue Sep 8 13:44:48 GMT 2020
[ This is related to Dennis's subtraction patch
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553339.html
and the discussion about how the patterns were written. I wanted
to see whether there was a way that we could simplify the current
addition handling that might perhaps make it easier to add other
MVE operations in future. It seemed like one of those situations
in which the most productive thing would be to try it and see,
rather than just describe it in words.
One of the questions Ramana had in the thread above was: why does
MVE not need the flag_unsafe_math_optimizations flag? AIUI the reason
is that MVE honours the FPSCR.FZ flag while SF Advanced SIMD always
flushes to zero. (HF Advanced SIMD honours FPSCR.FZ16 and so also
doesn't need flag_unsafe_math_optimizations.) ]
The AArch32 port now has three vector extensions: iwMMXt, Neon
and MVE. We already have some named expanders that are shared
by all three, and soon we'll need more.
One way of handling this would be to use define_mode_iterators
that specify the condition for each mode. For example,
(V16QI "TARGET_NEON || TARGET_HAVE_MVE")
(V8QI "TARGET_NEON || TARGET_REALLY_IWMXXT")
...
(V2SF "TARGET_NEON && flag_unsafe_math_optimizations")
etc. However, we'll need several mode iterators, and it would
be repetitive to specify the mode condition every time.
This patch therefore introduces per-mode macros that say whether
we can perform general arithmetic on the mode. Initially there are
two sets of macros:
ARM_HAVE_NEON_<MODE>_ARITH
true if Neon can handle general arithmetic on <MODE>
ARM_HAVE_<MODE>_ARITH
true if any vector extension can handle general arithmetic on <MODE>
The macro definitions themselves are undeniably ugly, but hopefully
they're justified by the simplifications they allow.
The patch converts the addition patterns to use this scheme.
Previously there were three copies of the V8HF and V4HF addition
patterns for Neon:
(1) *add<VDQ:mode>3_neon, which provided plus:VnHF even without
TARGET_NEON_FP16INST. This was probably harmless since all the
named patterns had an appropriate guard, but it is possible that
something could have tried to generate the plus directly, such as
by using a REG_EQUAL note to generate a new pattern.
(2) addv8hf3_neon and addv4hf3, which had the correct
TARGET_NEON_FP16INST target condition, but unnecessarily required
flag_unsafe_math_optimizations. Unlike VnSF operations, VnHF
operations do not force flush to zero.
(3) add<VH:mode>3_fp16, which provided plus:VnHF with the
correct conditions (TARGET_NEON_FP16INST, with no
flag_unsafe_math_optimizations test).
The patch in essence renames add<VH:mode>3_fp16 to *add<VH:mode>3_neon
(part of *add<VDQ:mode>3_neon) and removes the other two patterns.
WDYT? Does this look like a way forward?
Tested on arm-linux-gnueabihf and armeb-eabi.
Thanks,
Richard
gcc/
* config/arm/arm.h (ARM_HAVE_NEON_V8QI_ARITH, ARM_HAVE_NEON_V4HI_ARITH)
(ARM_HAVE_NEON_V2SI_ARITH, ARM_HAVE_NEON_V16QI_ARITH): New macros.
(ARM_HAVE_NEON_V8HI_ARITH, ARM_HAVE_NEON_V4SI_ARITH): Likewise.
(ARM_HAVE_NEON_V2DI_ARITH, ARM_HAVE_NEON_V4HF_ARITH): Likewise.
(ARM_HAVE_NEON_V8HF_ARITH, ARM_HAVE_NEON_V2SF_ARITH): Likewise.
(ARM_HAVE_NEON_V4SF_ARITH, ARM_HAVE_V8QI_ARITH, ARM_HAVE_V4HI_ARITH)
(ARM_HAVE_V2SI_ARITH, ARM_HAVE_V16QI_ARITH, ARM_HAVE_V8HI_ARITH)
(ARM_HAVE_V4SI_ARITH, ARM_HAVE_V2DI_ARITH, ARM_HAVE_V4HF_ARITH)
(ARM_HAVE_V2SF_ARITH, ARM_HAVE_V8HF_ARITH, ARM_HAVE_V4SF_ARITH):
Likewise.
* config/arm/iterators.md (VNIM, VNINOTM): Delete.
* config/arm/vec-common.md (add<VNIM:mode>3, addv8hf3)
(add<VNINOTM:mode>3): Replace with...
(add<VDQ:mode>3): ...this new expander.
* config/arm/neon.md (*add<VDQ:mode>3_neon): Use the new
ARM_HAVE_NEON_<MODE>_ARITH macros as the C condition.
(addv8hf3_neon, addv4hf3, add<VFH:mode>3_fp16): Delete in
favor of the above.
(neon_vadd<VH:mode>): Use gen_add<mode>3 instead of
gen_add<mode>3_fp16.
---
gcc/config/arm/arm.h | 41 +++++++++++++++++++++++++++++++
gcc/config/arm/iterators.md | 8 ------
gcc/config/arm/neon.md | 47 ++----------------------------------
gcc/config/arm/vec-common.md | 42 ++++----------------------------
4 files changed, 48 insertions(+), 90 deletions(-)
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 3887c51eebe..3284ae29d7c 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1106,6 +1106,47 @@ extern const int arm_arch_cde_coproc_bits[];
#define VALID_MVE_STRUCT_MODE(MODE) \
((MODE) == TImode || (MODE) == OImode || (MODE) == XImode)
+/* The conditions under which vector modes are supported for general
+ arithmetic using Neon. */
+
+#define ARM_HAVE_NEON_V8QI_ARITH TARGET_NEON
+#define ARM_HAVE_NEON_V4HI_ARITH TARGET_NEON
+#define ARM_HAVE_NEON_V2SI_ARITH TARGET_NEON
+
+#define ARM_HAVE_NEON_V16QI_ARITH TARGET_NEON
+#define ARM_HAVE_NEON_V8HI_ARITH TARGET_NEON
+#define ARM_HAVE_NEON_V4SI_ARITH TARGET_NEON
+#define ARM_HAVE_NEON_V2DI_ARITH TARGET_NEON
+
+/* HF operations have their own flush-to-zero control (FPSCR.FZ16). */
+#define ARM_HAVE_NEON_V4HF_ARITH TARGET_NEON_FP16INST
+#define ARM_HAVE_NEON_V8HF_ARITH TARGET_NEON_FP16INST
+
+/* SF operations always flush to zero, regardless of FPSCR.FZ, so we can
+ only use them for general arithmetic when -funsafe-math-optimizations
+ is in effect. */
+#define ARM_HAVE_NEON_V2SF_ARITH \
+ (TARGET_NEON && flag_unsafe_math_optimizations)
+#define ARM_HAVE_NEON_V4SF_ARITH ARM_HAVE_NEON_V2SF_ARITH
+
+/* The conditions under which vector modes are supported for general
+ arithmetic by any vector extension. */
+
+#define ARM_HAVE_V8QI_ARITH (ARM_HAVE_NEON_V8QI_ARITH || TARGET_REALLY_IWMMXT)
+#define ARM_HAVE_V4HI_ARITH (ARM_HAVE_NEON_V4HI_ARITH || TARGET_REALLY_IWMMXT)
+#define ARM_HAVE_V2SI_ARITH (ARM_HAVE_NEON_V2SI_ARITH || TARGET_REALLY_IWMMXT)
+
+#define ARM_HAVE_V16QI_ARITH (ARM_HAVE_NEON_V16QI_ARITH || TARGET_HAVE_MVE)
+#define ARM_HAVE_V8HI_ARITH (ARM_HAVE_NEON_V8HI_ARITH || TARGET_HAVE_MVE)
+#define ARM_HAVE_V4SI_ARITH (ARM_HAVE_NEON_V4SI_ARITH || TARGET_HAVE_MVE)
+#define ARM_HAVE_V2DI_ARITH ARM_HAVE_NEON_V2DI_ARITH
+
+#define ARM_HAVE_V4HF_ARITH ARM_HAVE_NEON_V4HF_ARITH
+#define ARM_HAVE_V2SF_ARITH ARM_HAVE_NEON_V2SF_ARITH
+
+#define ARM_HAVE_V8HF_ARITH (ARM_HAVE_NEON_V8HF_ARITH || TARGET_HAVE_MVE_FLOAT)
+#define ARM_HAVE_V4SF_ARITH (ARM_HAVE_NEON_V4SF_ARITH || TARGET_HAVE_MVE_FLOAT)
+
/* The register numbers in sequence, for passing to arm_gen_load_multiple. */
extern int arm_regs_in_sequence[];
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 0bc9eba0722..c70e3bc2731 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -66,14 +66,6 @@ (define_mode_iterator VSHFT [V4HI V2SI DI])
;; Integer and float modes supported by Neon and IWMMXT.
(define_mode_iterator VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
-;; Integer and float modes supported by Neon, IWMMXT and MVE, used by
-;; arithmetic epxand patterns.
-(define_mode_iterator VNIM [V16QI V8HI V4SI V4SF])
-
-;; Integer and float modes supported by Neon and IWMMXT but not MVE, used by
-;; arithmetic epxand patterns.
-(define_mode_iterator VNINOTM [V2SI V4HI V8QI V2SF V2DI])
-
;; Integer and float modes supported by Neon, IWMMXT and MVE.
(define_mode_iterator VNIM1 [V16QI V8HI V4SI V4SF V2DI])
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 3e7b51d8ab6..96bf277f501 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -501,7 +501,7 @@ (define_insn "*add<mode>3_neon"
[(set (match_operand:VDQ 0 "s_register_operand" "=w")
(plus:VDQ (match_operand:VDQ 1 "s_register_operand" "w")
(match_operand:VDQ 2 "s_register_operand" "w")))]
- "TARGET_NEON && (!<Is_float_mode> || flag_unsafe_math_optimizations)"
+ "ARM_HAVE_NEON_<MODE>_ARITH"
"vadd.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
[(set (attr "type")
(if_then_else (match_test "<Is_float_mode>")
@@ -509,49 +509,6 @@ (define_insn "*add<mode>3_neon"
(const_string "neon_add<q>")))]
)
-;; As with SFmode, full support for HFmode vector arithmetic is only available
-;; when flag-unsafe-math-optimizations is enabled.
-
-;; Add pattern with modes V8HF and V4HF is split into separate patterns to add
-;; support for standard pattern addv8hf3 in MVE. Following pattern is called
-;; from "addv8hf3" standard pattern inside vec-common.md file.
-
-(define_insn "addv8hf3_neon"
- [(set
- (match_operand:V8HF 0 "s_register_operand" "=w")
- (plus:V8HF
- (match_operand:V8HF 1 "s_register_operand" "w")
- (match_operand:V8HF 2 "s_register_operand" "w")))]
- "TARGET_NEON_FP16INST && flag_unsafe_math_optimizations"
- "vadd.f16\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
- [(set_attr "type" "neon_fp_addsub_s_q")]
-)
-
-(define_insn "addv4hf3"
- [(set
- (match_operand:V4HF 0 "s_register_operand" "=w")
- (plus:V4HF
- (match_operand:V4HF 1 "s_register_operand" "w")
- (match_operand:V4HF 2 "s_register_operand" "w")))]
- "TARGET_NEON_FP16INST && flag_unsafe_math_optimizations"
- "vadd.f16\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
- [(set_attr "type" "neon_fp_addsub_s_q")]
-)
-
-(define_insn "add<mode>3_fp16"
- [(set
- (match_operand:VH 0 "s_register_operand" "=w")
- (plus:VH
- (match_operand:VH 1 "s_register_operand" "w")
- (match_operand:VH 2 "s_register_operand" "w")))]
- "TARGET_NEON_FP16INST"
- "vadd.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
- [(set (attr "type")
- (if_then_else (match_test "<Is_float_mode>")
- (const_string "neon_fp_addsub_s<q>")
- (const_string "neon_add<q>")))]
-)
-
(define_insn "*sub<mode>3_neon"
[(set (match_operand:VDQ 0 "s_register_operand" "=w")
(minus:VDQ (match_operand:VDQ 1 "s_register_operand" "w")
@@ -1837,7 +1794,7 @@ (define_expand "neon_vadd<mode>"
(match_operand:VH 2 "s_register_operand")]
"TARGET_NEON_FP16INST"
{
- emit_insn (gen_add<mode>3_fp16 (operands[0], operands[1], operands[2]));
+ emit_insn (gen_add<mode>3 (operands[0], operands[1], operands[2]));
DONE;
})
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index b7e3619caf4..c3c86c46355 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -81,43 +81,11 @@ (define_expand "movv8hf"
;; patterns separately for Neon, IWMMXT and MVE.
(define_expand "add<mode>3"
- [(set (match_operand:VNIM 0 "s_register_operand")
- (plus:VNIM (match_operand:VNIM 1 "s_register_operand")
- (match_operand:VNIM 2 "s_register_operand")))]
- "(TARGET_NEON && ((<MODE>mode != V2SFmode && <MODE>mode != V4SFmode)
- || flag_unsafe_math_optimizations))
- || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))
- || (TARGET_HAVE_MVE && VALID_MVE_SI_MODE(<MODE>mode))
- || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE(<MODE>mode))"
-{
-})
-
-;; Vector arithmetic. Expanders are blank, then unnamed insns implement
-;; patterns separately for Neon and MVE.
-
-(define_expand "addv8hf3"
- [(set (match_operand:V8HF 0 "s_register_operand")
- (plus:V8HF (match_operand:V8HF 1 "s_register_operand")
- (match_operand:V8HF 2 "s_register_operand")))]
- "(TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE(V8HFmode))
- || (TARGET_NEON_FP16INST && flag_unsafe_math_optimizations)"
-{
- if (TARGET_NEON_FP16INST && flag_unsafe_math_optimizations)
- emit_insn (gen_addv8hf3_neon (operands[0], operands[1], operands[2]));
-})
-
-;; Vector arithmetic. Expanders are blank, then unnamed insns implement
-;; patterns separately for Neon and IWMMXT.
-
-(define_expand "add<mode>3"
- [(set (match_operand:VNINOTM 0 "s_register_operand")
- (plus:VNINOTM (match_operand:VNINOTM 1 "s_register_operand")
- (match_operand:VNINOTM 2 "s_register_operand")))]
- "(TARGET_NEON && ((<MODE>mode != V2SFmode && <MODE>mode != V4SFmode)
- || flag_unsafe_math_optimizations))
- || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
-{
-})
+ [(set (match_operand:VDQ 0 "s_register_operand")
+ (plus:VDQ (match_operand:VDQ 1 "s_register_operand")
+ (match_operand:VDQ 2 "s_register_operand")))]
+ "ARM_HAVE_<MODE>_ARITH"
+)
;; Vector arithmetic. Expanders are blank, then unnamed insns implement
;; patterns separately for IWMMXT and Neon.
--
2.17.1
More information about the Gcc-patches
mailing list