Add a compatible_vector_types_p target hook

Richard Sandiford richard.sandiford@arm.com
Thu Dec 12 15:10:00 GMT 2019


One problem with adding an N-bit vector extension to an existing
architecture is to decide how N-bit vectors should be passed to
functions and returned from functions.  Allowing all N-bit vector
types to be passed in registers breaks backwards compatibility,
since N-bit vectors could be used (and emulated) before the vector
extension was added.  But always passing N-bit vectors on the
stack would be inefficient for things like vector libm functions.

For SVE we took the compromise position of predefining new SVE vector
types that are distinct from all existing vector types, including
GNU-style vectors.  The new types are passed and returned in an
efficient way while existing vector types are passed and returned
in the traditional way.  In the right circumstances, the two types
are inter-convertible.

The SVE types are created using:

      vectype = build_distinct_type_copy (vectype);
      SET_TYPE_STRUCTURAL_EQUALITY (vectype);
      TYPE_ARTIFICIAL (vectype) = 1;

The C frontend maintains this distinction, using VIEW_CONVERT_EXPR
to convert from one type to the other.  However, the distinction can
be lost during gimple, which treats two vector types with the same
mode, number of elements, and element type as equivalent.  And for
most targets that's the right thing to do.

This patch therefore adds a hook that lets the target choose
whether such vector types are indeed equivalent.

Note that the new tests fail for -mabi=ilp32 in the same way as other
ACLE-based tests.  I'm still planning to fix that as a follow-on.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


2019-12-12  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* target.def (compatible_vector_types_p): New target hook.
	* hooks.h (hook_bool_const_tree_const_tree_true): Declare.
	* hooks.c (hook_bool_const_tree_const_tree_true): New function.
	* doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
	* doc/tm.texi: Regenerate.
	* gimple-expr.c: Include target.h.
	(useless_type_conversion_p): Use targetm.compatible_vector_types_p.
	* config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New
	function.
	(TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
	* config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred):
	Use the original predicate if it already has a suitable type.

gcc/testsuite/
	* gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
	* gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.

Index: gcc/target.def
===================================================================
--- gcc/target.def	2019-11-30 18:48:18.531984101 +0000
+++ gcc/target.def	2019-12-12 15:07:43.960415368 +0000
@@ -3411,6 +3411,29 @@ must have move patterns for this mode.",
  hook_bool_mode_false)
 
 DEFHOOK
+(compatible_vector_types_p,
+ "Return true if there is no target-specific reason for treating\n\
+vector types @var{type1} and @var{type2} as distinct types.  The caller\n\
+has already checked for target-independent reasons, meaning that the\n\
+types are known to have the same mode, to have the same number of elements,\n\
+and to have what the caller considers to be compatible element types.\n\
+\n\
+The main reason for defining this hook is to reject pairs of types\n\
+that are handled differently by the target's calling convention.\n\
+For example, when a new @var{N}-bit vector architecture is added\n\
+to a target, the target may want to handle normal @var{N}-bit\n\
+@code{VECTOR_TYPE} arguments and return values in the same way as\n\
+before, to maintain backwards compatibility.  However, it may also\n\
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed\n\
+and returned in a more efficient way.  It is then important to maintain\n\
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new\n\
+architecture-specific ones.\n\
+\n\
+The default implementation returns true, which is correct for most targets.",
+ bool, (const_tree type1, const_tree type2),
+ hook_bool_const_tree_const_tree_true)
+
+DEFHOOK
 (vector_alignment,
  "This hook can be used to define the alignment for a vector of type\n\
 @var{type}, in order to comply with a platform ABI.  The default is to\n\
Index: gcc/hooks.h
===================================================================
--- gcc/hooks.h	2019-11-04 21:13:57.727755548 +0000
+++ gcc/hooks.h	2019-12-12 15:07:43.960415368 +0000
@@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_fal
 extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
 extern bool hook_bool_tree_false (tree);
 extern bool hook_bool_const_tree_false (const_tree);
+extern bool hook_bool_const_tree_const_tree_true (const_tree, const_tree);
 extern bool hook_bool_tree_true (tree);
 extern bool hook_bool_const_tree_true (const_tree);
 extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *);
Index: gcc/hooks.c
===================================================================
--- gcc/hooks.c	2019-11-04 21:13:57.727755548 +0000
+++ gcc/hooks.c	2019-12-12 15:07:43.960415368 +0000
@@ -313,6 +313,12 @@ hook_bool_const_tree_false (const_tree)
 }
 
 bool
+hook_bool_const_tree_const_tree_true (const_tree, const_tree)
+{
+  return true;
+}
+
+bool
 hook_bool_tree_true (tree)
 {
   return true;
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	2019-11-30 18:48:18.523984157 +0000
+++ gcc/doc/tm.texi.in	2019-12-12 15:07:43.956415393 +0000
@@ -3365,6 +3365,8 @@ stack.
 
 @hook TARGET_VECTOR_MODE_SUPPORTED_P
 
+@hook TARGET_COMPATIBLE_VECTOR_TYPES_P
+
 @hook TARGET_ARRAY_MODE
 
 @hook TARGET_ARRAY_MODE_SUPPORTED_P
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	2019-11-30 18:48:18.507984271 +0000
+++ gcc/doc/tm.texi	2019-12-12 15:07:43.952415419 +0000
@@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}.
 must have move patterns for this mode.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P (const_tree @var{type1}, const_tree @var{type2})
+Return true if there is no target-specific reason for treating
+vector types @var{type1} and @var{type2} as distinct types.  The caller
+has already checked for target-independent reasons, meaning that the
+types are known to have the same mode, to have the same number of elements,
+and to have what the caller considers to be compatible element types.
+
+The main reason for defining this hook is to reject pairs of types
+that are handled differently by the target's calling convention.
+For example, when a new @var{N}-bit vector architecture is added
+to a target, the target may want to handle normal @var{N}-bit
+@code{VECTOR_TYPE} arguments and return values in the same way as
+before, to maintain backwards compatibility.  However, it may also
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed
+and returned in a more efficient way.  It is then important to maintain
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new
+architecture-specific ones.
+
+The default implementation returns true, which is correct for most targets.
+@end deftypefn
+
 @deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
 Return the mode that GCC should use for an array that has
 @var{nelems} elements, with each element having mode @var{mode}.
Index: gcc/gimple-expr.c
===================================================================
--- gcc/gimple-expr.c	2019-10-08 09:23:31.902529513 +0100
+++ gcc/gimple-expr.c	2019-12-12 15:07:43.956415393 +0000
@@ -37,6 +37,7 @@ Software Foundation; either version 3, o
 #include "tree-pass.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "target.h"
 
 /* ----- Type related -----  */
 
@@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_ty
 
   /* Recurse for vector types with the same number of subparts.  */
   else if (TREE_CODE (inner_type) == VECTOR_TYPE
-	   && TREE_CODE (outer_type) == VECTOR_TYPE
-	   && TYPE_PRECISION (inner_type) == TYPE_PRECISION (outer_type))
-    return useless_type_conversion_p (TREE_TYPE (outer_type),
-				      TREE_TYPE (inner_type));
+	   && TREE_CODE (outer_type) == VECTOR_TYPE)
+    return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type),
+		      TYPE_VECTOR_SUBPARTS (outer_type))
+	    && useless_type_conversion_p (TREE_TYPE (outer_type),
+					  TREE_TYPE (inner_type))
+	    && targetm.compatible_vector_types_p (inner_type, outer_type));
 
   else if (TREE_CODE (inner_type) == ARRAY_TYPE
 	   && TREE_CODE (outer_type) == ARRAY_TYPE)
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2019-12-10 16:45:56.338226712 +0000
+++ gcc/config/aarch64/aarch64.c	2019-12-12 15:07:43.940415503 +0000
@@ -2120,6 +2120,20 @@ aarch64_fntype_abi (const_tree fntype)
   return default_function_abi;
 }
 
+/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P.  */
+
+static bool
+aarch64_compatible_vector_types_p (const_tree type1, const_tree type2)
+{
+  unsigned int num_zr1 = 0, num_pr1 = 0, num_zr2 = 0, num_pr2 = 0;
+  if (aarch64_sve_argument_p (type1, &num_zr1, &num_pr1)
+      != aarch64_sve_argument_p (type2, &num_zr2, &num_pr2))
+    return false;
+
+  gcc_assert (num_zr1 == num_zr2 && num_pr1 == num_pr2);
+  return true;
+}
+
 /* Return true if we should emit CFI for register REGNO.  */
 
 static bool
@@ -22031,6 +22045,9 @@ #define TARGET_USE_BLOCKS_FOR_CONSTANT_P
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p
 
+#undef TARGET_COMPATIBLE_VECTOR_TYPES_P
+#define TARGET_COMPATIBLE_VECTOR_TYPES_P aarch64_compatible_vector_types_p
+
 #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
 #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
   aarch64_builtin_support_vector_misalignment
Index: gcc/config/aarch64/aarch64-sve-builtins.cc
===================================================================
--- gcc/config/aarch64/aarch64-sve-builtins.cc	2019-12-06 18:22:12.072859530 +0000
+++ gcc/config/aarch64/aarch64-sve-builtins.cc	2019-12-12 15:07:43.936415528 +0000
@@ -2251,9 +2251,13 @@ tree
 gimple_folder::convert_pred (gimple_seq &stmts, tree vectype,
 			     unsigned int argno)
 {
-  tree predtype = truth_type_for (vectype);
   tree pred = gimple_call_arg (call, argno);
-  return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred);
+  if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)),
+		TYPE_VECTOR_SUBPARTS (vectype)))
+    return pred;
+
+  return gimple_build (&stmts, VIEW_CONVERT_EXPR,
+		       truth_type_for (vectype), pred);
 }
 
 /* Return a pointer to the address in a contiguous load or store,
Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c	2019-12-12 15:07:43.972415287 +0000
@@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (float16x16_t);
+void float32_callee (float32x8_t);
+void float64_callee (float64x4_t);
+void int8_callee (int8x32_t);
+void int16_callee (int16x16_t);
+void int32_callee (int32x8_t);
+void int64_callee (int64x4_t);
+void uint8_callee (uint8x32_t);
+void uint16_callee (uint16x16_t);
+void uint32_callee (uint32x8_t);
+void uint64_callee (uint64x4_t);
+
+void
+float16_caller (void)
+{
+  float16_callee (svdup_f16 (1.0));
+}
+
+void
+float32_caller (void)
+{
+  float32_callee (svdup_f32 (2.0));
+}
+
+void
+float64_caller (void)
+{
+  float64_callee (svdup_f64 (3.0));
+}
+
+void
+int8_caller (void)
+{
+  int8_callee (svindex_s8 (0, 1));
+}
+
+void
+int16_caller (void)
+{
+  int16_callee (svindex_s16 (0, 2));
+}
+
+void
+int32_caller (void)
+{
+  int32_callee (svindex_s32 (0, 3));
+}
+
+void
+int64_caller (void)
+{
+  int64_callee (svindex_s64 (0, 4));
+}
+
+void
+uint8_caller (void)
+{
+  uint8_callee (svindex_u8 (1, 1));
+}
+
+void
+uint16_caller (void)
+{
+  uint16_callee (svindex_u16 (1, 2));
+}
+
+void
+uint32_caller (void)
+{
+  uint32_callee (svindex_u32 (1, 3));
+}
+
+void
+uint64_caller (void)
+{
+  uint64_callee (svindex_u64 (1, 4));
+}
+
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c
===================================================================
--- /dev/null	2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c	2019-12-12 15:07:43.972415287 +0000
@@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (svfloat16_t);
+void float32_callee (svfloat32_t);
+void float64_callee (svfloat64_t);
+void int8_callee (svint8_t);
+void int16_callee (svint16_t);
+void int32_callee (svint32_t);
+void int64_callee (svint64_t);
+void uint8_callee (svuint8_t);
+void uint16_callee (svuint16_t);
+void uint32_callee (svuint32_t);
+void uint64_callee (svuint64_t);
+
+void
+float16_caller (float16x16_t arg)
+{
+  float16_callee (arg);
+}
+
+void
+float32_caller (float32x8_t arg)
+{
+  float32_callee (arg);
+}
+
+void
+float64_caller (float64x4_t arg)
+{
+  float64_callee (arg);
+}
+
+void
+int8_caller (int8x32_t arg)
+{
+  int8_callee (arg);
+}
+
+void
+int16_caller (int16x16_t arg)
+{
+  int16_callee (arg);
+}
+
+void
+int32_caller (int32x8_t arg)
+{
+  int32_callee (arg);
+}
+
+void
+int64_caller (int64x4_t arg)
+{
+  int64_callee (arg);
+}
+
+void
+uint8_caller (uint8x32_t arg)
+{
+  uint8_callee (arg);
+}
+
+void
+uint16_caller (uint16x16_t arg)
+{
+  uint16_callee (arg);
+}
+
+void
+uint32_caller (uint32x8_t arg)
+{
+  uint32_callee (arg);
+}
+
+void
+uint64_caller (uint64x4_t arg)
+{
+  uint64_callee (arg);
+}
+
+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */



More information about the Gcc-patches mailing list