This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 4/4] [ARM] Add attribute/pragma target fpu=


Hi Kyrill,

Thanks for your comments. Answers interleaved and the new patch attached.

On 09/18/2015 11:04 AM, Kyrill Tkachov wrote:

On 15/09/15 11:47, Christian Bruel wrote:

On 09/14/2015 04:30 PM, Christian Bruel wrote:
Finally, the final part of the patch set does the attribute target
parsing and checking, redefines the preprocessor macros and implements
the inlining rules.

testcases and documentation included.

new version to remove a shadowed remnant piece of code.


   > thanks
   >
   > Christian
   >

+  /* OK to inline between different modes.
+     Function with mode specific instructions, e.g using asm,
+     must be explicitely protected with noinline.  */

s/explicitely/explicitly/


thanks


+  const struct arm_fpu_desc *fpu_desc1
+    = &all_fpus[caller_opts->x_arm_fpu_index];
+  const struct arm_fpu_desc *fpu_desc2
+    = &all_fpus[callee_opts->x_arm_fpu_index];

Please call these caller_fpu and callee_fpu, it's much easier to reason about the inlining rules that way

ok


+
+  /* Can't inline NEON extension if the caller doesn't support it.  */
+  if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_NEON)
+      && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_NEON))
+    return false;
+
+  /* Can't inline CRYPTO extension if the caller doesn't support it.  */
+  if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_CRYPTO)
+      && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_CRYPTO))
+    return false;
+

We also need to take into account FPU_FL_FP16...
In general what we want is for the callee FPU features to be
a subset of the callers features, similar to the way we handle
the x_aarch64_isa_flags handling in aarch64_can_inline_p from the
aarch64 port. I think that's the way to go here rather than explicitly
writing down a check for each feature.

ok, with FL_FP16 now,


@@ -242,6 +239,8 @@

         /* Update macros.  */
         gcc_assert (cur_opt->x_target_flags == target_flags);
+      /* This one can be redefined by the pragma without warning.  */
+      cpp_undef (parse_in, "__ARM_FP");
         arm_cpu_builtins (parse_in);

Could you elaborate why the cpp_undef here?
If you want to undefine __ARM_FP so you can redefine it to a new value
in arm_cpu_builtins then I think you should just undefine it in that function.

This is to avoid a warning: "__ARM_FP" redefined when creating a new pragma scope. (See the test attr-crypto.c).

We cannot call the cpp_undef inside arm_cpu_builtins, because it is also used for the TARGET_CPU_CPP_BUILTINS hook and then would prevent real illegitimate redefinitions.

Alternatively, I thought to reset the warn_builtin_macro_redefined flag, but that doesn't work as the macro is not NODE_BUILTIN (see the definition of warn_of_redefinition in libcpp). We might need to change this later : should target macros be marked as NOTE_BUILTIN ? We can discuss this separately (I can open a defect) as we have the cpp_undep solution for now, if you agree.



diff -ruN gnu_trunk.p3/gcc/gcc/doc/invoke.texi gnu_trunk.p4/gcc/gcc/doc/invoke.texi
--- gnu_trunk.p3/gcc/gcc/doc/invoke.texi	2015-09-10 12:21:00.698911244 +0200
+++ gnu_trunk.p4/gcc/gcc/doc/invoke.texi	2015-09-14 10:27:20.281932581 +0200
@@ -13360,6 +13363,8 @@
   floating-point arithmetic (in particular denormal values are treated as
   zero), so the use of NEON instructions may lead to a loss of precision.

+You can also set the fpu name at function level by using the @code{target("mfpu=")} function attributes (@pxref{ARM Function Attributes}) or pragmas (@pxref{Function Specific Option Pragmas}).
+

s/"mfpu="/"fpu="


thanks


--- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c	1970-01-01 01:00:00.000000000 +0100
+++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c	2015-09-14 16:12:08.449698268 +0200
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3 -mfloat-abi=softfp -ftree-vectorize" } */
+
+void
+f3(int n, int x[], int y[]) {
+  int i;
+  for (i = 0; i < n; ++i)
+    y[i] = x[i] << 3;
+}
+

What if GCC has been configured with --with-fpu=neon?
Then f3 will be compiled assuming NEON. You should add a -mfpu=vfp to the dg-options.

Ah yes. I've added ((target("fpu=vfp")) instead, since we are testing the attribute.



2015-05-26  Christian Bruel  <christian.bruel@st.com>

	PR target/65837
	* config/arm/arm-c.c (arm_cpu_builtins): Set or reset
	__ARM_FEATURE_CRYPTO, __VFP_FP__, __ARM_NEON__
	(arm_pragma_target_parse): Change check for arm_cpu_builtins.
	undefine __ARM_FP.
	* config/arm/arm.c (arm_can_inline_p): Check FPUs.
	(arm_valid_target_attribute_rec): Handle -mfpu attribute target.
	* doc/invoke.texi (-mfpu=): Mention attribute and pragma.
	* doc/extend.texi (-mfpu=): Describe attribute.

2015-09-14  Christian Bruel  <christian.bruel@st.com>

	PR target/65837
	gcc.target/arm/lto/pr65837_0.c
	gcc.target/arm/attr-neon2.c
	gcc.target/arm/attr-neon.c
	gcc.target/arm/attr-neon-builtin-fail.c
	gcc.target/arm/attr-crypto.c

diff -ruN gnu_trunk.p3/gcc/gcc/config/arm/arm.c gnu_trunk.p4/gcc/gcc/config/arm/arm.c
--- gnu_trunk.p3/gcc/gcc/config/arm/arm.c	2015-09-21 14:07:39.218566954 +0200
+++ gnu_trunk.p4/gcc/gcc/config/arm/arm.c	2015-09-21 13:36:36.242397513 +0200
@@ -29789,11 +29788,36 @@
 /* Hook to determine if one function can safely inline another.  */
 
 static bool
-arm_can_inline_p (tree caller ATTRIBUTE_UNUSED, tree callee ATTRIBUTE_UNUSED)
+arm_can_inline_p (tree caller, tree callee)
 {
-  /* Overidde default hook: Always OK to inline between different modes. 
-     Function with mode specific instructions, e.g using asm, must be explicitely 
-     protected with noinline.  */
+  tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
+  tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
+
+  struct cl_target_option *caller_opts
+	= TREE_TARGET_OPTION (caller_tree ? caller_tree
+					   : target_option_default_node);
+
+  struct cl_target_option *callee_opts
+	= TREE_TARGET_OPTION (callee_tree ? callee_tree
+					   : target_option_default_node);
+
+  const struct arm_fpu_desc *caller_fpu
+    = &all_fpus[caller_opts->x_arm_fpu_index];
+  const struct arm_fpu_desc *callee_fpu
+    = &all_fpus[callee_opts->x_arm_fpu_index];
+
+  /* Callee's fpu features should be a subset of the caller's.  */
+  if ((caller_fpu->features & callee_fpu->features) != callee_fpu->features)
+    return false;
+
+  /* Need same model and regs.  */
+  if (callee_fpu->model != caller_fpu->model
+      || callee_fpu->regs != callee_fpu->regs)
+    return false;
+
+  /* OK to inline between different modes.
+     Function with mode specific instructions, e.g using asm,
+     must be explicitly protected with noinline.  */
   return true;
 }
 
@@ -29821,30 +29846,38 @@
     }
 
   char *argstr = ASTRDUP (TREE_STRING_POINTER (args));
-  while (argstr && *argstr != '\0')
+  char *q;
+
+  while ((q = strtok (argstr, ",")) != NULL)
     {
-      while (ISSPACE (*argstr))
-	argstr++;
+      while (ISSPACE (*q)) ++q;
 
-      if (!strcmp (argstr, "thumb"))
-	{
+      argstr = NULL;
+      if (!strncmp (q, "thumb", 5))
 	  opts->x_target_flags |= MASK_THUMB;
-	  arm_option_check_internal (opts);
-	  return true;
-	}
 
-      if (!strcmp (argstr, "arm"))
-	{
+      else if (!strncmp (q, "arm", 3))
 	  opts->x_target_flags &= ~MASK_THUMB;
-	  arm_option_check_internal (opts);
-	  return true;
+
+      else if (!strncmp (q, "fpu=", 4))
+	{
+	  if (! opt_enum_arg_to_value (OPT_mfpu_, q+4,
+				       &opts->x_arm_fpu_index, CL_TARGET))
+	    {
+	      error ("invalid fpu for attribute(target(\"%s\"))", q);
+	      return false;
+	    }
+	}
+      else
+	{
+	  error ("attribute(target(\"%s\")) is unknown", q);
+	  return false;
 	}
 
-      warning (0, "attribute(target(\"%s\")) is unknown", argstr);
-      return false;
+      arm_option_check_internal (opts);
     }
 
-  return false;
+  return true;
 }
 
 /* Return a TARGET_OPTION_NODE tree of the target options listed or NULL.  */
diff -ruN gnu_trunk.p3/gcc/gcc/config/arm/arm-c.c gnu_trunk.p4/gcc/gcc/config/arm/arm-c.c
--- gnu_trunk.p3/gcc/gcc/config/arm/arm-c.c	2015-09-21 14:07:12.186506227 +0200
+++ gnu_trunk.p4/gcc/gcc/config/arm/arm-c.c	2015-09-21 13:46:02.655664904 +0200
@@ -68,8 +68,8 @@
   def_or_undef_macro (pfile, "__ARM_FEATURE_DSP", TARGET_DSP_MULTIPLY);
   def_or_undef_macro (pfile, "__ARM_FEATURE_QBIT", TARGET_ARM_QBIT); 
   def_or_undef_macro (pfile, "__ARM_FEATURE_SAT", TARGET_ARM_SAT);
-  if (TARGET_CRYPTO)
-    builtin_define ("__ARM_FEATURE_CRYPTO");
+  def_or_undef_macro (pfile, "__ARM_FEATURE_CRYPTO", TARGET_CRYPTO);
+
   if (unaligned_access)
     builtin_define ("__ARM_FEATURE_UNALIGNED");
   if (TARGET_CRC32)
@@ -129,8 +129,7 @@
   if (TARGET_SOFT_FLOAT)
     builtin_define ("__SOFTFP__");
 
-  if (TARGET_VFP)
-    builtin_define ("__VFP_FP__");
+  def_or_undef_macro (pfile, "__VFP_FP__", TARGET_VFP);
 	
   if (TARGET_ARM_FP)
     builtin_define_with_int_value ("__ARM_FP", TARGET_ARM_FP);
@@ -141,11 +140,9 @@
   if (TARGET_FMA)
     builtin_define ("__ARM_FEATURE_FMA");
 
-  if (TARGET_NEON)
-    {
-      builtin_define ("__ARM_NEON__");
-      builtin_define ("__ARM_NEON");
-    }
+  def_or_undef_macro (pfile, "__ARM_NEON__", TARGET_NEON);
+  def_or_undef_macro (pfile, "__ARM_NEON", TARGET_NEON);
+
   if (TARGET_NEON_FP)
     builtin_define_with_int_value ("__ARM_NEON_FP", TARGET_NEON_FP);
   
@@ -232,7 +228,7 @@
   gcc_assert (prev_opt);
   gcc_assert (cur_opt);
 
-  if (cur_opt->x_target_flags != prev_opt->x_target_flags)
+  if (cur_opt != prev_opt)
     {
       /* For the definitions, ensure all newly defined macros are considered
 	 as used for -Wunused-macros.  There is no point warning about the
@@ -243,6 +239,8 @@
 
       /* Update macros.  */
       gcc_assert (cur_opt->x_target_flags == target_flags);
+      /* This one can be redefined by the pragma without warning.  */
+      cpp_undef (parse_in, "__ARM_FP");
       arm_cpu_builtins (parse_in);
 
       cpp_opts->warn_unused_macros = saved_warn_unused_macros;
diff -ruN gnu_trunk.p3/gcc/gcc/doc/extend.texi gnu_trunk.p4/gcc/gcc/doc/extend.texi
--- gnu_trunk.p3/gcc/gcc/doc/extend.texi	2015-09-07 13:35:20.777683005 +0200
+++ gnu_trunk.p4/gcc/gcc/doc/extend.texi	2015-09-14 13:58:49.271385001 +0200
@@ -3606,10 +3606,17 @@
 @item arm
 @cindex @code{target("arm")} function attribute, ARM
 Force code generation in the ARM (A32) ISA.
-@end table
 
 Functions from different modes can be inlined in the caller's mode.
 
+@item fpu=
+@cindex @code{target("fpu=")} function attribute, ARM
+Specifies the fpu for which to tune the performance of this function.
+The behavior and permissible arguments are the same as for the @option{-mfpu=}
+command-line option.
+
+@end table
+
 @end table
 
 @node AVR Function Attributes
diff -ruN gnu_trunk.p3/gcc/gcc/doc/invoke.texi gnu_trunk.p4/gcc/gcc/doc/invoke.texi
--- gnu_trunk.p3/gcc/gcc/doc/invoke.texi	2015-09-21 13:35:49.274292268 +0200
+++ gnu_trunk.p4/gcc/gcc/doc/invoke.texi	2015-09-21 13:36:18.798358427 +0200
@@ -13386,6 +13386,8 @@
 floating-point arithmetic (in particular denormal values are treated as
 zero), so the use of NEON instructions may lead to a loss of precision.
 
+You can also set the fpu name at function level by using the @code{target("fpu=")} function attributes (@pxref{ARM Function Attributes}) or pragmas (@pxref{Function Specific Option Pragmas}).
+
 @item -mfp16-format=@var{name}
 @opindex mfp16-format
 Specify the format of the @code{__fp16} half-precision floating-point type.
diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c
--- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c	1970-01-01 01:00:00.000000000 +0100
+++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c	2015-09-14 15:58:24.967898634 +0200
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_crypto_ok } */
+
+#pragma GCC target ("fpu=crypto-neon-fp-armv8")
+
+#ifndef __ARM_FEATURE_CRYPTO
+#error __ARM_FEATURE_CRYPTO not defined.
+#endif
+
+#ifndef __ARM_NEON
+#error __ARM_NEON not defined.
+#endif
+
+#if !defined(__ARM_FP) || (__ARM_FP != 14)
+#error __ARM_FP
+#endif
+
+#include "arm_neon.h"
+
+int
+foo (void)
+{
+  uint32x4_t a = {0xd, 0xe, 0xa, 0xd};
+  uint32x4_t b = {0, 1, 2, 3};
+
+  uint32x4_t res = vsha256su0q_u32 (a, b);
+  return res[0];
+}
+
+#pragma GCC reset_options
+
+/* Check that the FP version is correctly reset.  */
+
+#if !defined(__ARM_FP) || (__ARM_FP != 12)
+#error __ARM_FP
+#endif
+
+/* { dg-final { scan-assembler "sha256su0.32\tq\[0-9\]+, q\[0-9\]+" } } */
diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c
--- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c	1970-01-01 01:00:00.000000000 +0100
+++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c	2015-09-14 15:58:24.967898634 +0200
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O2 -mfloat-abi=softfp -mfpu=vfp" } */
+
+#pragma GCC target ("fpu=neon")
+#include <arm_neon.h>
+
+/* Check that pragma target is used.  */
+int8x8_t 
+my (int8x8_t __a, int8x8_t __b)
+{
+  return __a + __b;
+}
+
+#pragma GCC reset_options
+
+/* Check that command line option is restored.  */
+int8x8_t 
+my1 (int8x8_t __a, int8x8_t __b)
+{
+  return __a + __b;
+}
+
+/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */
+/* { dg-final { scan-assembler-times "\.fpu neon" 1 } } */
+/* { dg-final { scan-assembler "vadd" } } */
+
+
diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c
--- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c	1970-01-01 01:00:00.000000000 +0100
+++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c	2015-09-14 15:58:24.967898634 +0200
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O2 -mfloat-abi=softfp -mfpu=neon" } */
+
+#include <arm_neon.h>
+
+void __attribute__ ((target ("fpu=vfp")))
+foo (uint8x16_t *p)  
+{
+  *p = vmovq_n_u8 (3); /* { dg-error "called from here" } */
+
+}
+
+
+/* { dg-error "inlining failed in call to always_inline" "" { target *-*-* } 0 } */
+
+
+
diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c
--- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c	1970-01-01 01:00:00.000000000 +0100
+++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c	2015-09-21 13:43:45.983359388 +0200
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O2 -mfloat-abi=softfp -ftree-vectorize" } */
+
+/* Verify that neon instructions are emitted once.  */
+void __attribute__ ((target("fpu=neon")))
+ f1(int n, int x[], int y[]) {
+  int i;
+  for (i = 0; i < n; ++i)
+    y[i] = x[i] << 3;
+}
+
+void __attribute__ ((target("fpu=vfp")))
+f3(int n, int x[], int y[]) {
+  int i;
+  for (i = 0; i < n; ++i)
+    y[i] = x[i] << 3;
+}
+
+/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */
+/* { dg-final { scan-assembler-times "\.fpu neon" 1 } } */
+/* { dg-final { scan-assembler-times "vshl" 1 } } */
+
+
+
+
diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c
--- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c	1970-01-01 01:00:00.000000000 +0100
+++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c	2015-09-14 15:58:13.899874587 +0200
@@ -0,0 +1,14 @@
+/* { dg-lto-do run } */
+/* { dg-lto-options {{-flto -mfpu=neon}} } */
+/* { dg-suppress-ld-options {-mfpu=neon} } */
+
+#include "arm_neon.h"
+
+float32x2_t a, b, c, e;
+
+int main()
+{
+  e = __builtin_neon_vmls_lanev2sf (a, b, c, 0);
+  return 0;
+}
+

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]