This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 4/4] [ARM] Add attribute/pragma target fpu=



On 15/09/15 11:47, Christian Bruel wrote:

On 09/14/2015 04:30 PM, Christian Bruel wrote:
Finally, the final part of the patch set does the attribute target
parsing and checking, redefines the preprocessor macros and implements
the inlining rules.

testcases and documentation included.

new version to remove a shadowed remnant piece of code.


  > thanks
  >
  > Christian
  >

+  /* OK to inline between different modes.
+     Function with mode specific instructions, e.g using asm,
+     must be explicitely protected with noinline.  */

s/explicitely/explicitly/


+  const struct arm_fpu_desc *fpu_desc1
+    = &all_fpus[caller_opts->x_arm_fpu_index];
+  const struct arm_fpu_desc *fpu_desc2
+    = &all_fpus[callee_opts->x_arm_fpu_index];

Please call these caller_fpu and callee_fpu, it's much easier to reason about the inlining rules that way

+
+  /* Can't inline NEON extension if the caller doesn't support it.  */
+  if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_NEON)
+      && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_NEON))
+    return false;
+
+  /* Can't inline CRYPTO extension if the caller doesn't support it.  */
+  if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_CRYPTO)
+      && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_CRYPTO))
+    return false;
+

We also need to take into account FPU_FL_FP16...
In general what we want is for the callee FPU features to be
a subset of the callers features, similar to the way we handle
the x_aarch64_isa_flags handling in aarch64_can_inline_p from the
aarch64 port. I think that's the way to go here rather than explicitly
writing down a check for each feature.

@@ -242,6 +239,8 @@
/* Update macros. */
       gcc_assert (cur_opt->x_target_flags == target_flags);
+      /* This one can be redefined by the pragma without warning.  */
+      cpp_undef (parse_in, "__ARM_FP");
       arm_cpu_builtins (parse_in);
Could you elaborate why the cpp_undef here?
If you want to undefine __ARM_FP so you can redefine it to a new value
in arm_cpu_builtins then I think you should just undefine it in that function.


diff -ruN gnu_trunk.p3/gcc/gcc/doc/invoke.texi gnu_trunk.p4/gcc/gcc/doc/invoke.texi
--- gnu_trunk.p3/gcc/gcc/doc/invoke.texi	2015-09-10 12:21:00.698911244 +0200
+++ gnu_trunk.p4/gcc/gcc/doc/invoke.texi	2015-09-14 10:27:20.281932581 +0200
@@ -13360,6 +13363,8 @@
 floating-point arithmetic (in particular denormal values are treated as
 zero), so the use of NEON instructions may lead to a loss of precision.
+You can also set the fpu name at function level by using the @code{target("mfpu=")} function attributes (@pxref{ARM Function Attributes}) or pragmas (@pxref{Function Specific Option Pragmas}).
+

s/"mfpu="/"fpu="


--- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c	1970-01-01 01:00:00.000000000 +0100
+++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c	2015-09-14 16:12:08.449698268 +0200
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3 -mfloat-abi=softfp -ftree-vectorize" } */
+
+void
+f3(int n, int x[], int y[]) {
+  int i;
+  for (i = 0; i < n; ++i)
+    y[i] = x[i] << 3;
+}
+

What if GCC has been configured with --with-fpu=neon?
Then f3 will be compiled assuming NEON. You should add a -mfpu=vfp to the dg-options.




Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]