This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 4/4] [ARM] Add attribute/pragma target fpu=
- From: Kyrill Tkachov <kyrylo dot tkachov at arm dot com>
- To: Christian Bruel <christian dot bruel at st dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>
- Date: Fri, 18 Sep 2015 10:04:52 +0100
- Subject: Re: [PATCH 4/4] [ARM] Add attribute/pragma target fpu=
- Authentication-results: sourceware.org; auth=none
- References: <55F6D9FF dot 4030600 at st dot com> <55F7F75E dot 4070800 at st dot com>
On 15/09/15 11:47, Christian Bruel wrote:
On 09/14/2015 04:30 PM, Christian Bruel wrote:
Finally, the final part of the patch set does the attribute target
parsing and checking, redefines the preprocessor macros and implements
the inlining rules.
testcases and documentation included.
new version to remove a shadowed remnant piece of code.
> thanks
>
> Christian
>
+ /* OK to inline between different modes.
+ Function with mode specific instructions, e.g using asm,
+ must be explicitely protected with noinline. */
s/explicitely/explicitly/
+ const struct arm_fpu_desc *fpu_desc1
+ = &all_fpus[caller_opts->x_arm_fpu_index];
+ const struct arm_fpu_desc *fpu_desc2
+ = &all_fpus[callee_opts->x_arm_fpu_index];
Please call these caller_fpu and callee_fpu, it's much easier to reason about the inlining rules that way
+
+ /* Can't inline NEON extension if the caller doesn't support it. */
+ if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_NEON)
+ && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_NEON))
+ return false;
+
+ /* Can't inline CRYPTO extension if the caller doesn't support it. */
+ if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_CRYPTO)
+ && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_CRYPTO))
+ return false;
+
We also need to take into account FPU_FL_FP16...
In general what we want is for the callee FPU features to be
a subset of the callers features, similar to the way we handle
the x_aarch64_isa_flags handling in aarch64_can_inline_p from the
aarch64 port. I think that's the way to go here rather than explicitly
writing down a check for each feature.
@@ -242,6 +239,8 @@
/* Update macros. */
gcc_assert (cur_opt->x_target_flags == target_flags);
+ /* This one can be redefined by the pragma without warning. */
+ cpp_undef (parse_in, "__ARM_FP");
arm_cpu_builtins (parse_in);
Could you elaborate why the cpp_undef here?
If you want to undefine __ARM_FP so you can redefine it to a new value
in arm_cpu_builtins then I think you should just undefine it in that function.
diff -ruN gnu_trunk.p3/gcc/gcc/doc/invoke.texi gnu_trunk.p4/gcc/gcc/doc/invoke.texi
--- gnu_trunk.p3/gcc/gcc/doc/invoke.texi 2015-09-10 12:21:00.698911244 +0200
+++ gnu_trunk.p4/gcc/gcc/doc/invoke.texi 2015-09-14 10:27:20.281932581 +0200
@@ -13360,6 +13363,8 @@
floating-point arithmetic (in particular denormal values are treated as
zero), so the use of NEON instructions may lead to a loss of precision.
+You can also set the fpu name at function level by using the @code{target("mfpu=")} function attributes (@pxref{ARM Function Attributes}) or pragmas (@pxref{Function Specific Option Pragmas}).
+
s/"mfpu="/"fpu="
--- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c 1970-01-01 01:00:00.000000000 +0100
+++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c 2015-09-14 16:12:08.449698268 +0200
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3 -mfloat-abi=softfp -ftree-vectorize" } */
+
+void
+f3(int n, int x[], int y[]) {
+ int i;
+ for (i = 0; i < n; ++i)
+ y[i] = x[i] << 3;
+}
+
What if GCC has been configured with --with-fpu=neon?
Then f3 will be compiled assuming NEON. You should add a -mfpu=vfp to the dg-options.