This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH 4/4] [ARM] Add attribute/pragma target fpu=

From: Kyrill Tkachov <kyrylo dot tkachov at arm dot com>
To: Christian Bruel <christian dot bruel at st dot com>
Cc: Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
Date: Thu, 08 Oct 2015 09:52:56 +0100
Subject: Re: [PATCH 4/4] [ARM] Add attribute/pragma target fpu=
Authentication-results: sourceware.org; auth=none
References: <55F6D9FF dot 4030600 at st dot com> <55F7F75E dot 4070800 at st dot com> <55FBD3B4 dot 9050709 at arm dot com> <5600096E dot 4030403 at st dot com>

Hi Christian,

On 21/09/15 14:43, Christian Bruel wrote:

Hi Kyrill,

Thanks for your comments. Answers interleaved and the new patch attached.

On 09/18/2015 11:04 AM, Kyrill Tkachov wrote:

On 15/09/15 11:47, Christian Bruel wrote:

On 09/14/2015 04:30 PM, Christian Bruel wrote:

Finally, the final part of the patch set does the attribute target
parsing and checking, redefines the preprocessor macros and implements
the inlining rules.

testcases and documentation included.

new version to remove a shadowed remnant piece of code.


    > thanks
    >
    > Christian
    >

+  /* OK to inline between different modes.
+     Function with mode specific instructions, e.g using asm,
+     must be explicitely protected with noinline.  */

s/explicitely/explicitly/

thanks

+  const struct arm_fpu_desc *fpu_desc1
+    = &all_fpus[caller_opts->x_arm_fpu_index];
+  const struct arm_fpu_desc *fpu_desc2
+    = &all_fpus[callee_opts->x_arm_fpu_index];

Please call these caller_fpu and callee_fpu, it's much easier to reason about the inlining rules that way

ok

+
+  /* Can't inline NEON extension if the caller doesn't support it.  */
+  if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_NEON)
+      && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_NEON))
+    return false;
+
+  /* Can't inline CRYPTO extension if the caller doesn't support it.  */
+  if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_CRYPTO)
+      && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_CRYPTO))
+    return false;
+

We also need to take into account FPU_FL_FP16...
In general what we want is for the callee FPU features to be
a subset of the callers features, similar to the way we handle
the x_aarch64_isa_flags handling in aarch64_can_inline_p from the
aarch64 port. I think that's the way to go here rather than explicitly
writing down a check for each feature.

ok, with FL_FP16 now,

@@ -242,6 +239,8 @@

          /* Update macros.  */
          gcc_assert (cur_opt->x_target_flags == target_flags);
+      /* This one can be redefined by the pragma without warning.  */
+      cpp_undef (parse_in, "__ARM_FP");
          arm_cpu_builtins (parse_in);

Could you elaborate why the cpp_undef here?
If you want to undefine __ARM_FP so you can redefine it to a new value
in arm_cpu_builtins then I think you should just undefine it in that function.

This is to avoid a warning: "__ARM_FP" redefined when creating a new
pragma scope. (See the test attr-crypto.c).

We cannot call the cpp_undef inside arm_cpu_builtins, because it is also
used for the TARGET_CPU_CPP_BUILTINS hook and then would prevent real
illegitimate redefinitions.

Alternatively, I thought to reset the warn_builtin_macro_redefined flag,
but that doesn't work as the macro is not NODE_BUILTIN (see the
definition of warn_of_redefinition in libcpp).
We might need to change this later : should target macros be marked as
NOTE_BUILTIN ? We can discuss this separately (I can open a defect) as
we have the cpp_undep solution for now, if you agree.


diff -ruN gnu_trunk.p3/gcc/gcc/doc/invoke.texi gnu_trunk.p4/gcc/gcc/doc/invoke.texi
--- gnu_trunk.p3/gcc/gcc/doc/invoke.texi	2015-09-10 12:21:00.698911244 +0200
+++ gnu_trunk.p4/gcc/gcc/doc/invoke.texi	2015-09-14 10:27:20.281932581 +0200
@@ -13360,6 +13363,8 @@
    floating-point arithmetic (in particular denormal values are treated as
    zero), so the use of NEON instructions may lead to a loss of precision.

+You can also set the fpu name at function level by using the @code{target("mfpu=")} function attributes (@pxref{ARM Function Attributes}) or pragmas (@pxref{Function Specific Option Pragmas}).
+

s/"mfpu="/"fpu="

thanks

--- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c	1970-01-01 01:00:00.000000000 +0100
+++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c	2015-09-14 16:12:08.449698268 +0200
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3 -mfloat-abi=softfp -ftree-vectorize" } */
+
+void
+f3(int n, int x[], int y[]) {
+  int i;
+  for (i = 0; i < n; ++i)
+    y[i] = x[i] << 3;
+}
+

What if GCC has been configured with --with-fpu=neon?
Then f3 will be compiled assuming NEON. You should add a -mfpu=vfp to the dg-options.

Ah yes. I've added ((target("fpu=vfp")) instead, since we are testing
the attribute.


2015-05-26  Christian Bruel<christian.bruel@st.com>

	PR target/65837
	* config/arm/arm-c.c (arm_cpu_builtins): Set or reset
	__ARM_FEATURE_CRYPTO, __VFP_FP__, __ARM_NEON__
	(arm_pragma_target_parse): Change check for arm_cpu_builtins.
	undefine __ARM_FP.
	* config/arm/arm.c (arm_can_inline_p): Check FPUs.
	(arm_valid_target_attribute_rec): Handle -mfpu attribute target.
	* doc/invoke.texi (-mfpu=): Mention attribute and pragma.
	* doc/extend.texi (-mfpu=): Describe attribute.

2015-09-14  Christian Bruel<christian.bruel@st.com>

	PR target/65837
	gcc.target/arm/lto/pr65837_0.c
	gcc.target/arm/attr-neon2.c
	gcc.target/arm/attr-neon.c
	gcc.target/arm/attr-neon-builtin-fail.c
	gcc.target/arm/attr-crypto.c

The parts in this patch look ok to me.
However, I think we need some more functionality
In aarch64 we support compiling a file with no simd, including arm_neon.h and using arm_neon.h intrinsics
within functions tagged with simd support.
We want to support such functionality on arm i.e. compile a file with -mfpu=vfp and use arm_neon.h intrinsics
in a function tagged with an fpu=neon attribute.
For that we'd need to wrap the intrinsics in arm_neon.h in appropriate pragmas, like in the aarch64 version of arm_neon.h

Thanks,
Kyrill

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]