This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH 3/16][ARM] Add float16x4_t intrinsics


Kyrill Tkachov wrote:
On 07/07/15 17:34, Alan Lawrence wrote:
Kyrill Tkachov wrote:
On 07/07/15 14:09, Kyrill Tkachov wrote:
Hi Alan,

On 07/07/15 13:34, Alan Lawrence wrote:
As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01335.html
For some context, the reference for these is at:
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/IHI0073A_arm_neon_intrinsics_ref.pdf

This patch is ok once you and Charles decide on how to proceed with the two prerequisites.
On second thought, the ACLE document at
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf
says in 12.2.1:
"float16 types are only available when the __fp16 type is defined, i.e. when supported by the hardware"
However, we support __fp16 whenever the user specifies -mfp16-format=ieee or
-mfp16-format=alternative, regardless of whether we have hardware support or not.

(Without hardware support, gcc generates calls to __gnu_f2h_ieee or
__gnu_f2h_alternative instead of vcvtb.f16.f32, and __gnu_h2f_ieee or
__gnu_h2f_alternative instead of vcvtb.f32.f16. However, there is no way to
support __fp16 just using those hardware instructions without caring about which
format is in use.)

Hmmm... In my opinion intrinsics should aim to map to instructions rather than go away and
call library functions, but this is the existing functionality
that current users might depend on :(

Sorry - to clarify: currently we generate __gnu_f2h_ieee / __gnu_h2f_ieee to convert between scalar __fp16 and 'float' values when there is no hardware support. General operations on scalar __fp16 values are performed by converting to float, performing the operations on float, and converting back. The __fp16 type is available and "usable" without hardware support, but only when -mfp16-format is specified.
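
To make the convert / operate / convert-back scheme concrete, here is a minimal portable C sketch for the IEEE format only. The names sketch_h2f, sketch_f2h and sketch_half_add are hypothetical stand-ins: this is not libgcc's implementation of __gnu_h2f_ieee / __gnu_f2h_ieee, and the round-to-nearest-even narrowing is an assumption of the sketch. Half values are carried as raw uint16_t bit patterns so that it builds without any __fp16 support.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a half -> float helper (IEEE binary16 only). */
static float sketch_h2f(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits;
    float f;

    if (exp == 0) {
        if (man == 0) {
            bits = sign;                          /* signed zero */
        } else {
            /* Subnormal half: normalise into a normal float. */
            uint32_t shift = 0;
            while ((man & 0x400u) == 0) { man <<= 1; shift++; }
            man &= 0x3FFu;
            bits = sign | ((127u - 15u + 1u - shift) << 23) | (man << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (man << 13);  /* Inf or NaN */
    } else {
        bits = sign | ((exp + 127u - 15u) << 23) | (man << 13);
    }
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical stand-in for a float -> half helper; rounds to nearest even
   (an assumption of this sketch). */
static uint16_t sketch_f2h(float f)
{
    uint32_t bits, man, half, rem;
    uint16_t sign;
    int32_t exp;

    memcpy(&bits, &f, sizeof bits);
    sign = (uint16_t)((bits >> 16) & 0x8000u);
    exp  = (int32_t)((bits >> 23) & 0xFFu) - 127 + 15;  /* re-biased for half */
    man  = bits & 0x7FFFFFu;

    if (((bits >> 23) & 0xFFu) == 0xFFu)          /* Inf or NaN */
        return (uint16_t)(sign | 0x7C00u | (man ? (0x200u | (man >> 13)) : 0u));
    if (exp >= 0x1F)                              /* too large: infinity */
        return (uint16_t)(sign | 0x7C00u);
    if (exp <= 0) {                               /* subnormal half or zero */
        uint32_t shift, halfway;
        if (exp < -10)
            return sign;                          /* rounds to zero */
        man |= 0x800000u;                         /* restore implicit bit */
        shift   = (uint32_t)(14 - exp);           /* 14..24 */
        half    = man >> shift;
        rem     = man & ((1u << shift) - 1u);
        halfway = 1u << (shift - 1u);
        if (rem > halfway || (rem == halfway && (half & 1u)))
            half++;
        return (uint16_t)(sign | half);
    }
    half = ((uint32_t)exp << 10) | (man >> 13);
    rem  = man & 0x1FFFu;
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
        half++;                                   /* a carry may reach infinity */
    return (uint16_t)(sign | half);
}

/* Scalar half addition: widen, operate in float, narrow again. */
static uint16_t sketch_half_add(uint16_t a, uint16_t b)
{
    return sketch_f2h(sketch_h2f(a) + sketch_h2f(b));
}
```

For example, sketch_half_add(0x3C00, 0x4000) adds the half encodings of 1.0 and 2.0 and yields 0x4200, the encoding of 3.0. The alternative format would need different handling of the top exponent value, which this sketch does not attempt.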

(The existing) intrinsics operating on float16x[48] vectors (converting to/from float32x4) are *not* available without hardware support; these intrinsics *are* available without specifying -mfp16-format.

ACLE (4.1.2) allows toolchains to provide __fp16 when not implemented in HW, even if this is not required.

CC'ing the ARM maintainers and Tejas for an ACLE perspective.
I think that we'd want to gate the definition of __fp16 on hardware availability as well
(the -mfpu option) rather than just arm_fp16_format but I'm not sure of the impact this will have
on existing users.

Sure....but do we require -mfpu *and* -mfp16-format? s/and/or/ ? Do we require -mfp16-format for float16x[48] intrinsics, or allow format-agnostic code (as HW support allows us to!)?
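
As a rough illustration of the two gating options, the sketch below derives a "hardware" flag and a "format" flag from ACLE feature-test macros and combines them both ways. It assumes the compiler defines __ARM_FP (with bit 1 set for half-precision hardware support) and __ARM_FP16_FORMAT_IEEE / __ARM_FP16_FORMAT_ALTERNATIVE as ACLE describes; whether a given GCC version does so is not established in this thread. All SKETCH_* and sketch_* names are hypothetical.

```c
/* Hardware side: roughly what an -mfpu option with fp16 support would give.
   Assumes ACLE's __ARM_FP bitmask, where bit 1 means half precision. */
#if defined(__ARM_FP) && (__ARM_FP & 0x02)
#define SKETCH_HAVE_FP16_HW 1
#else
#define SKETCH_HAVE_FP16_HW 0
#endif

/* Format side: roughly what -mfp16-format=ieee or =alternative would give. */
#if defined(__ARM_FP16_FORMAT_IEEE) || defined(__ARM_FP16_FORMAT_ALTERNATIVE)
#define SKETCH_HAVE_FP16_FORMAT 1
#else
#define SKETCH_HAVE_FP16_FORMAT 0
#endif

/* Option "and": require both hardware support and an explicit format. */
static int sketch_gate_and(int have_hw, int have_format)
{
    return have_hw && have_format;
}

/* Option "or": either one is enough, which permits format-agnostic code
   when only the hardware is present. */
static int sketch_gate_or(int have_hw, int have_format)
{
    return have_hw || have_format;
}

/* Behaviour described earlier in this thread, for comparison:
   scalar __fp16 depends only on the format option ... */
static int sketch_scalar_fp16_today(int have_hw, int have_format)
{
    (void)have_hw;
    return have_format;
}

/* ... while the existing float16x[48] conversion intrinsics depend only on
   hardware support. */
static int sketch_vector_cvt_today(int have_hw, int have_format)
{
    (void)have_format;
    return have_hw;
}
```

Calling these with (SKETCH_HAVE_FP16_HW, SKETCH_HAVE_FP16_FORMAT) shows which policy would expose the types for the current compiler flags. The case that separates the options is hardware present with no -mfp16-format: "and" hides everything, "or" exposes it.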

I don't have very strong opinions as to which way we should go; I merely tried to be consistent with the existing codebase and to support as much code as possible, although I agree I ignored cases where defining functions unexpectedly might cause problems.

Cheers, Alan

