[PATCH 6/9][GCC][AArch64] Add Armv8.3-a complex intrinsics

Tamar Christina Tamar.Christina@arm.com
Fri Dec 21 17:59:00 GMT 2018


Hi All,

This updated patch adds NEON intrinsics and tests for the Armv8.3-a complex
multiplication and add instructions with a rotate along the Argand plane.
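
For readers unfamiliar with the rotations, here is a minimal scalar sketch of what one FCMLA or FCADD step computes per complex element, as I understand the architecture pseudocode. The names cplx, fcmla_model, fcadd_model and cmul_model are hypothetical and are not part of the patch; the real instructions use fused multiply-adds on vector registers, which this model ignores.

```c
/* One complex number stored as an adjacent (real, imaginary) pair,
   which is how the instructions view neighbouring vector lanes.  */
typedef struct { float re, im; } cplx;

/* Hypothetical scalar model of one FCMLA step: accumulate into ACC the
   partial product of A and B selected by ROT (0, 90, 180 or 270).
   Any other ROT value leaves ACC unchanged.  */
static cplx
fcmla_model (cplx acc, cplx a, cplx b, int rot)
{
  switch (rot)
    {
    case 0:   acc.re += a.re * b.re; acc.im += a.re * b.im; break;
    case 90:  acc.re -= a.im * b.im; acc.im += a.im * b.re; break;
    case 180: acc.re -= a.re * b.re; acc.im -= a.re * b.im; break;
    case 270: acc.re += a.im * b.im; acc.im -= a.im * b.re; break;
    }
  return acc;
}

/* Hypothetical scalar model of FCADD: A + i*B for ROT 90,
   A - i*B for any other ROT (intended for 270).  */
static cplx
fcadd_model (cplx a, cplx b, int rot)
{
  cplx r;
  if (rot == 90)
    {
      r.re = a.re - b.im;
      r.im = a.im + b.re;
    }
  else
    {
      r.re = a.re + b.im;
      r.im = a.im - b.re;
    }
  return r;
}

/* A full complex multiply is the rotation 0 step followed by the
   rotation 90 step on a zeroed accumulator.  */
static cplx
cmul_model (cplx a, cplx b)
{
  cplx acc = { 0.0f, 0.0f };
  acc = fcmla_model (acc, a, b, 0);
  acc = fcmla_model (acc, a, b, 90);
  return acc;
}
```

With the intrinsics, the same pairing would presumably be a vcmlaq_f32 call followed by a vcmlaq_rot90_f32 call on the same accumulator.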

The instructions are documented in the ArmARM[1] and the intrinsics specification
will be published on the Arm website [2].

The lane versions of these instructions are special in that they always select a pair:
using index 0 means selecting lanes 0 and 1.  Because of this, the range check for the
intrinsics requires special handling.
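
As a rough illustration of the pair-based range check, assuming a helper that only knows the number of scalar lanes in the indexed vector (lane_pair_count and lane_pair_index_ok are hypothetical names, not functions from the patch):

```c
#include <stdbool.h>

/* Number of complex (real, imaginary) pairs in a vector that has
   NUNITS scalar lanes.  Hypothetical helper for illustration only.  */
static int
lane_pair_count (int nunits)
{
  return nunits / 2;
}

/* Return true when INDEX selects a valid pair.  Pair index I covers
   scalar lanes 2*I and 2*I + 1, so the upper bound is half of what an
   ordinary lane index would allow.  */
static bool
lane_pair_index_ok (int nunits, int index)
{
  return index >= 0 && index < lane_pair_count (nunits);
}
```

Under this sketch a two-lane float vector accepts only index 0, a four-lane vector accepts 0 and 1, and an eight-lane half-float vector accepts 0 to 3.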

There are a few complexities with the intrinsics for the laneq variants on AArch64:

1) The architecture does not have a version for V2SF. However, since the instructions always
   select a pair of values, the only valid index for V2SF would have been 0. As such the lane
   versions for V2SF are all mapped to the 3SAME variant of the instructions and not the By element
   variant.

2) Because of point 1 above, the laneq versions of the instruction become tricky. The valid indices are 0 and 1.
   For index 0 we treat it the same as the lane version of this instruction and just pass the lower half of the
   register to the 3SAME instruction.  When index is 1 we extract the upper half of the register and pass that to
   the 3SAME version of the instruction.

3) The architecture forbids the laneq version of the V4HF instruction from having an index greater than 1.  For index 0-1
   we do no extra work. For index 2-3 we extract the upper parts of the register and pass that to the instruction it would
   have normally used, and re-map the index into a range of 0-1.
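
The laneq handling described in the list above can be summarised with a small decision sketch. This is not the expander code from the patch; elem_kind, laneq_plan and plan_laneq are hypothetical names, and the index is assumed to have already passed the pair range check.

```c
#include <stdbool.h>

/* Element type of the 64-bit result: half float (V4HF) or float (V2SF).  */
typedef enum { ELEM_F16, ELEM_F32 } elem_kind;

/* What to do with the 128-bit indexed operand of a laneq intrinsic.  */
typedef struct
{
  bool use_high_half; /* Extract the upper 64 bits before use.  */
  bool by_element;    /* true: by-element form, false: 3SAME form.  */
  int index;          /* Re-mapped pair index (0 for the 3SAME form).  */
} laneq_plan;

/* Hypothetical planner.  INDEX is the pair index given by the user:
   0-1 for ELEM_F32 and 0-3 for ELEM_F16.  */
static laneq_plan
plan_laneq (elem_kind kind, int index)
{
  laneq_plan p;
  if (kind == ELEM_F32)
    {
      /* No by-element V2SF form: pick a half and use 3SAME.  */
      p.use_high_half = (index == 1);
      p.by_element = false;
      p.index = 0;
    }
  else
    {
      /* V4HF by-element form only accepts 0-1: for 2-3 take the
	 upper half and re-map the index into 0-1.  */
      p.use_high_half = (index >= 2);
      p.by_element = true;
      p.index = index >= 2 ? index - 2 : index;
    }
  return p;
}
```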

[1] https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
[2] https://developer.arm.com/docs/101028/latest

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
Additional runtime checks were done but are not posted with the patch.

Ok for trunk?

Thanks,
Tamar

gcc/ChangeLog:

2018-12-22  Tamar Christina  <tamar.christina@arm.com>

	* config/aarch64/aarch64-builtins.c (enum aarch64_type_qualifiers): Add qualifier_lane_pair_index.
	(emit-rtl.h): Include.
	(TYPES_QUADOP_LANE_PAIR): New.
	(aarch64_simd_expand_args): Use it.
	(aarch64_simd_expand_builtin): Likewise.
	(AARCH64_SIMD_FCMLA_LANEQ_BUILTINS, aarch64_fcmla_laneq_builtin_datum): New.
	(FCMLA_LANEQ_BUILTIN, AARCH64_SIMD_FCMLA_LANEQ_BUILTIN_BASE,
	AARCH64_SIMD_FCMLA_LANEQ_BUILTINS, aarch64_fcmla_lane_builtin_data,
	aarch64_init_fcmla_laneq_builtins): New.
	(aarch64_init_builtins): Add aarch64_init_fcmla_laneq_builtins.
	(aarch64_expand_builtin): Add AARCH64_SIMD_BUILTIN_FCMLA_LANEQ0_V2SF,
	AARCH64_SIMD_BUILTIN_FCMLA_LANEQ90_V2SF, AARCH64_SIMD_BUILTIN_FCMLA_LANEQ180_V2SF,
	AARCH64_SIMD_BUILTIN_FCMLA_LANEQ270_V2SF, AARCH64_SIMD_BUILTIN_FCMLA_LANEQ0_V4HF,
	AARCH64_SIMD_BUILTIN_FCMLA_LANEQ90_V4HF, AARCH64_SIMD_BUILTIN_FCMLA_LANEQ180_V4HF,
	AARCH64_SIMD_BUILTIN_FCMLA_LANEQ270_V4HF.
	* config/aarch64/iterators.md (FCMLA_maybe_lane): New.
	* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Add __ARM_FEATURE_COMPLEX.
	* config/aarch64/aarch64-simd-builtins.def (fcadd90, fcadd270, fcmla0, fcmla90,
	fcmla180, fcmla270, fcmla_lane0, fcmla_lane90, fcmla_lane180, fcmla_lane270,
	fcmla_laneq0, fcmla_laneq90, fcmla_laneq180, fcmla_laneq270,
	fcmlaq_lane0, fcmlaq_lane90, fcmlaq_lane180, fcmlaq_lane270): New.
	* config/aarch64/aarch64-simd.md (aarch64_fcmla_lane<rot><mode>,
	aarch64_fcmla_laneq<rot>v4hf, aarch64_fcmlaq_lane<rot><mode>): New.
	* config/aarch64/arm_neon.h:
	(vcadd_rot90_f16): New.
	(vcaddq_rot90_f16): New.
	(vcadd_rot270_f16): New.
	(vcaddq_rot270_f16): New.
	(vcmla_f16): New.
	(vcmlaq_f16): New.
	(vcmla_lane_f16): New.
	(vcmla_laneq_f16): New.
	(vcmlaq_lane_f16): New.
	(vcmlaq_rot90_lane_f16): New.
	(vcmla_rot90_laneq_f16): New.
	(vcmla_rot90_lane_f16): New.
	(vcmlaq_rot90_f16): New.
	(vcmla_rot90_f16): New.
	(vcmlaq_laneq_f16): New.
	(vcmla_rot180_laneq_f16): New.
	(vcmla_rot180_lane_f16): New.
	(vcmlaq_rot180_f16): New.
	(vcmla_rot180_f16): New.
	(vcmlaq_rot90_laneq_f16): New.
	(vcmlaq_rot270_laneq_f16): New.
	(vcmlaq_rot270_lane_f16): New.
	(vcmla_rot270_laneq_f16): New.
	(vcmlaq_rot270_f16): New.
	(vcmla_rot270_f16): New.
	(vcmlaq_rot180_laneq_f16): New.
	(vcmlaq_rot180_lane_f16): New.
	(vcmla_rot270_lane_f16): New.
	(vcadd_rot90_f32): New.
	(vcaddq_rot90_f32): New.
	(vcaddq_rot90_f64): New.
	(vcadd_rot270_f32): New.
	(vcaddq_rot270_f32): New.
	(vcaddq_rot270_f64): New.
	(vcmla_f32): New.
	(vcmlaq_f32): New.
	(vcmlaq_f64): New.
	(vcmla_lane_f32): New.
	(vcmla_laneq_f32): New.
	(vcmlaq_lane_f32): New.
	(vcmlaq_laneq_f32): New.
	(vcmla_rot90_f32): New.
	(vcmlaq_rot90_f32): New.
	(vcmlaq_rot90_f64): New.
	(vcmla_rot90_lane_f32): New.
	(vcmla_rot90_laneq_f32): New.
	(vcmlaq_rot90_lane_f32): New.
	(vcmlaq_rot90_laneq_f32): New.
	(vcmla_rot180_f32): New.
	(vcmlaq_rot180_f32): New.
	(vcmlaq_rot180_f64): New.
	(vcmla_rot180_lane_f32): New.
	(vcmla_rot180_laneq_f32): New.
	(vcmlaq_rot180_lane_f32): New.
	(vcmlaq_rot180_laneq_f32): New.
	(vcmla_rot270_f32): New.
	(vcmlaq_rot270_f32): New.
	(vcmlaq_rot270_f64): New.
	(vcmla_rot270_lane_f32): New.
	(vcmla_rot270_laneq_f32): New.
	(vcmlaq_rot270_lane_f32): New.
	(vcmlaq_rot270_laneq_f32): New.

gcc/testsuite/ChangeLog:

2018-12-22  Tamar Christina  <tamar.christina@arm.com>

	* gcc.target/aarch64/advsimd-intrinsics/vector-complex.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vector-complex_f16.c: New test.

> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org <gcc-patches-owner@gcc.gnu.org>
> On Behalf Of Tamar Christina
> Sent: Tuesday, December 11, 2018 15:47
> To: gcc-patches@gcc.gnu.org
> Cc: nd <nd@arm.com>; James Greenhalgh <James.Greenhalgh@arm.com>;
> Richard Earnshaw <Richard.Earnshaw@arm.com>; Marcus Shawcroft
> <Marcus.Shawcroft@arm.com>
> Subject: [PATCH 6/9][GCC][AArch64] Add Armv8.3-a complex intrinsics
> 
> Hi All,
> 
> This patch adds NEON intrinsics and tests for the Armv8.3-a complex
> multiplication and add instructions with a rotate along the Argand plane.
> 
> The instructions are documented in the ArmARM[1] and the intrinsics
> specification will be published on the Arm website [2].
> 
> The lane versions of these instructions are special in that they always select a
> pair: using index 0 means selecting lanes 0 and 1.  Because of this, the range check
> for the intrinsics requires special handling.
> 
> [1] https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
> [2] https://developer.arm.com/docs/101028/latest
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 2018-12-11  Tamar Christina  <tamar.christina@arm.com>
> 
> 	* config/aarch64/aarch64-builtins.c (enum aarch64_type_qualifiers):
> Add qualifier_lane_pair_index.
> 	(TYPES_QUADOP_LANE_PAIR): New.
> 	(aarch64_simd_expand_args): Use it.
> 	(aarch64_simd_expand_builtin): Likewise.
> 	* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Add
> __ARM_FEATURE_COMPLEX.
> 	* config/aarch64/aarch64-simd-builtins.def (fcadd90, fcadd270,
> fcmla0, fcmla90,
> 	fcmla180, fcmla270, fcmla_lane0, fcmla_lane90, fcmla_lane180,
> fcmla_lane270,
> 	fcmla_laneq0, fcmla_laneq90, fcmla_laneq180, fcmla_laneq270,
> 	fcmlaq_lane0, fcmlaq_lane90, fcmlaq_lane180, fcmlaq_lane270):
> New.
> 	* config/aarch64/aarch64-simd.md
> (aarch64_fcmla_lane<rot><mode>,
> 	aarch64_fcmla_laneq<rot><mode>,
> aarch64_fcmlaq_lane<rot><mode>): New.
> 	* config/aarch64/arm_neon.h:
> 	(vcadd_rot90_f16): New.
> 	(vcaddq_rot90_f16): New.
> 	(vcadd_rot270_f16): New.
> 	(vcaddq_rot270_f16): New.
> 	(vcmla_f16): New.
> 	(vcmlaq_f16): New.
> 	(vcmla_lane_f16): New.
> 	(vcmla_laneq_f16): New.
> 	(vcmlaq_lane_f16): New.
> 	(vcmlaq_rot90_lane_f16): New.
> 	(vcmla_rot90_laneq_f16): New.
> 	(vcmla_rot90_lane_f16): New.
> 	(vcmlaq_rot90_f16): New.
> 	(vcmla_rot90_f16): New.
> 	(vcmlaq_laneq_f16): New.
> 	(vcmla_rot180_laneq_f16): New.
> 	(vcmla_rot180_lane_f16): New.
> 	(vcmlaq_rot180_f16): New.
> 	(vcmla_rot180_f16): New.
> 	(vcmlaq_rot90_laneq_f16): New.
> 	(vcmlaq_rot270_laneq_f16): New.
> 	(vcmlaq_rot270_lane_f16): New.
> 	(vcmla_rot270_laneq_f16): New.
> 	(vcmlaq_rot270_f16): New.
> 	(vcmla_rot270_f16): New.
> 	(vcmlaq_rot180_laneq_f16): New.
> 	(vcmlaq_rot180_lane_f16): New.
> 	(vcmla_rot270_lane_f16): New.
> 	(vcadd_rot90_f32): New.
> 	(vcaddq_rot90_f32): New.
> 	(vcaddq_rot90_f64): New.
> 	(vcadd_rot270_f32): New.
> 	(vcaddq_rot270_f32): New.
> 	(vcaddq_rot270_f64): New.
> 	(vcmla_f32): New.
> 	(vcmlaq_f32): New.
> 	(vcmlaq_f64): New.
> 	(vcmla_lane_f32): New.
> 	(vcmla_laneq_f32): New.
> 	(vcmlaq_lane_f32): New.
> 	(vcmlaq_laneq_f32): New.
> 	(vcmla_rot90_f32): New.
> 	(vcmlaq_rot90_f32): New.
> 	(vcmlaq_rot90_f64): New.
> 	(vcmla_rot90_lane_f32): New.
> 	(vcmla_rot90_laneq_f32): New.
> 	(vcmlaq_rot90_lane_f32): New.
> 	(vcmlaq_rot90_laneq_f32): New.
> 	(vcmla_rot180_f32): New.
> 	(vcmlaq_rot180_f32): New.
> 	(vcmlaq_rot180_f64): New.
> 	(vcmla_rot180_lane_f32): New.
> 	(vcmla_rot180_laneq_f32): New.
> 	(vcmlaq_rot180_lane_f32): New.
> 	(vcmlaq_rot180_laneq_f32): New.
> 	(vcmla_rot270_f32): New.
> 	(vcmlaq_rot270_f32): New.
> 	(vcmlaq_rot270_f64): New.
> 	(vcmla_rot270_lane_f32): New.
> 	(vcmla_rot270_laneq_f32): New.
> 	(vcmlaq_rot270_lane_f32): New.
> 	(vcmlaq_rot270_laneq_f32): New.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-12-11  Tamar Christina  <tamar.christina@arm.com>
> 
> 	* gcc.target/aarch64/advsimd-intrinsics/vector-complex.c: New test.
> 	* gcc.target/aarch64/advsimd-intrinsics/vector-complex_f16.c: New
> test.
> 
> --
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rb10281.patch
Type: text/x-diff
Size: 55315 bytes
Desc: rb10281.patch
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20181221/f442ba15/attachment.bin>

