[PATCH 9/9][GCC][Arm] Add ACLE intrinsics for complex multiplication and addition
Tamar Christina
Tamar.Christina@arm.com
Fri Dec 21 11:01:00 GMT 2018
Ping
> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org <gcc-patches-owner@gcc.gnu.org>
> On Behalf Of Tamar Christina
> Sent: Tuesday, December 11, 2018 15:47
> To: gcc-patches@gcc.gnu.org
> Cc: nd <nd@arm.com>; Ramana Radhakrishnan
> <Ramana.Radhakrishnan@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>; nickc@redhat.com; Kyrylo Tkachov
> <Kyrylo.Tkachov@arm.com>
> Subject: [PATCH 9/9][GCC][Arm] Add ACLE intrinsics for complex
> multiplication and addition
>
> Hi All,
>
> This patch adds NEON intrinsics and tests for the Armv8.3-a complex
> multiplication and add instructions with a rotate along the Argand plane.
>
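For readers unfamiliar with the rotations, here is a minimal scalar sketch of what one complex lane computes, as I understand the Armv8.3-a semantics. The type `cplx` and the functions `cmla_model`, `cadd_model` and `cmul_acc_model` are hypothetical names made up for illustration; they are not part of the patch or of arm_neon.h.

```c
/* One complex number stored as an adjacent (real, imaginary) pair of lanes.
   Hypothetical type for illustration only.  */
typedef struct { float re, im; } cplx;

/* Scalar model of one complex lane of the multiply-accumulate.
   rot is 0, 90, 180 or 270; any other value leaves acc unchanged.  */
static cplx cmla_model (cplx acc, cplx a, cplx b, int rot)
{
  switch (rot)
    {
    case 0:   acc.re += a.re * b.re; acc.im += a.re * b.im; break;
    case 90:  acc.re -= a.im * b.im; acc.im += a.im * b.re; break;
    case 180: acc.re -= a.re * b.re; acc.im -= a.re * b.im; break;
    case 270: acc.re += a.im * b.im; acc.im -= a.im * b.re; break;
    }
  return acc;
}

/* Scalar model of the complex add with rotation:
   a + i*b for rot 90, a - i*b for rot 270; other values return a.  */
static cplx cadd_model (cplx a, cplx b, int rot)
{
  cplx r = a;
  if (rot == 90)  { r.re = a.re - b.im; r.im = a.im + b.re; }
  if (rot == 270) { r.re = a.re + b.im; r.im = a.im - b.re; }
  return r;
}

/* A full complex multiply-accumulate acc + a*b is the rotation-0 step
   followed by the rotation-90 step.  */
static cplx cmul_acc_model (cplx acc, cplx a, cplx b)
{
  return cmla_model (cmla_model (acc, a, b, 0), a, b, 90);
}
```

In this model, a full complex product needs two operations (rotations 0 and 90), which is why the intrinsics come in rot-suffixed families.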
> The instructions are documented in the ArmARM[1] and the intrinsics
> specification will be published on the Arm website [2].
>
> The lane versions of these instructions are special in that they always select a
> pair of elements: index 0 selects lanes 0 and 1. Because of this, the range check
> for the intrinsics requires special handling.
>
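The pair-based range check described above can be sketched as follows. This only models the arithmetic; the helper names are hypothetical and this is not the code the patch adds behind qualifier_lane_pair_index.

```c
#include <stdbool.h>

/* Number of complex (real, imaginary) pairs in a vector that has
   nelems scalar lanes.  Hypothetical helper for illustration.  */
static unsigned lane_pair_count (unsigned nelems)
{
  return nelems / 2;
}

/* A lane-pair index is valid when it names a whole pair inside the
   vector, so the bound is nelems / 2 rather than nelems.  */
static bool lane_pair_index_ok (unsigned nelems, unsigned index)
{
  return index < lane_pair_count (nelems);
}

/* First scalar lane covered by a pair index; the pair also covers
   the lane immediately after it.  */
static unsigned lane_pair_first_lane (unsigned index)
{
  return 2 * index;
}
```

Under this model a 4-element vector accepts pair indices 0 and 1, and an 8-element vector accepts 0 to 3.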
> On Arm, we use the structure of the register file to implement some of the
> lane intrinsics. The lane variant of these instructions always selects a D
> register, but the data itself can be stored in Q registers. This means that
> for single-precision complex numbers you are only allowed to select D[0], but
> using the register file layout you can get the range 0-1 for lane indices by
> selecting between Dn[0] and Dn+1[0].
>
> The same reasoning applies to half-float complex numbers, except that there
> the D register index can be 0 or 1, so you have a total range of 4 lane
> indices (for a V8HF).
>
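The mapping from a lane-pair index on a Q register to a D register plus an index within it can be sketched like this. The struct and function names are hypothetical, and this models the arithmetic only; it is not the body of neon_vcmla_lane_prepare_operands.

```c
/* Result of splitting a Q-register lane-pair index: which D register
   (as an offset from Dn) and which complex pair inside that D register.
   Hypothetical type for illustration.  */
typedef struct { unsigned d_offset; unsigned d_index; } d_lane;

/* elem_bits is the scalar element width: 32 for single precision,
   16 for half precision.  A 64-bit D register holds 64 / (2 * elem_bits)
   complex pairs: 1 for f32, 2 for f16.  */
static d_lane split_q_pair_index (unsigned index, unsigned elem_bits)
{
  unsigned pairs_per_d = 64 / (2 * elem_bits);
  d_lane r = { index / pairs_per_d, index % pairs_per_d };
  return r;
}
```

For single precision this maps index 1 to Dn+1[0], and for half precision it maps index 3 to Dn+1[1], which gives the four indices mentioned for V8HF.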
>
> [1] https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
> [2] https://developer.arm.com/docs/101028/latest
>
> Bootstrapped and regtested on arm-none-gnueabihf with no issues.
>
> Ok for trunk?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> 2018-12-11 Tamar Christina <tamar.christina@arm.com>
>
> * config/arm/arm-builtins.c
> (enum arm_type_qualifiers): Add qualifier_lane_pair_index.
> (MAC_LANE_PAIR_QUALIFIERS): New.
> (arm_expand_builtin_args): Use it.
> (arm_expand_builtin_1): Likewise.
> * config/arm/arm-protos.h (neon_vcmla_lane_prepare_operands):
> New.
> * config/arm/arm.c (neon_vcmla_lane_prepare_operands): New.
> * config/arm/arm-c.c (arm_cpu_builtins): Add
> __ARM_FEATURE_COMPLEX.
> * config/arm/arm_neon.h:
> (vcadd_rot90_f16): New.
> (vcaddq_rot90_f16): New.
> (vcadd_rot270_f16): New.
> (vcaddq_rot270_f16): New.
> (vcmla_f16): New.
> (vcmlaq_f16): New.
> (vcmla_lane_f16): New.
> (vcmla_laneq_f16): New.
> (vcmlaq_lane_f16): New.
> (vcmlaq_laneq_f16): New.
> (vcmla_rot90_f16): New.
> (vcmlaq_rot90_f16): New.
> (vcmla_rot90_lane_f16): New.
> (vcmla_rot90_laneq_f16): New.
> (vcmlaq_rot90_lane_f16): New.
> (vcmlaq_rot90_laneq_f16): New.
> (vcmla_rot180_f16): New.
> (vcmlaq_rot180_f16): New.
> (vcmla_rot180_lane_f16): New.
> (vcmla_rot180_laneq_f16): New.
> (vcmlaq_rot180_lane_f16): New.
> (vcmlaq_rot180_laneq_f16): New.
> (vcmla_rot270_f16): New.
> (vcmlaq_rot270_f16): New.
> (vcmla_rot270_lane_f16): New.
> (vcmla_rot270_laneq_f16): New.
> (vcmlaq_rot270_lane_f16): New.
> (vcmlaq_rot270_laneq_f16): New.
> (vcadd_rot90_f32): New.
> (vcaddq_rot90_f32): New.
> (vcadd_rot270_f32): New.
> (vcaddq_rot270_f32): New.
> (vcmla_f32): New.
> (vcmlaq_f32): New.
> (vcmla_lane_f32): New.
> (vcmla_laneq_f32): New.
> (vcmlaq_lane_f32): New.
> (vcmlaq_laneq_f32): New.
> (vcmla_rot90_f32): New.
> (vcmlaq_rot90_f32): New.
> (vcmla_rot90_lane_f32): New.
> (vcmla_rot90_laneq_f32): New.
> (vcmlaq_rot90_lane_f32): New.
> (vcmlaq_rot90_laneq_f32): New.
> (vcmla_rot180_f32): New.
> (vcmlaq_rot180_f32): New.
> (vcmla_rot180_lane_f32): New.
> (vcmla_rot180_laneq_f32): New.
> (vcmlaq_rot180_lane_f32): New.
> (vcmlaq_rot180_laneq_f32): New.
> (vcmla_rot270_f32): New.
> (vcmlaq_rot270_f32): New.
> (vcmla_rot270_lane_f32): New.
> (vcmla_rot270_laneq_f32): New.
> (vcmlaq_rot270_lane_f32): New.
> (vcmlaq_rot270_laneq_f32): New.
> * config/arm/arm_neon_builtins.def (vcadd90, vcadd270, vcmla0, vcmla90,
> vcmla180, vcmla270, vcmla_lane0, vcmla_lane90, vcmla_lane180, vcmla_lane270,
> vcmla_laneq0, vcmla_laneq90, vcmla_laneq180, vcmla_laneq270,
> vcmlaq_lane0, vcmlaq_lane90, vcmlaq_lane180, vcmlaq_lane270): New.
> * config/arm/neon.md (neon_vcmla_lane<rot><mode>,
> neon_vcmla_laneq<rot><mode>, neon_vcmlaq_lane<rot><mode>):
> New.
>
> gcc/testsuite/ChangeLog:
>
> 2018-12-11 Tamar Christina <tamar.christina@arm.com>
>
> * gcc.target/aarch64/advsimd-intrinsics/vector-complex.c: Add
> AArch32 regexpr.
> * gcc.target/aarch64/advsimd-intrinsics/vector-complex_f16.c:
> Likewise.
>
> --