This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH 9/9][GCC][Arm] Add ACLE intrinsics for complex multiplication and addition


Hi Tamar,

On 11/12/18 15:46, Tamar Christina wrote:
Hi All,

This patch adds NEON intrinsics and tests for the Armv8.3-a complex
multiplication and add instructions with a rotate along the Argand plane.
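
For readers unfamiliar with the rotations, here is a scalar sketch of what one
(real, imaginary) pair goes through.  It reflects my reading of the architecture
description and is not code from the patch; the type cpair and the functions
model_cmla, model_cadd and model_cmul_acc are hypothetical names used only for
illustration.

```c
#include <assert.h>

/* Hypothetical helper type: one complex number stored as an adjacent
   (real, imaginary) pair, mirroring how the vector lanes are paired.  */
typedef struct { float re, im; } cpair;

/* Scalar model of one complex multiply-accumulate step.  ROT is the
   rotation in degrees: 0, 90, 180 or 270.  */
cpair
model_cmla (cpair acc, cpair a, cpair b, int rot)
{
  assert (rot == 0 || rot == 90 || rot == 180 || rot == 270);
  switch (rot)
    {
    case 0:   acc.re += a.re * b.re; acc.im += a.re * b.im; break;
    case 90:  acc.re -= a.im * b.im; acc.im += a.im * b.re; break;
    case 180: acc.re -= a.re * b.re; acc.im -= a.re * b.im; break;
    default:  acc.re += a.im * b.im; acc.im -= a.im * b.re; break;
    }
  return acc;
}

/* Scalar model of the complex add: A plus B rotated by ROT degrees,
   where ROT is 90 or 270.  */
cpair
model_cadd (cpair a, cpair b, int rot)
{
  cpair r;
  assert (rot == 90 || rot == 270);
  if (rot == 90)
    {
      r.re = a.re - b.im;
      r.im = a.im + b.re;
    }
  else
    {
      r.re = a.re + b.im;
      r.im = a.im - b.re;
    }
  return r;
}

/* A full complex product ACC + A * B is the 0 degree step followed by
   the 90 degree step.  */
cpair
model_cmul_acc (cpair acc, cpair a, cpair b)
{
  return model_cmla (model_cmla (acc, a, b, 0), a, b, 90);
}
```

With a = 1 + 2i and b = 3 + 4i and a zero accumulator, model_cmul_acc gives
-5 + 10i, which is the ordinary complex product.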

The instructions are documented in the Arm ARM [1], and the intrinsics specification
will be published on the Arm website [2].

The lane versions of these instructions are special in that they always select a pair of lanes:
using index 0 means selecting lanes 0 and 1.  Because of this, the range check for the
intrinsics requires special handling.
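
As a rough illustration of why the range check differs, the sketch below treats
the index as counting pairs rather than scalar lanes.  It is not the check in
arm-builtins.c; lane_pair_index_ok and lane_pair_first_lane are hypothetical
names, and the only assumption is that pair i covers lanes 2*i and 2*i+1.

```c
#include <stdbool.h>

/* Hypothetical helper: true when INDEX names a valid complex pair in a
   vector of NUNITS scalar elements.  Pair i covers lanes 2*i and 2*i+1,
   so only NUNITS / 2 indices exist.  */
bool
lane_pair_index_ok (int index, int nunits)
{
  return index >= 0 && index < nunits / 2;
}

/* Hypothetical helper: first scalar lane covered by pair INDEX.  */
int
lane_pair_first_lane (int index)
{
  return 2 * index;
}
```

Under this model a two-element float vector accepts only index 0, while an
eight-element half-float vector accepts indices 0 to 3.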

On Arm, we use the structure of the register file to implement some of the lane intrinsics.
The lane variants of these instructions always select a D register, but the data itself can be
stored in Q registers.  This means that for single-precision complex numbers you are only
allowed to select D[0], but using the register file layout you can get the range 0-1 for lane
indices by selecting between Dn[0] and Dn+1[0].

The same reasoning applies to half-precision complex numbers, except that there the D register
index can be 0 or 1, so you have a total range of 4 complex elements (for a V8HF).
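
The mapping described above can be sketched as follows.  This is only an
illustration of the arithmetic, not the body of neon_vcmla_lane_prepare_operands;
the type dsel and the function select_d_lane are hypothetical, and the sketch
assumes a 64-bit D register with the Q register made up of Dn and Dn+1.

```c
/* Hypothetical result type: which D register to use (as an offset from
   the first D register of the Q register) and which complex pair to
   select inside that D register.  */
typedef struct { int dreg_offset; int dlane; } dsel;

/* Map a pair index within a Q register to a D register and an index
   within it.  ELEM_BITS is the scalar element size, 16 or 32.  */
dsel
select_d_lane (int pair_index, int elem_bits)
{
  /* A 64-bit D register holds one float pair or two half-float pairs.  */
  int pairs_per_d = 64 / (2 * elem_bits);
  dsel s;
  s.dreg_offset = pair_index / pairs_per_d;
  s.dlane = pair_index % pairs_per_d;
  return s;
}
```

For single precision, index 1 becomes Dn+1[0]; for half precision, index 3
becomes Dn+1[1].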


[1] https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
[2] https://developer.arm.com/docs/101028/latest

Bootstrapped and regtested on arm-none-gnueabihf with no issues.

Ok for trunk?


Ok.
Thanks,
Kyrill

Thanks,
Tamar

gcc/ChangeLog:

2018-12-11  Tamar Christina  <tamar.christina@arm.com>

        * config/arm/arm-builtins.c
        (enum arm_type_qualifiers): Add qualifier_lane_pair_index.
        (MAC_LANE_PAIR_QUALIFIERS): New.
        (arm_expand_builtin_args): Use it.
        (arm_expand_builtin_1): Likewise.
        * config/arm/arm-protos.h (neon_vcmla_lane_prepare_operands): New.
        * config/arm/arm.c (neon_vcmla_lane_prepare_operands): New.
        * config/arm/arm-c.c (arm_cpu_builtins): Add __ARM_FEATURE_COMPLEX.
        * config/arm/arm_neon.h:
        (vcadd_rot90_f16): New.
        (vcaddq_rot90_f16): New.
        (vcadd_rot270_f16): New.
        (vcaddq_rot270_f16): New.
        (vcmla_f16): New.
        (vcmlaq_f16): New.
        (vcmla_lane_f16): New.
        (vcmla_laneq_f16): New.
        (vcmlaq_lane_f16): New.
        (vcmlaq_laneq_f16): New.
        (vcmla_rot90_f16): New.
        (vcmlaq_rot90_f16): New.
        (vcmla_rot90_lane_f16): New.
        (vcmla_rot90_laneq_f16): New.
        (vcmlaq_rot90_lane_f16): New.
        (vcmlaq_rot90_laneq_f16): New.
        (vcmla_rot180_f16): New.
        (vcmlaq_rot180_f16): New.
        (vcmla_rot180_lane_f16): New.
        (vcmla_rot180_laneq_f16): New.
        (vcmlaq_rot180_lane_f16): New.
        (vcmlaq_rot180_laneq_f16): New.
        (vcmla_rot270_f16): New.
        (vcmlaq_rot270_f16): New.
        (vcmla_rot270_lane_f16): New.
        (vcmla_rot270_laneq_f16): New.
        (vcmlaq_rot270_lane_f16): New.
        (vcmlaq_rot270_laneq_f16): New.
        (vcadd_rot90_f32): New.
        (vcaddq_rot90_f32): New.
        (vcadd_rot270_f32): New.
        (vcaddq_rot270_f32): New.
        (vcmla_f32): New.
        (vcmlaq_f32): New.
        (vcmla_lane_f32): New.
        (vcmla_laneq_f32): New.
        (vcmlaq_lane_f32): New.
        (vcmlaq_laneq_f32): New.
        (vcmla_rot90_f32): New.
        (vcmlaq_rot90_f32): New.
        (vcmla_rot90_lane_f32): New.
        (vcmla_rot90_laneq_f32): New.
        (vcmlaq_rot90_lane_f32): New.
        (vcmlaq_rot90_laneq_f32): New.
        (vcmla_rot180_f32): New.
        (vcmlaq_rot180_f32): New.
        (vcmla_rot180_lane_f32): New.
        (vcmla_rot180_laneq_f32): New.
        (vcmlaq_rot180_lane_f32): New.
        (vcmlaq_rot180_laneq_f32): New.
        (vcmla_rot270_f32): New.
        (vcmlaq_rot270_f32): New.
        (vcmla_rot270_lane_f32): New.
        (vcmla_rot270_laneq_f32): New.
        (vcmlaq_rot270_lane_f32): New.
        (vcmlaq_rot270_laneq_f32): New.
        * config/arm/arm_neon_builtins.def (vcadd90, vcadd270, vcmla0, vcmla90,
        vcmla180, vcmla270, vcmla_lane0, vcmla_lane90, vcmla_lane180, vcmla_lane270,
        vcmla_laneq0, vcmla_laneq90, vcmla_laneq180, vcmla_laneq270,
        vcmlaq_lane0, vcmlaq_lane90, vcmlaq_lane180, vcmlaq_lane270): New.
        * config/arm/neon.md (neon_vcmla_lane<rot><mode>,
        neon_vcmla_laneq<rot><mode>, neon_vcmlaq_lane<rot><mode>): New.

gcc/testsuite/ChangeLog:

2018-12-11  Tamar Christina  <tamar.christina@arm.com>

        * gcc.target/aarch64/advsimd-intrinsics/vector-complex.c: Add AArch32 regexpr.
        * gcc.target/aarch64/advsimd-intrinsics/vector-complex_f16.c: Likewise.

--

