This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: ACLE intrinsics: BFloat16 store (vst<n>{q}_bf16) intrinsics for AArch32

From: Delia Burduv <delia dot burduv at arm dot com>
To: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
Cc: "nickc at redhat dot com" <nickc at redhat dot com>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>
Date: Tue, 3 Mar 2020 16:20:50 +0000
Subject: Re: ACLE intrinsics: BFloat16 store (vst<n>{q}_bf16) intrinsics for AArch32
References: <fb8d8bd6-f2ea-9990-617c-1b543d8d07e3@arm.com> <64739707-7f91-103d-2df0-808629e497b5@arm.com> <4abeeb45-5ca5-11f1-c152-23f48f001a99@arm.com> <8b122d23-fc3f-c12e-97e0-cbd87ae3301b@foss.arm.com> <3d87d85e-18f2-06a7-75e1-88f7dcbb454a@arm.com>

Hi,

I made a mistake in the previous patch. This is the latest version. Please let me know if it is ok.


Thanks,
Delia

On 2/21/20 3:18 PM, Delia Burduv wrote:

Hi Kyrill,

The arm_bf16.h header is only used for scalar operations. That is how the aarch64 versions are implemented too.


Thanks,
Delia

On 2/21/20 2:06 PM, Kyrill Tkachov wrote:

Hi Delia,

On 2/19/20 5:25 PM, Delia Burduv wrote:

Hi,

Here is the latest version of the patch. It just has some minor
formatting changes that were brought up by Richard Sandiford in the
AArch64 patches.

Thanks,
Delia

On 1/22/20 5:29 PM, Delia Burduv wrote:
> Ping.
>
> I will change the tests to use the exact input and output registers as
> Richard Sandiford suggested for the AArch64 patches.
>
> On 12/20/19 6:46 PM, Delia Burduv wrote:
>> This patch adds the ARMv8.6 ACLE BFloat16 store intrinsics
>> vst<n>{q}_bf16 as part of the BFloat16 extension.
>> (https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics)
>>
>> The intrinsics are declared in arm_neon.h .
>> A new test is added to check assembler output.
>>
>> This patch depends on the Arm back-end patch.
>> (https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01448.html)
>>
>> Tested for regression on arm-none-eabi and armeb-none-eabi. I don't
>> have commit rights, so if this is ok can someone please commit it for me?
>>
>> gcc/ChangeLog:
>>
>> 2019-11-14  Delia Burduv <delia.burduv@arm.com>
>>
>>      * config/arm/arm_neon.h (bfloat16_t): New typedef.
>>          (bfloat16x4x2_t): New typedef.
>>          (bfloat16x8x2_t): New typedef.
>>          (bfloat16x4x3_t): New typedef.
>>          (bfloat16x8x3_t): New typedef.
>>          (bfloat16x4x4_t): New typedef.
>>          (bfloat16x8x4_t): New typedef.
>>          (vst2_bf16): New.
>>      (vst2q_bf16): New.
>>      (vst3_bf16): New.
>>      (vst3q_bf16): New.
>>      (vst4_bf16): New.
>>      (vst4q_bf16): New.
>>          * config/arm/arm-builtins.c (E_V2BFmode): New mode.
>>          (VAR13): New.
>>          (arm_simd_types[Bfloat16x2_t]): New type.
>>          * config/arm/arm-modes.def (V2BF): New mode.
>>          * config/arm/arm-simd-builtin-types.def
>>          (Bfloat16x2_t): New entry.
>>          * config/arm/arm_neon_builtins.def
>>          (vst2): Changed to VAR13 and added v4bf, v8bf
>>          (vst3): Changed to VAR13 and added v4bf, v8bf
>>          (vst4): Changed to VAR13 and added v4bf, v8bf
>>          * config/arm/iterators.md (VDXBF): New iterator.
>>          (VQ2BF): New iterator.
>>          (V_elem): Added V4BF, V8BF.
>>          (V_sz_elem): Added V4BF, V8BF.
>>          (V_mode_nunits): Added V4BF, V8BF.
>>          (q): Added V4BF, V8BF.
>>          * config/arm/neon.md (vst2): Used new iterators.
>>          (vst3): Used new iterators.
>>          (vst3qa): Used new iterators.
>>          (vst3qb): Used new iterators.
>>          (vst4): Used new iterators.
>>          (vst4qa): Used new iterators.
>>          (vst4qb): Used new iterators.
>>
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2019-11-14  Delia Burduv <delia.burduv@arm.com>
>>
>>      * gcc.target/arm/simd/bf16_vstn_1.c: New test.
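
For readers unfamiliar with the vst<n> family, the following is a minimal portable C sketch of what a two-way interleaving store such as vst2_bf16 does to memory. It is not the arm_neon.h implementation from the patch. The names bf16_bits_t, model_bf16x4_t, model_bf16x4x2_t and model_vst2_bf16 are hypothetical, and raw 16-bit integers stand in for bfloat16_t lanes so the code builds on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: a raw 16-bit pattern models one bfloat16_t lane. */
typedef uint16_t bf16_bits_t;

/* Hypothetical model of a 64-bit vector holding four 16-bit lanes. */
typedef struct model_bf16x4_t
{
  bf16_bits_t lane[4];
} model_bf16x4_t;

/* Same shape as the bfloat16x4x2_t typedef in the patch: two vectors. */
typedef struct model_bf16x4x2_t
{
  model_bf16x4_t val[2];
} model_bf16x4x2_t;

/* Model of a 2-way interleaving store: lane i of val[0] is written to
   ptr[2*i] and lane i of val[1] to ptr[2*i + 1], eight elements in all.  */
static void
model_vst2_bf16 (bf16_bits_t *ptr, model_bf16x4x2_t v)
{
  assert (ptr != NULL);
  for (int i = 0; i < 4; i++)
    for (int j = 0; j < 2; j++)
      ptr[2 * i + j] = v.val[j].lane[i];
}
```

Storing val[0] = {1, 2, 3, 4} and val[1] = {10, 20, 30, 40} through this model leaves the destination holding 1, 10, 2, 20, 3, 30, 4, 40. The vst3 and vst4 variants follow the same pattern with a stride of three and four.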


One thing I just noticed in this and the other arm bfloat16 patches...

diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h

index 3c78f435009ab027f92693d00ab5b40960d5419d..fd81c18948db3a7f6e8e863d32511f75bf950e6a 100644

--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h

@@ -18742,6 +18742,89 @@ vcmlaq_rot270_laneq_f32 (float32x4_t __r, float32x4_t __a, float32x4_t __b,

    return __builtin_neon_vcmla_lane270v4sf (__r, __a, __b, __index);
  }

+#pragma GCC push_options
+#pragma GCC target ("arch=armv8.2-a+bf16")
+
+typedef struct bfloat16x4x2_t
+{
+  bfloat16x4_t val[2];
+} bfloat16x4x2_t;

These should be in a new arm_bf16.h file that gets included in the main arm_neon.h file, right?

I believe the aarch64 versions are implemented that way.

Otherwise the patch looks good to me.
Thanks!
Kyrill


+
+typedef struct bfloat16x8x2_t
+{
+  bfloat16x8_t val[2];
+} bfloat16x8x2_t;
+

Follow-Ups:
- Re: ACLE intrinsics: BFloat16 store (vst<n>{q}_bf16) intrinsics for AArch32
  - From: Delia Burduv
