Hi Delia,
On 2/19/20 5:25 PM, Delia Burduv wrote:
Hi,
Here is the latest version of the patch. It just has some minor
formatting changes that were brought up by Richard Sandiford in the
AArch64 patches.
Thanks,
Delia
On 1/22/20 5:29 PM, Delia Burduv wrote:
> Ping.
>
> I will change the tests to use the exact input and output registers as
> Richard Sandiford suggested for the AArch64 patches.
>
> On 12/20/19 6:46 PM, Delia Burduv wrote:
>> This patch adds the ARMv8.6 ACLE BFloat16 store intrinsics
>> vst<n>{q}_bf16 as part of the BFloat16 extension.
>>
>> (https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics)
>>
>> The intrinsics are declared in arm_neon.h .
>> A new test is added to check assembler output.
>>
>> This patch depends on the Arm back-end patch.
>> (https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01448.html)
>>
>> Tested for regression on arm-none-eabi and armeb-none-eabi. I don't
>> have commit rights, so if this is OK, can someone please commit it
>> for me?
>>
>> gcc/ChangeLog:
>>
>> 2019-11-14 Delia Burduv <delia.burduv@arm.com>
>>
>> * config/arm/arm_neon.h (bfloat16_t): New typedef.
>> (bfloat16x4x2_t): New typedef.
>> (bfloat16x8x2_t): New typedef.
>> (bfloat16x4x3_t): New typedef.
>> (bfloat16x8x3_t): New typedef.
>> (bfloat16x4x4_t): New typedef.
>> (bfloat16x8x4_t): New typedef.
>> (vst2_bf16): New.
>> (vst2q_bf16): New.
>> (vst3_bf16): New.
>> (vst3q_bf16): New.
>> (vst4_bf16): New.
>> (vst4q_bf16): New.
>> * config/arm/arm-builtins.c (E_V2BFmode): New mode.
>> (VAR13): New.
>> (arm_simd_types[Bfloat16x2_t]): New type.
>> * config/arm/arm-modes.def (V2BF): New mode.
>> * config/arm/arm-simd-builtin-types.def
>> (Bfloat16x2_t): New entry.
>> * config/arm/arm_neon_builtins.def
>> (vst2): Changed to VAR13 and added v4bf, v8bf.
>> (vst3): Changed to VAR13 and added v4bf, v8bf.
>> (vst4): Changed to VAR13 and added v4bf, v8bf.
>> * config/arm/iterators.md (VDXBF): New iterator.
>> (VQ2BF): New iterator.
>> (V_elem): Added V4BF, V8BF.
>> (V_sz_elem): Added V4BF, V8BF.
>> (V_mode_nunits): Added V4BF, V8BF.
>> (q): Added V4BF, V8BF.
>> * config/arm/neon.md (vst2): Used new iterators.
>> (vst3): Used new iterators.
>> (vst3qa): Used new iterators.
>> (vst3qb): Used new iterators.
>> (vst4): Used new iterators.
>> (vst4qa): Used new iterators.
>> (vst4qb): Used new iterators.
>>
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2019-11-14 Delia Burduv <delia.burduv@arm.com>
>>
>> * gcc.target/arm/simd/bf16_vstn_1.c: New test.
One thing I just noticed in this and the other arm bfloat16 patches...
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 3c78f435009ab027f92693d00ab5b40960d5419d..fd81c18948db3a7f6e8e863d32511f75bf950e6a 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -18742,6 +18742,89 @@ vcmlaq_rot270_laneq_f32 (float32x4_t __r, float32x4_t __a, float32x4_t __b,
return __builtin_neon_vcmla_lane270v4sf (__r, __a, __b, __index);
}
+#pragma GCC push_options
+#pragma GCC target ("arch=armv8.2-a+bf16")
+
+typedef struct bfloat16x4x2_t
+{
+ bfloat16x4_t val[2];
+} bfloat16x4x2_t;
These should be in a new arm_bf16.h file that gets included in the
main arm_neon.h file, right?
I believe the aarch64 versions are implemented that way.
Otherwise the patch looks good to me.
Thanks!
Kyrill
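
As an aside for anyone reading the thread without the ACLE documentation to hand: vst2 is an interleaving store, so lane i of val[0] and lane i of val[1] end up next to each other in memory. The sketch below models that in portable C, purely as an illustration. It uses uint16_t as a stand-in for the bf16 storage type, and the names model_bf16x4x2_t, model_vst2_bf16 and model_vst2_check are made up here; none of them are part of the patch or of arm_neon.h, and the real intrinsic expands to a builtin rather than a loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for bfloat16x4x2_t: two 4-lane vectors whose
   lanes are held as raw 16-bit patterns, so no bfloat16 support is
   needed to compile this.  */
typedef struct
{
  uint16_t val[2][4];
} model_bf16x4x2_t;

/* Model of a 2-vector interleaving store in the style of vst2:
   lane i of val[0] is written to ptr[2*i] and lane i of val[1] to
   ptr[2*i + 1].  PTR must point to at least 8 elements.  */
void
model_vst2_bf16 (uint16_t *ptr, model_bf16x4x2_t v)
{
  for (int lane = 0; lane < 4; lane++)
    {
      ptr[2 * lane] = v.val[0][lane];
      ptr[2 * lane + 1] = v.val[1][lane];
    }
}

/* Small self-check: store two vectors and confirm the memory image
   is interleaved.  */
void
model_vst2_check (void)
{
  model_bf16x4x2_t v = { { { 1, 2, 3, 4 }, { 10, 20, 30, 40 } } };
  static const uint16_t expected[8] = { 1, 10, 2, 20, 3, 30, 4, 40 };
  uint16_t out[8] = { 0 };

  model_vst2_bf16 (out, v);
  for (int i = 0; i < 8; i++)
    assert (out[i] == expected[i]);
}
```

The vst3 and vst4 forms follow the same pattern with a stride of 3 and 4 respectively, and the q variants use 8 lanes per vector instead of 4.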
+
+typedef struct bfloat16x8x2_t
+{
+ bfloat16x8_t val[2];
+} bfloat16x8x2_t;
+