This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PING^3] [PATCH] [AArch64, NEON] Improve vmulX intrinsics
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: Jiangjiji <jiangjiji at huawei dot com>
- Cc: Kyrylo Tkachov <Kyrylo dot Tkachov at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, "Yangfei (Felix)" <felix dot yang at huawei dot com>, Marcus Shawcroft <Marcus dot Shawcroft at arm dot com>
- Date: Tue, 5 May 2015 07:37:15 +0100
- Subject: Re: [PING^3] [PATCH] [AArch64, NEON] Improve vmulX intrinsics
- References: <B34C25384B9D7A428FF276D3AE7D6BDC7B4FE913 at nkgeml511-mbx dot china dot huawei dot com> <55017C2F dot 6080302 at arm dot com> <B34C25384B9D7A428FF276D3AE7D6BDC7B50CDC6 at nkgeml511-mbx dot china dot huawei dot com>
On Sat, Apr 11, 2015 at 11:37:47AM +0100, Jiangjiji wrote:
> Hi,
> This is a ping for: https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00772.html
> Regtested with aarch64-linux-gnu on QEMU.
> This patch also has no regressions on the aarch64_be-linux-gnu big-endian target.
> OK for the trunk?
>
> Thanks.
> Jiang jiji
>
>
> ----------
> Re: [PING^2] [PATCH] [AArch64, NEON] Improve vmulX intrinsics
>
> Hi, Kyrill
> Thank you for your suggestion.
> I fixed it and regtested with aarch64-linux-gnu on QEMU.
> This patch also has no regressions on the aarch64_be-linux-gnu big-endian target.
> OK for the trunk?
Hi Jiang,
I'm sorry that I've taken so long to get to this; I've been out of the office
for several weeks. I have one comment.
> +__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
> +vmul_n_f32 (float32x2_t __a, float32_t __b)
> +{
> + return __builtin_aarch64_mul_nv2sf (__a, __b);
> +}
> +
For vmul_n_* intrinsics, is there a reason we don't want to use the
GCC vector extension syntax to allow us to write these as:
  __extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
  vmul_n_f32 (float32x2_t __a, float32_t __b)
  {
    return __a * __b;
  }
It would be great if we could make that work.
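For reference, here is a rough sketch of what that syntax does. It is not part of the patch. The names v2sf, mul_n_v2sf and mul_n_v2sf_selfcheck are hypothetical stand-ins (for float32x2_t and vmul_n_f32) so that it builds without arm_neon.h, and it assumes a GCC recent enough to accept a scalar operand in a binary vector operation:

```c
#include <assert.h>

/* Hypothetical stand-in for float32x2_t: two floats in a 64-bit vector,
   declared with the GCC vector extension so arm_neon.h is not needed.  */
typedef float v2sf __attribute__ ((vector_size (8)));

/* Hypothetical stand-in for vmul_n_f32.  With the vector extension the
   scalar operand is converted to a vector with every lane equal to b,
   and the multiply is then done lane by lane.  */
static inline v2sf
mul_n_v2sf (v2sf a, float b)
{
  return a * b;
}

/* Small self-check using values that are exactly representable.  */
static inline void
mul_n_v2sf_selfcheck (void)
{
  v2sf a = { 1.5f, -2.0f };
  v2sf r = mul_n_v2sf (a, 4.0f);

  assert (r[0] == 6.0f);
  assert (r[1] == -8.0f);
}
```

Whether the real intrinsic written this way still ends up as a single multiply by scalar on AArch64 is exactly the part that would need checking; the sketch only shows the source-level semantics.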
Thanks,
James