[ARM] PR66791: Replace builtins in vld1

Prathamesh Kulkarni prathamesh.kulkarni@linaro.org
Thu Jul 29 14:45:16 GMT 2021


On Thu, 29 Jul 2021 at 14:57, Kyrylo Tkachov <Kyrylo.Tkachov@arm.com> wrote:
>
> Hi Prathamesh,
>
> > -----Original Message-----
> > From: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
> > Sent: 26 July 2021 22:24
> > To: gcc Patches <gcc-patches@gcc.gnu.org>; Kyrylo Tkachov
> > <Kyrylo.Tkachov@arm.com>; Richard Earnshaw
> > <Richard.Earnshaw@foss.arm.com>
> > Subject: [ARM] PR66791: Replace builtins in vld1
> >
> > Hi,
> > Similar to aarch64, this patch replaces the call to the builtin with
> > a dereference of __a in vld1_p64, vld1_s64 and vld1_u64.
> >
> > The patch changes code-gen for the intrinsic as follows:
> > Before patch:
> >         vld1.64 {d16}, [r0:64]
> >         vmov    r0, r1, d16     @ int
> >         bx      lr
> >
> > After patch:
> >         ldrd    r0, [r0]
> >         bx      lr
> >
> > I assume the code-gen after patch is correct, since it loads two
> > consecutive words from [r0] into r0 and r1 ?
>
> Yes, this looks correct.
>
> >
> > Bootstrapped+tested on arm-linux-gnueabihf.
> > OK to commit ?
>
> Ok. Can we now remove the vld1 builtin definition?
Does the attached patch look OK?
I suppose we can only remove the entry for di, since the patch only
replaces calls to __builtin_neon_vld1di?
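
For reference, a minimal host-independent sketch of what the new
intrinsic bodies do. It uses GCC's generic vector extension rather than
arm_neon.h, so int64x1_sketch_t, load_s64_sketch and
check_load_s64_sketch are hypothetical names for illustration only and
are not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for int64x1_t: a vector with a single 64-bit lane.  */
typedef int64_t int64x1_sketch_t __attribute__ ((vector_size (8)));

/* Mirrors the patched vld1_s64 body: build the one-lane vector from a
   plain dereference instead of calling a load builtin.  */
static inline int64x1_sketch_t
load_s64_sketch (const int64_t *a)
{
  return (int64x1_sketch_t) { *a };
}

/* Small self-check: lane 0 must hold exactly the value that was in memory.  */
static inline void
check_load_s64_sketch (void)
{
  int64_t value = INT64_C (0x1122334455667788);
  int64x1_sketch_t v = load_s64_sketch (&value);
  assert (v[0] == value);
}
```

Because the load is an ordinary dereference, the compiler is free to
pick whichever 64-bit load suits the context, which is consistent with
the ldrd seen in the code-gen above.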

Thanks,
Prathamesh
> Thanks,
> Kyrill
>
> >
> > Thanks,
> > Prathamesh
-------------- next part --------------
gcc/ChangeLog:

	PR target/66791
	* config/arm/arm_neon.h (vld1_p64): Replace call to builtin by
	explicitly dereferencing __a.
	(vld1_s64): Likewise.
	(vld1_u64): Likewise.
	* config/arm/arm_neon_builtins.def (vld1): Remove entry for di
	and change to VAR13.

diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 41b596b5fc6..5a91d15bf75 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -10301,7 +10301,7 @@ __extension__ extern __inline poly64x1_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vld1_p64 (const poly64_t * __a)
 {
-  return (poly64x1_t)__builtin_neon_vld1di ((const __builtin_neon_di *) __a);
+  return (poly64x1_t) { *__a };
 }
 
 #pragma GCC pop_options
@@ -10330,7 +10330,7 @@ __extension__ extern __inline int64x1_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vld1_s64 (const int64_t * __a)
 {
-  return (int64x1_t)__builtin_neon_vld1di ((const __builtin_neon_di *) __a);
+  return (int64x1_t) { *__a };
 }
 
 #if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE)
@@ -10374,7 +10374,7 @@ __extension__ extern __inline uint64x1_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vld1_u64 (const uint64_t * __a)
 {
-  return (uint64x1_t)__builtin_neon_vld1di ((const __builtin_neon_di *) __a);
+  return (uint64x1_t) { *__a };
 }
 
 __extension__ extern __inline poly8x8_t
diff --git a/gcc/config/arm/arm_neon_builtins.def b/gcc/config/arm/arm_neon_builtins.def
index 70438ac1848..fb6d66e594a 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -302,8 +302,8 @@ VAR1 (TERNOP, vtbx1, v8qi)
 VAR1 (TERNOP, vtbx2, v8qi)
 VAR1 (TERNOP, vtbx3, v8qi)
 VAR1 (TERNOP, vtbx4, v8qi)
-VAR14 (LOAD1, vld1,
-        v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf, v2di,
+VAR13 (LOAD1, vld1,
+        v8qi, v4hi, v4hf, v2si, v2sf, v16qi, v8hi, v8hf, v4si, v4sf, v2di,
         v4bf, v8bf)
 VAR12 (LOAD1LANE, vld1_lane,
 	v8qi, v4hi, v2si, v2sf, di, v16qi, v8hi, v4si, v4sf, v2di, v4bf, v8bf)

