[PATCH v2 11/16] AArch64: Add SVE RTL patterns for Complex Addition, Multiply and FMA.

Tamar Christina Tamar.Christina@arm.com
Sat Nov 14 15:12:04 GMT 2020


ping

> -----Original Message-----
> From: Gcc-patches <gcc-patches-bounces@gcc.gnu.org> On Behalf Of Tamar
> Christina
> Sent: Friday, September 25, 2020 3:30 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw <Richard.Earnshaw@arm.com>; nd <nd@arm.com>;
> Marcus Shawcroft <Marcus.Shawcroft@arm.com>
> Subject: [PATCH v2 11/16]AArch64: Add SVE RTL patterns for Complex
> Addition, Multiply and FMA.
> 
> Hi All,
> 
> This adds implementations of the optabs for complex operations (complex
> addition, multiply and FMA) to the SVE backend.  With this patch, the
> following C code:
> 
>   void f90 (float complex a[restrict N], float complex b[restrict N],
> 	    float complex c[restrict N])
>   {
>     for (int i=0; i < N; i++)
>       c[i] = a[i] + (b[i] * I);
>   }
> 
> generates
> 
>   f90:
> 	  mov     x3, 0
> 	  mov     x4, 400
> 	  ptrue   p1.b, all
> 	  whilelo p0.s, xzr, x4
> 	  .p2align 3,,7
>   .L2:
> 	  ld1w    z0.s, p0/z, [x0, x3, lsl 2]
> 	  ld1w    z1.s, p0/z, [x1, x3, lsl 2]
> 	  fcadd   z0.s, p1/m, z0.s, z1.s, #90
> 	  st1w    z0.s, p0, [x2, x3, lsl 2]
> 	  incw    x3
> 	  whilelo p0.s, x3, x4
> 	  b.any   .L2
> 	  ret
> 
> instead of
> 
>   f90:
> 	  mov     x3, 0
> 	  mov     x4, 0
> 	  mov     w5, 200
> 	  whilelo p0.s, wzr, w5
> 	  .p2align 3,,7
>   .L2:
> 	  ld2w    {z4.s - z5.s}, p0/z, [x0, x3, lsl 2]
> 	  ld2w    {z2.s - z3.s}, p0/z, [x1, x3, lsl 2]
> 	  fsub    z0.s, z4.s, z3.s
> 	  fadd    z1.s, z2.s, z5.s
> 	  st2w    {z0.s - z1.s}, p0, [x2, x3, lsl 2]
> 	  incw    x4
> 	  inch    x3
> 	  whilelo p0.s, w4, w5
> 	  b.any   .L2
> 	  ret
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 	* config/aarch64/aarch64-sve.md (cadd<rot><mode>3,
> 	cml<fcmac1><rot_op><mode>4, cmul<rot_op><mode>3): New.
> 	* config/aarch64/iterators.md (sve_rot1, sve_rot2): New.
> 
> --
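
For anyone comparing the two sequences quoted above, here is a minimal scalar
sketch of what each loop iteration computes per complex element.  It is not
part of the patch: cadd_rot90, count_mismatches and self_check are
hypothetical names used only for illustration.  The lane-wise form mirrors the
fsub/fadd pair in the old sequence (real lane: re(a) - im(b), imaginary lane:
im(a) + re(b)), which is also the per-pair result of fcadd with a #90
rotation.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical reference helper (not from the patch): complex add with the
   second operand rotated by 90 degrees, i.e. a + b*I, written lane-wise.  */
static float complex
cadd_rot90 (float complex a, float complex b)
{
  float re = crealf (a) - cimagf (b);	/* the fsub in the old sequence */
  float im = cimagf (a) + crealf (b);	/* the fadd in the old sequence */
  return re + im * I;
}

/* Count the elements where the lane-wise form differs from the C
   expression used in f90.  */
static size_t
count_mismatches (const float complex *a, const float complex *b, size_t n)
{
  size_t bad = 0;
  for (size_t i = 0; i < n; i++)
    {
      float complex want = a[i] + (b[i] * I);
      float complex got = cadd_rot90 (a[i], b[i]);
      if (crealf (want) != crealf (got) || cimagf (want) != cimagf (got))
	bad++;
    }
  return bad;
}

/* Small self-check on exactly representable inputs.  */
static void
self_check (void)
{
  float complex a[2] = { 1.0f + 2.0f * I, -0.5f + 0.25f * I };
  float complex b[2] = { 3.0f + 4.0f * I, 8.0f - 2.0f * I };
  assert (count_mismatches (a, b, 2) == 0);
}
```

Calling self_check () from a small test driver is enough to exercise it.  The
inputs are chosen to be exactly representable so the comparison can use plain
equality; for arbitrary inputs a tolerance-based compare may be more
appropriate.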

