[PATCH v2 10/16]AArch64: Add NEON RTL patterns for Complex Addition, Multiply and FMA.

Kyrylo Tkachov Kyrylo.Tkachov@arm.com
Mon Nov 16 09:58:08 GMT 2020



> -----Original Message-----
> From: Tamar Christina <Tamar.Christina@arm.com>
> Sent: 25 September 2020 15:30
> To: gcc-patches@gcc.gnu.org
> Cc: nd <nd@arm.com>; Richard Earnshaw <Richard.Earnshaw@arm.com>;
> Marcus Shawcroft <Marcus.Shawcroft@arm.com>; Kyrylo Tkachov
> <Kyrylo.Tkachov@arm.com>; Richard Sandiford
> <Richard.Sandiford@arm.com>
> Subject: [PATCH v2 10/16]AArch64: Add NEON RTL patterns for Complex
> Addition, Multiply and FMA.
> 
> Hi All,
> 
> This adds implementations of the optabs for complex operations.  With this, the
> following C code:
> 
>   void f90 (float complex a[restrict N], float complex b[restrict N],
> 	    float complex c[restrict N])
>   {
>     for (int i=0; i < N; i++)
>       c[i] = a[i] + (b[i] * I);
>   }
> 
> generates
> 
>   f90:
> 	  mov     x3, 0
> 	  .p2align 3,,7
>   .L2:
> 	  ldr     q0, [x0, x3]
> 	  ldr     q1, [x1, x3]
> 	  fcadd   v0.4s, v0.4s, v1.4s, #90
> 	  str     q0, [x2, x3]
> 	  add     x3, x3, 16
> 	  cmp     x3, 1600
> 	  bne     .L2
> 	  ret
> 
> instead of
> 
>   f90:
> 	  add     x3, x1, 1600
> 	  .p2align 3,,7
>   .L2:
> 	  ld2     {v4.4s - v5.4s}, [x0], 32
> 	  ld2     {v2.4s - v3.4s}, [x1], 32
> 	  fsub    v0.4s, v4.4s, v3.4s
> 	  fadd    v1.4s, v5.4s, v2.4s
> 	  st2     {v0.4s - v1.4s}, [x2], 32
> 	  cmp     x3, x1
> 	  bne     .L2
> 	  ret
> 
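
For readers comparing the two sequences above: the single fcadd with #90
computes, per complex element, the same result as the fsub/fadd pair in the
old code, namely re = a.re - b.im and im = a.im + b.re, which is a + b*I.
A minimal scalar sketch of that arithmetic follows. The names cadd90_ref and
f90_ref are hypothetical, chosen for illustration only; they are not part of
the patch or of GCC.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Scalar reference for one complex element of an add with a 90-degree
   rotation: the second operand is multiplied by i before the addition.
   (Hypothetical helper, not taken from the patch.)  */
static float complex
cadd90_ref (float complex a, float complex b)
{
  float re = crealf (a) - cimagf (b);	/* what the fsub computed */
  float im = cimagf (a) + crealf (b);	/* what the fadd computed */
  return re + im * I;
}

/* Element-wise version of the f90 loop, written against the helper.
   The length n is passed explicitly here; the original uses a macro N.  */
static void
f90_ref (size_t n, const float complex *a, const float complex *b,
	 float complex *c)
{
  assert (a && b && c);
  for (size_t i = 0; i < n; i++)
    c[i] = cadd90_ref (a[i], b[i]);
}
```

For example, with a = 1 + 2i and b = 3 + 4i, b*I is -4 + 3i, so the result
is -3 + 5i.
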
> It defines a new iterator, VALL_ARITH, which contains the types for which we
> can do general arithmetic (excludes bfloat16).
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> 
> Ok for master?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 	* config/aarch64/aarch64-simd.md (cadd<rot><mode>3,
> 	cml<fcmac1><rot_op><mode>4, cmul<rot_op><mode>3): New.
> 	* config/aarch64/iterators.md (VALL_ARITH, UNSPEC_FCMUL,
> 	UNSPEC_FCMUL180, UNSPEC_FCMLS, UNSPEC_FCMLS180, UNSPEC_CMLS,
> 	UNSPEC_CMLS180, UNSPEC_CMUL, UNSPEC_CMUL180, FCMLA_OP, FCMUL_OP,
> 	rot_op, rotsplit1, rotsplit2, fcmac1): New.
> 	(rot): Add UNSPEC_FCMLS, UNSPEC_FCMUL, UNSPEC_FCMUL180.
> 
> --

