[PATCH v2 11/16] AArch64: Add SVE RTL patterns for Complex Addition, Multiply and FMA.

Tamar Christina tamar.christina@arm.com
Fri Sep 25 14:30:26 GMT 2020


Hi All,

This adds an implementation of the optabs for complex operations.  With this,
the following C code:

  #include <complex.h>

  #define N 200  /* Inferred from the loop bounds in the assembly below.  */

  void f90 (float complex a[restrict N], float complex b[restrict N],
	    float complex c[restrict N])
  {
    for (int i=0; i < N; i++)
      c[i] = a[i] + (b[i] * I);
  }

generates

  f90:
	  mov     x3, 0
	  mov     x4, 400
	  ptrue   p1.b, all
	  whilelo p0.s, xzr, x4
	  .p2align 3,,7
  .L2:
	  ld1w    z0.s, p0/z, [x0, x3, lsl 2]
	  ld1w    z1.s, p0/z, [x1, x3, lsl 2]
	  fcadd   z0.s, p1/m, z0.s, z1.s, #90
	  st1w    z0.s, p0, [x2, x3, lsl 2]
	  incw    x3
	  whilelo p0.s, x3, x4
	  b.any   .L2
	  ret

instead of

  f90:
	  mov     x3, 0
	  mov     x4, 0
	  mov     w5, 200
	  whilelo p0.s, wzr, w5
	  .p2align 3,,7
  .L2:
	  ld2w    {z4.s - z5.s}, p0/z, [x0, x3, lsl 2]
	  ld2w    {z2.s - z3.s}, p0/z, [x1, x3, lsl 2]
	  fsub    z0.s, z4.s, z3.s
	  fadd    z1.s, z2.s, z5.s
	  st2w    {z0.s - z1.s}, p0, [x2, x3, lsl 2]
	  incw    x4
	  inch    x3
	  whilelo p0.s, w4, w5
	  b.any   .L2
	  ret
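
For readers comparing the two listings above, here is a rough scalar sketch of
what the single fcadd with #90 computes on interleaved (real, imaginary) lanes.
It is not taken from the patch: the function names fcadd90_model and rot90_add
are hypothetical, and the model ignores predication and vector length.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical scalar model of a complex add with a 90 degree rotation.
   Even lanes hold real parts, odd lanes hold imaginary parts, as in the
   ld1w form of the loop.  Predication is deliberately left out.  */
void
fcadd90_model (float *dst, const float *a, const float *b, size_t lanes)
{
  for (size_t i = 0; i + 1 < lanes; i += 2)
    {
      dst[i] = a[i] - b[i + 1];      /* real: a.re - b.im (the old fsub) */
      dst[i + 1] = a[i + 1] + b[i];  /* imag: a.im + b.re (the old fadd) */
    }
}

/* Reference: the source-level expression from the loop body of f90.  */
float complex
rot90_add (float complex a, float complex b)
{
  return a + b * I;
}
```

The two lane expressions match the fsub/fadd pair in the ld2w version, which
is why the de-interleaving loads and stores are no longer needed.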

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-sve.md (cadd<rot><mode>3,
	cml<fcmac1><rot_op><mode>4, cmul<rot_op><mode>3): New.
	* config/aarch64/iterators.md (sve_rot1, sve_rot2): New.

-- 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rb13515.patch
Type: text/x-diff
Size: 4647 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20200925/83430fae/attachment.bin>

