[PATCH v2 14/16] Arm: Add NEON RTL patterns for Complex Addition, Multiply and FMA.
Tamar Christina
tamar.christina@arm.com
Fri Sep 25 14:31:20 GMT 2020
Hi All,
This adds an implementation of the optabs for complex additions. With this, the
following C code:
  void f90 (float complex a[restrict N], float complex b[restrict N],
            float complex c[restrict N])
  {
    for (int i=0; i < N; i++)
      c[i] = a[i] + (b[i] * I);
  }
generates
f90:
        add     r3, r2, #1600
.L2:
        vld1.32 {q8}, [r0]!
        vld1.32 {q9}, [r1]!
        vcadd.f32       q8, q8, q9, #90
        vst1.32 {q8}, [r2]!
        cmp     r3, r2
        bne     .L2
        bx      lr
instead of
f90:
        add     r3, r2, #1600
.L2:
        vld2.32 {d24-d27}, [r0]!
        vld2.32 {d20-d23}, [r1]!
        vsub.f32        q8, q12, q11
        vadd.f32        q9, q13, q10
        vst2.32 {d16-d19}, [r2]!
        cmp     r3, r2
        bne     .L2
        bx      lr
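For readers less familiar with the rotation, the per-element arithmetic that
both sequences compute can be sketched in scalar C. This is only an
illustration and not part of the patch: the helper names cadd_rot90 and
cadd_rot90_matches_reference are hypothetical. Adding b rotated by 90 degrees
means subtracting imag(b) from real(a) and adding real(b) to imag(a). That is
the vsub/vadd pair in the old sequence, which works on de-interleaved data
loaded with vld2, and it is what the single vcadd with #90 does on the
interleaved data loaded with vld1.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical helper, not part of the patch: a + (b * I) written out on
   the real and imaginary parts.  */
static float complex
cadd_rot90 (float complex a, float complex b)
{
  float re = crealf (a) - cimagf (b);   /* real part: a.re - b.im  */
  float im = cimagf (a) + crealf (b);   /* imag part: a.im + b.re  */
  return re + im * I;
}

/* Hypothetical self-check: compares the helper with the expression used
   in f90 for one pair of inputs.  Returns 1 if both parts match.  */
static int
cadd_rot90_matches_reference (void)
{
  float complex a = 1.0f + 2.0f * I;
  float complex b = 3.0f + 4.0f * I;
  float complex expected = a + (b * I);
  float complex got = cadd_rot90 (a, b);

  assert (crealf (got) == crealf (expected));
  assert (cimagf (got) == cimagf (expected));
  return 1;
}
```

With a = 1 + 2i and b = 3 + 4i both forms give -3 + 5i.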
Bootstrapped and regtested on arm-none-linux-gnueabihf with no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
	* config/arm/iterators.md (rot): Add UNSPEC_VCMLS, UNSPEC_VCMUL and
	UNSPEC_VCMUL180.
	(rot_op, rotsplit1, rotsplit2, fcmac1, VCMLA_OP, VCMUL_OP): New.
	* config/arm/neon.md (cadd<rot><mode>3, cml<fcmac1><rot_op><mode>4,
	cmul<rot_op><mode>3): New.
	* config/arm/unspecs.md (UNSPEC_VCMUL, UNSPEC_VCMUL180, UNSPEC_VCMLS,
	UNSPEC_VCMLS180): New.
--
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rb13518.patch
Type: text/x-diff
Size: 4936 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20200925/b013455d/attachment-0001.bin>