[Bug c/79359] Squaring a complex float gives inefficient code with or without -ffast-math

drraph at gmail dot com gcc-bugzilla@gcc.gnu.org
Sun Feb 5 20:36:00 GMT 2017


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79359

--- Comment #1 from Raphael C <drraph at gmail dot com> ---
In case it's of any help, here is an explanation of the assembly that ICC gives
with -fp-model strict.

R = real part and C = imaginary part of the complex input. Here "x" just means
unknown or unused. Lanes are written from highest to lowest.

We start with xmm0 = { x, x, C, R }.

The desired output is (R + iC)^2 = R^2 + 2RCi - C^2.

vmovsldup xmm1, xmm0             # xmm1 = { x, x, R, R }
vmovshdup xmm2, xmm0             # xmm2 = { x, x, C, C }
vshufps   xmm3, xmm0, xmm0, 177  # xmm3 = { x, x, R, C }
vmulps    xmm4, xmm1, xmm0       # xmm4 = { x, x, RC, RR }
vmulps    xmm5, xmm2, xmm3       # xmm5 = { x, x, RC, CC }
vaddsubps xmm0, xmm4, xmm5       # xmm0 = { x, x, 2RC, RR-CC }
ret
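
To check the lane comments above, here is a rough C sketch that emulates the
six instructions on plain float arrays. It is not what ICC or GCC emits; the
names vec4, swap_pairs and square_by_lanes are made up for illustration, and
lane[0] is assumed to be the lowest lane (the rightmost one in the comments).

```c
#include <complex.h>

/* lane[0] = lowest lane (real part), lane[1] = imaginary part,
   lane[2] and lane[3] = the unused "x" lanes. Hypothetical helper type. */
typedef struct { float lane[4]; } vec4;

/* vmovsldup: duplicate the even lanes -> { a2, a2, a0, a0 } */
static vec4 movsldup(vec4 a) {
    vec4 r = {{ a.lane[0], a.lane[0], a.lane[2], a.lane[2] }};
    return r;
}

/* vmovshdup: duplicate the odd lanes -> { a3, a3, a1, a1 } */
static vec4 movshdup(vec4 a) {
    vec4 r = {{ a.lane[1], a.lane[1], a.lane[3], a.lane[3] }};
    return r;
}

/* vshufps a, a, 177 (0xB1): swap the lanes within each pair */
static vec4 swap_pairs(vec4 a) {
    vec4 r = {{ a.lane[1], a.lane[0], a.lane[3], a.lane[2] }};
    return r;
}

/* vmulps: lane-wise multiply */
static vec4 mulps(vec4 a, vec4 b) {
    vec4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] * b.lane[i];
    return r;
}

/* vaddsubps: subtract in even lanes, add in odd lanes */
static vec4 addsubps(vec4 a, vec4 b) {
    vec4 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = (i % 2 == 0) ? a.lane[i] - b.lane[i]
                                 : a.lane[i] + b.lane[i];
    return r;
}

/* Square z by following the instruction sequence step by step. */
float complex square_by_lanes(float complex z) {
    vec4 xmm0 = {{ crealf(z), cimagf(z), 0.0f, 0.0f }}; /* { x, x, C, R }   */
    vec4 xmm1 = movsldup(xmm0);                         /* { x, x, R, R }   */
    vec4 xmm2 = movshdup(xmm0);                         /* { x, x, C, C }   */
    vec4 xmm3 = swap_pairs(xmm0);                       /* { x, x, R, C }   */
    vec4 xmm4 = mulps(xmm1, xmm0);                      /* { x, x, RC, RR } */
    vec4 xmm5 = mulps(xmm2, xmm3);                      /* { x, x, RC, CC } */
    xmm0 = addsubps(xmm4, xmm5);               /* { x, x, 2RC, RR-CC }      */
    return xmm0.lane[0] + xmm0.lane[1] * I;
}
```

For example, square_by_lanes(3.0f + 4.0f * I) goes through RR = 9, CC = 16 and
RC = 12, so the low two lanes end up as RR-CC = -7 and 2RC = 24.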

