[Bug middle-end/110148] [14 Regression] TSVC s242 regression between g:c0df96b3cda5738afbba3a65bb054183c5cd5530 and g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8
lili.cui at intel dot com
gcc-bugzilla@gcc.gnu.org
Fri Jun 9 11:11:13 GMT 2023
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148
cuilili <lili.cui at intel dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |lili.cui at intel dot com
--- Comment #2 from cuilili <lili.cui at intel dot com> ---
The commit changed the function that breaks dependency chains, in order to
generate more FMAs. s242 has a chain that needs to be broken. The chain is in a
small loop and runs through the loop reduction variable a[i-1].
Src code:
for (int i = 1; i < LEN_1D; ++i)
{
a[i] = a[i - 1] + s1 + s2 + b[i] + c[i] + d[i];
}
------------------------------------------------------
Base version:
SSA tree
ssa1 = (s1+s2) + b[i];
ssa2 = c[i] + d[i];
ssa3 = ssa1 + ssa2;
ssa4 = ssa3 + a[i-1];
a[i-1] is kept in xmm1; 2 instructions using xmm1 (marked 1 and 2 below) carry
a dependency across iterations
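As a rough C sketch of this base ordering, the association can be written out
explicitly. The function name s242_base_order, the pointer parameters and the
double element type (inferred from the vaddsd instructions) are assumptions
for illustration, not GCC output. Only the final add reads a[i-1], so one add
sits on the loop-carried chain:

```c
#include <stddef.h>

/* Hypothetical helper (not from GCC or TSVC): the s242 loop body with the
   base-version association spelled out. Element type double is assumed. */
void s242_base_order(double *a, const double *b, const double *c,
                     const double *d, double s1, double s2, size_t n)
{
    const double s = s1 + s2;          /* loop invariant, hoisted */
    for (size_t i = 1; i < n; ++i) {
        double t1 = s + b[i];          /* independent of a[i-1] */
        double t2 = c[i] + d[i];       /* independent of a[i-1] */
        double t3 = t1 + t2;           /* independent of a[i-1] */
        a[i] = t3 + a[i - 1];          /* the only add on the carried chain */
    }
}
```

Everything except the last add can be computed ahead of time for later
iterations, so the loop is bound by roughly one add latency per iteration.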
Assembler
Loop1:
vmovsd 0x60c400(%rax),%xmm0
vaddsd 0x60b000(%rax),%xmm3,%xmm2
add $0x8,%rax
vaddsd 0x60b9f8(%rax),%xmm0,%xmm0
vaddsd %xmm2,%xmm0,%xmm0
vaddsd %xmm0,%xmm1,%xmm1 ---> 1
vmovsd %xmm1,0x60cdf8(%rax) ---> 2
cmp $0xa00,%rdx
jne Loop1
--------------------------------------------------------------
Base + commit g:e5405f065bace0685cb3b8878d1dfc7a6e7ef409 version:
a[i-1] is kept in xmm0; 4 instructions using xmm0 (marked 1 to 4 below) carry
a dependency across iterations
SSA tree
ssa1 = (s1+s2) + b[i];
ssa2 = c[i] + d[i];
ssa3 = ssa1 + a[i-1];
ssa4 = ssa2 + ssa3;
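Written out as C under illustrative assumptions (hypothetical function name
s242_new_order, double elements inferred from the vaddsd instructions), the
SSA tree above feeds a[i-1] into an intermediate add, so two dependent adds sit
on the loop-carried chain. The assembler listing that follows shows three adds
on xmm0, so this sketch follows the SSA tree rather than the final code:

```c
#include <stddef.h>

/* Hypothetical helper (not from GCC or TSVC): the s242 loop body with the
   post-commit association from the SSA tree. Element type double is assumed. */
void s242_new_order(double *a, const double *b, const double *c,
                    const double *d, double s1, double s2, size_t n)
{
    const double s = s1 + s2;          /* loop invariant, hoisted */
    for (size_t i = 1; i < n; ++i) {
        double t1 = s + b[i];          /* independent of a[i-1] */
        double t2 = c[i] + d[i];       /* independent of a[i-1] */
        double t3 = t1 + a[i - 1];     /* first add on the carried chain */
        a[i] = t2 + t3;                /* second add on the carried chain */
    }
}
```

If a scalar add has a latency of L cycles, the carried chain in this form costs
about 2L per iteration, against about L when a[i-1] is added last.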
Assembler
Loop1:
vaddsdq 0x60b000(%rax), %xmm0, %xmm0 ---> 1
vmovsdq 0x60c400(%rax), %xmm1
add $0x8, %rax
vaddsdq 0x60b9f8(%rax), %xmm1, %xmm1
vaddsd %xmm2, %xmm0, %xmm0 ---> 2
vaddsd %xmm1, %xmm0, %xmm0 ---> 3
vmovsdq %xmm0, 0x60cdf8(%rax) ---> 4
cmp $0xa00,%rdx
jne Loop1