[Bug tree-optimization/53090] suboptimal ivopt
amker at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Tue Aug 8 16:22:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53090
amker at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED
--- Comment #10 from amker at gcc dot gnu.org ---
Hmm, it's not mentioned at which optimization level the original bug was
reported.  I suspect O2, because a vect_perm instruction is needed after
vectorization.  So the current status is:
After the ivopt rewrite, we generate the 8-instruction loop below at O2:
.L14:
        movl    (%r14,%rax,4), %ecx
        movl    (%r14,%rdx,4), %esi
        movl    %esi, (%r14,%rax,4)
        movl    %ecx, (%r14,%rdx,4)
        addq    $1, %rax
        subq    $1, %rdx
        cmpl    %eax, %edx
        jg      .L14
It's better than what was reported.
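
For readers without the report at hand, the O2 loop above has the shape of an in-place array reversal: two indices into the same base (%r14), one counting up and one counting down, with the two elements swapped until the indices meet. The following is only a sketch reconstructed from the instructions, not the test case attached to the bug; the name reverse_ints and its signature are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the test case from the report): reverse
   a[0..n-1] in place by swapping elements from both ends.  */
void reverse_ints(int *a, int n)
{
    assert(a != NULL || n <= 0);

    int i = 0;        /* counts up, plays the role of %rax */
    int j = n - 1;    /* counts down, plays the role of %rdx */

    while (i < j) {   /* keep going while j > i, cf. cmpl/jg */
        int lo = a[i];    /* movl (%r14,%rax,4), %ecx */
        int hi = a[j];    /* movl (%r14,%rdx,4), %esi */
        a[i] = hi;        /* movl %esi, (%r14,%rax,4) */
        a[j] = lo;        /* movl %ecx, (%r14,%rdx,4) */
        i++;
        j--;
    }
}
```

Whether a compiler turns this exact source into the listing above depends on the GCC version and flags, so treat the instruction comments as a reading aid only.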
At O3:
.L14:
        movdqu  (%rsi,%rdx), %xmm2
        movdqa  (%r12,%rax), %xmm0
        pshufd  $27, %xmm2, %xmm1
        pshufd  $27, %xmm0, %xmm0
        movaps  %xmm1, (%r12,%rax)
        addq    $16, %rax
        movups  %xmm0, (%rsi,%rdx)
        subq    $16, %rdx
        cmpq    %rax, %rdi
        jne     .L14
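
In the O3 loop each iteration loads 16 bytes (four ints) from each end, reverses the lane order with pshufd $27, and stores each vector to the opposite end. The immediate 27 is 0b00011011, so result lanes 0..3 take source lanes 3, 2, 1, 0. A minimal scalar model of that lane selection, with the hypothetical name pshufd_model (not an intrinsic or a GCC function), could look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the pshufd lane selection: result lane k takes
   source lane ((imm >> (2 * k)) & 3).  Illustration only.  */
void pshufd_model(uint32_t dst[4], const uint32_t src[4], unsigned imm)
{
    assert(imm <= 0xFFu);   /* the immediate is 8 bits wide */

    uint32_t tmp[4];        /* temporary so that dst may alias src */
    for (int k = 0; k < 4; k++)
        tmp[k] = src[(imm >> (2 * k)) & 3u];
    for (int k = 0; k < 4; k++)
        dst[k] = tmp[k];
}
```

With imm = 27 the model returns the four lanes in reverse order, which is the permutation the vectorized reversal needs.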
Consider this fixed.