[Bug target/32735] i686 sse2 generates more movdqa than necessary
ubizjak at gmail dot com
gcc-bugzilla@gcc.gnu.org
Sat Jul 14 14:04:00 GMT 2007
------- Comment #6 from ubizjak at gmail dot com 2007-07-14 14:04 -------
(In reply to comment #5)
> > This is two more movdqa than the hand-written code in CallSumDeltas3.
>
> paddd %xmm1, %xmm0 (2)
> movdqa %xmm0, %xmm1 (2)
> movdqa %xmm0, foo1 (1)
> jne .L7
(1) is fixed by http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01330.html
(2) It looks like the register allocator should be enhanced to match the insn
_output_ to whichever input produces fewer moves. We are dealing with the
commutative "%0" constraint on operand 1:
[(set (match_operand:SSEMODEI 0 "register_operand" "=x")
(plus:SSEMODEI
(match_operand:SSEMODEI 1 "nonimmediate_operand" "%0")
(match_operand:SSEMODEI 2 "nonimmediate_operand" "xm")))]
So there is no reason why the RA shouldn't match the output with the most
suitable _input_, producing a sequence that is one insn shorter:
...
cmpl $100000000, %eax
movdqa %xmm0, %xmm1
pslldq $8, %xmm1
paddd %xmm0, %xmm1 # paddd %xmm1, %xmm0
# movdqa %xmm0, %xmm1
jne .L7
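
As a rough sanity check, the two sequences can be modelled outside the
compiler. The Python sketch below is only an illustration: it treats each xmm
register as a list of four 32-bit lanes (lane 0 lowest), and the helper names
(pslldq8, paddd, generated_sequence, proposed_sequence) are made up for this
example. It checks only that %xmm1 ends up with the same value in both
sequences. In the shorter sequence %xmm0 keeps its old value, so the sketch
assumes later code takes the sum from %xmm1.

```python
# Toy model of the two sequences. A register is a list of four 32-bit lanes,
# lane 0 being the least significant. Helper names are hypothetical.
MASK32 = 0xFFFFFFFF


def pslldq8(reg):
    """Model `pslldq $8, reg`: shift the 128-bit value left by 8 bytes."""
    return [0, 0, reg[0], reg[1]]


def paddd(src, dst):
    """Model `paddd src, dst` (AT&T order): dst += src, lane-wise mod 2**32."""
    return [(d + s) & MASK32 for s, d in zip(src, dst)]


def generated_sequence(xmm0):
    xmm1 = list(xmm0)         # movdqa %xmm0, %xmm1
    xmm1 = pslldq8(xmm1)      # pslldq $8, %xmm1
    xmm0 = paddd(xmm1, xmm0)  # paddd  %xmm1, %xmm0
    xmm1 = list(xmm0)         # movdqa %xmm0, %xmm1   (the extra move)
    return xmm0, xmm1


def proposed_sequence(xmm0):
    xmm1 = list(xmm0)         # movdqa %xmm0, %xmm1
    xmm1 = pslldq8(xmm1)      # pslldq $8, %xmm1
    xmm1 = paddd(xmm0, xmm1)  # paddd  %xmm0, %xmm1   (output tied to other input)
    return xmm0, xmm1


if __name__ == "__main__":
    start = [1, 2, 3, 0xFFFFFFFF]
    _, old_xmm1 = generated_sequence(start)
    _, new_xmm1 = proposed_sequence(start)
    print(old_xmm1 == new_xmm1)
```

Because paddd is commutative, swapping which input the destination is tied to
changes only where the sum lands, not its value.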
--
ubizjak at gmail dot com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2007-07-14 14:04:19
               date|                            |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32735