[Bug target/32735] i686 sse2 generates more movdqa than necessary

ubizjak at gmail dot com gcc-bugzilla@gcc.gnu.org
Sat Jul 14 14:04:00 GMT 2007



------- Comment #6 from ubizjak at gmail dot com  2007-07-14 14:04 -------
(In reply to comment #5)

> > This is two more movdqa than the hand-written code in CallSumDeltas3.
> 
>          paddd   %xmm1, %xmm0       (2)
>          movdqa  %xmm0, %xmm1       (2)
>          movdqa  %xmm0, foo1        (1)
>          jne     .L7

(1) is fixed by http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01330.html

(2) It looks like the register allocator should be enhanced to match the insn
_output_ to whichever input will produce fewer moves. We are dealing with the
commutative "%0" constraint on operand 1:

  [(set (match_operand:SSEMODEI 0 "register_operand" "=x")
        (plus:SSEMODEI
          (match_operand:SSEMODEI 1 "nonimmediate_operand" "%0")
          (match_operand:SSEMODEI 2 "nonimmediate_operand" "xm")))]

So there is no reason why the RA shouldn't match the output with the most
suitable _input_, producing a sequence that is one insn shorter:

        ...
        cmpl    $100000000, %eax
        movdqa  %xmm0, %xmm1
        pslldq  $8, %xmm1
        paddd   %xmm0, %xmm1        # paddd   %xmm1, %xmm0
                                    # movdqa  %xmm0, %xmm1
        jne     .L7
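
To illustrate why the swap is safe, here is a minimal sketch in plain C that
models the two sequences lane by lane. Everything in it is hypothetical and for
illustration only: the xmm_t type, the helper names and the instruction counts
are not taken from GCC, and it assumes that the value needed after the sequence
is the one left in %xmm1. Since paddd is commutative, both orderings should
leave the same value there, and the proposed one needs one insn less.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit SSE register as four 32-bit lanes
   (lane[0] is the least significant). */
typedef struct { uint32_t lane[4]; } xmm_t;

/* movdqa src, dst: plain register copy. */
static xmm_t movdqa(xmm_t src) { return src; }

/* pslldq $8, reg: shift the whole register left by 8 bytes,
   i.e. lanes 0,1 move to lanes 2,3 and lanes 0,1 become zero. */
static xmm_t pslldq8(xmm_t r) {
    xmm_t out = {{0, 0, r.lane[0], r.lane[1]}};
    return out;
}

/* paddd src, dst (AT&T operand order): dst += src, per 32-bit lane. */
static xmm_t paddd(xmm_t src, xmm_t dst) {
    for (int i = 0; i < 4; i++)
        dst.lane[i] += src.lane[i];
    return dst;
}

/* Sequence as currently generated; returns the final %xmm1. */
static xmm_t seq_current(xmm_t xmm0, int *insns) {
    xmm_t xmm1 = movdqa(xmm0);   /* movdqa %xmm0, %xmm1 */
    xmm1 = pslldq8(xmm1);        /* pslldq $8, %xmm1    */
    xmm0 = paddd(xmm1, xmm0);    /* paddd  %xmm1, %xmm0 */
    xmm1 = movdqa(xmm0);         /* movdqa %xmm0, %xmm1 */
    *insns = 4;
    return xmm1;
}

/* Proposed sequence: the output is tied to the other input of the
   commutative add, so the trailing movdqa disappears. */
static xmm_t seq_proposed(xmm_t xmm0, int *insns) {
    xmm_t xmm1 = movdqa(xmm0);   /* movdqa %xmm0, %xmm1 */
    xmm1 = pslldq8(xmm1);        /* pslldq $8, %xmm1    */
    xmm1 = paddd(xmm0, xmm1);    /* paddd  %xmm0, %xmm1 */
    *insns = 3;
    return xmm1;
}

/* Lane-wise equality of two modelled registers. */
static int xmm_equal(xmm_t a, xmm_t b) {
    return memcmp(a.lane, b.lane, sizeof a.lane) == 0;
}
```

Note that the two sequences differ in what they leave in %xmm0 (the sum in the
current code, the unmodified input in the proposed one), so the comparison via
xmm_equal() only makes sense for %xmm1.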


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2007-07-14 14:04:19
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32735


