[Bug optimization/11327] Non-optimal code when using MMX/SSE builtins

Fri Jun 27 17:50:00 GMT 2003

PLEASE REPLY TO gcc-bugzilla@gcc.gnu.org ONLY, *NOT* gcc-bugs@gcc.gnu.org.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11327

dhazeghi at yahoo dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
          Component|c                           |optimization
           Keywords|                            |pessimizes-code

------- Additional Comments From dhazeghi at yahoo dot com  2003-06-27 17:50 -------
Checking your simpler testcase with gcc mainline (20030620), I get:

.L5:
        movq    (%ecx,%eax,8), %mm0
        movq    (%edx,%eax,8), %mm1
        psubusb (%edx,%eax,8), %mm0
        psubusb (%ecx,%eax,8), %mm1
        por     %mm1, %mm0
        pminub  %mm2, %mm0
        pcmpeqb %mm2, %mm0
        movq    %mm0, (%esi,%eax,8)
        incl    %eax
        cmpl    %ebx, %eax
        jne     .L5

This looks a lot like the optimal code you suggested, correct? Would you mind sending an example
of the better code you'd like to see generated for foo2, and/or trying current gcc cvs to see
whether the problem is already fixed there? Thanks,

Dara
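
For anyone reading along without the MMX instruction reference at hand, here is a rough scalar model of what the .L5 loop body above computes for each byte lane. It is only a sketch: the original testcase is not quoted in this comment, so the function names (sat_sub_u8, lane_mask, diff_mask) are hypothetical, and treating %mm2 as a per-byte threshold is an assumption based on the listing alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned saturating subtract on one byte lane (models psubusb). */
uint8_t sat_sub_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : 0);
}

/* One byte lane of the loop body. `t` models the byte held in %mm2,
   assumed here to be a threshold (not confirmed by the bug report). */
uint8_t lane_mask(uint8_t a, uint8_t b, uint8_t t)
{
    /* psubusb twice + por: one side saturates to 0, so the OR is |a - b|. */
    uint8_t d = (uint8_t)(sat_sub_u8(a, b) | sat_sub_u8(b, a));
    /* pminub: clamp the difference to the threshold. */
    uint8_t m = d < t ? d : t;
    /* pcmpeqb: all ones when the clamped value equals t, i.e. |a - b| >= t. */
    return (uint8_t)(m == t ? 0xFF : 0x00);
}

/* Hypothetical array wrapper: the asm handles 8 bytes per iteration,
   this scalar version handles one byte per iteration. */
void diff_mask(uint8_t *dst, const uint8_t *a, const uint8_t *b,
               uint8_t t, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = lane_mask(a[i], b[i], t);
}
```

Read this way, the seven MMX instructions between the loads and the store map one-to-one onto the three statements in lane_mask, which is why the listing looks close to minimal apart from the two memory operands being read twice.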