[Bug target/19235] [4.0 regression] GCC generates SSE2 instructions for AthlonXP which doesn't support them.

drab at kepler dot fjfi dot cvut dot cz gcc-bugzilla@gcc.gnu.org
Tue Jan 4 13:51:00 GMT 2005

------- Additional Comments From drab at kepler dot fjfi dot cvut dot cz  2005-01-04 13:51 -------
(In reply to comment #10)
> Looking at the Intel reference documentation available from
> Pentium4/manuals/25366614.pdf MOVQ has the following opcodes:
> 0F 6F /r MOVQ mm, mm/m64 Move quadword from mm/m64 to mm. 
> 0F 7F /r MOVQ mm/m64, mm Move quadword from mm to mm/m64. 
> F3 0F 7E MOVQ xmm1, xmm2/m64 Move quadword from xmm2/mem64 to xmm1.
> 66 0F D6 MOVQ xmm2/m64, xmm1 Move quadword from xmm1 to xmm2/mem64.
> and since the two latter instructions are unsupported on AMD and Pentium III
> you would need some other way to move data between the xmm registers and
> memory.

Those 0F 6F and 0F 7F are, however, standard MMX instructions. So when you use,
for instance, -msse -mfpmath=sse -mno-mmx, those shouldn't be used either (I
don't know why anybody would want to do that, but...). However, when the move
is only used for copying (as in the example that I proposed), there are other
ways, such as using the following instructions:

0F 12 /r MOVLPS xmm, mem64
0F 13 /r MOVLPS mem64, xmm

and even more

0F 16 /r MOVHPS xmm, mem64
0F 17 /r MOVHPS mem64, xmm

It's true that these are meant for moving two single-precision floats (into
the lower or upper half of the xmm register), but since it is only a move, that
doesn't matter: it just copies those 64 bits into either the lower or the upper
64 bits of the xmm register. These could come in quite handy, since they leave
the mmx/st registers available for other use, and when we consider only 64-bit
memory accesses, they effectively add twice the number of xmm registers as
additional 64-bit registers. I think that might be worth considering, don't
you? And they are SSE only, so AthlonXP, PIII and others might benefit from it.
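
To make the idea concrete, here is a minimal sketch in C using the SSE
intrinsics from <xmmintrin.h> that correspond to MOVLPS and MOVHPS. The function
names are made up for illustration, and I am assuming GCC on x86. The compiler
remains free to pick the actual encoding, so this only shows the data movement,
not what GCC will necessarily emit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE intrinsics only; nothing here needs SSE2 */

/* Hypothetical helper: copy 64 bits through the LOW half of an xmm register.
   _mm_loadl_pi / _mm_storel_pi are the intrinsic forms of MOVLPS. */
static void copy64_via_xmm_low(uint64_t *dst, const uint64_t *src)
{
    assert(dst != NULL && src != NULL);
    __m128 r = _mm_setzero_ps();
    r = _mm_loadl_pi(r, (const __m64 *)src);  /* MOVLPS xmm, mem64 */
    _mm_storel_pi((__m64 *)dst, r);           /* MOVLPS mem64, xmm */
}

/* Hypothetical helper: the same copy through the HIGH half.
   _mm_loadh_pi / _mm_storeh_pi are the intrinsic forms of MOVHPS. */
static void copy64_via_xmm_high(uint64_t *dst, const uint64_t *src)
{
    assert(dst != NULL && src != NULL);
    __m128 r = _mm_setzero_ps();
    r = _mm_loadh_pi(r, (const __m64 *)src);  /* MOVHPS xmm, mem64 */
    _mm_storeh_pi((__m64 *)dst, r);           /* MOVHPS mem64, xmm */
}

/* Hypothetical helper: keep two independent 64-bit values in ONE xmm
   register, which is the "additional 64-bit registers" point above. */
static void copy_two64_via_one_xmm(uint64_t *dst_a, uint64_t *dst_b,
                                   const uint64_t *src_a,
                                   const uint64_t *src_b)
{
    assert(dst_a != NULL && dst_b != NULL && src_a != NULL && src_b != NULL);
    __m128 r = _mm_setzero_ps();
    r = _mm_loadl_pi(r, (const __m64 *)src_a);  /* low half  <- *src_a */
    r = _mm_loadh_pi(r, (const __m64 *)src_b);  /* high half <- *src_b */
    _mm_storel_pi((__m64 *)dst_a, r);           /* *dst_a <- low half  */
    _mm_storeh_pi((__m64 *)dst_b, r);           /* *dst_b <- high half */
}
```

The bit pattern is moved unchanged, so it does not matter that the
instructions are nominally meant for pairs of floats.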


