This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/19530] MMX load intrinsic produces SSE superfluous instructions (movlps)
- From: "guardia at sympatico dot ca" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 29 Jan 2005 04:47:24 -0000
- Subject: [Bug target/19530] MMX load intrinsic produces SSE superfluous instructions (movlps)
- References: <20050119145614.19530.guardia@sympatico.ca>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Additional Comments From guardia at sympatico dot ca 2005-01-29 04:47 -------
Hmm, there apparently is a problem in the optimization stages. I cooked up
another snippet:
#include <mmintrin.h>

void moo(__m64 i, unsigned int *r)
{
    unsigned int tmp = __builtin_ia32_vec_ext_v2si (i, 0);
    *r = tmp;
}
With -O0 -mmmx we get:
movd %mm0, -4(%ebp)
movl 8(%ebp), %edx
movl -4(%ebp), %eax
movl %eax, (%edx)
Which with -O3 gets reduced to:
movl 8(%ebp), %eax
movd %mm0, (%eax)
Now, clearly GCC understands that "movd" does the same thing as "movl",
except that it operates on MMX registers rather than general-purpose
registers, which matters on an MMX-only machine. It should be able to do the
same with "movlps" and "movq", I think? If the optimization stages can work
this out, maybe we wouldn't need to rewrite the MMX/SSE1 support...
(BTW, a correction: when I said 200+ instructions to schedule, I meant per
function. I have a dozen such functions with 200+ instructions each, and that
isn't going to get any smaller.)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19530