This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args
- From: "scovich at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 11 Jul 2007 20:27:07 -0000
- Subject: [Bug target/32661] __builtin_ia32_vec_ext suboptimal for pointer/ref args
- References: <bug-32661-14600@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #6 from scovich at gmail dot com 2007-07-11 20:27 -------
(In reply to comment #5)
> SImode moves will be a bit harder, because shufps insn pattern is involved in
> the vector expansion.
IIRC, shufps takes 3 cycles on Core2
(http://www.agner.org/optimize/instruction_tables.pdf), even without the
operand type mismatch (does that still exist?). That's >=4 cycles.
Storing the vector to the stack and loading the desired entry would take <=4 cycles,
even without Intel's store-load optimizations, and I imagine the optimizer
would be able to deal with it better.
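
For concreteness, here is a rough sketch of the two routes being compared, written with SSE2 intrinsics rather than the builtin from the bug title. The function names extract2_shuffle and extract2_memory are made up for illustration and do not come from the bug report, and the choice of element 2 is arbitrary. Which instructions the compiler actually emits for either function depends on the GCC version and flags, so treat this as an approximation of the shapes under discussion, not as the code GCC generates.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in _MM_SHUFFLE */
#include <stdint.h>

/* Route 1 (hypothetical helper): broadcast element 2 with a float
 * shuffle (shufps) on integer data, then move lane 0 to a scalar.
 * The casts only reinterpret the bits; they model the "operand type
 * mismatch" of using a float shuffle on an integer vector. */
static int32_t extract2_shuffle(__m128i v)
{
    __m128 f = _mm_castsi128_ps(v);
    __m128 s = _mm_shuffle_ps(f, f, _MM_SHUFFLE(2, 2, 2, 2));
    return _mm_cvtsi128_si32(_mm_castps_si128(s));
}

/* Route 2 (hypothetical helper): store the whole vector to a stack
 * temporary and load the one element that is wanted. */
static int32_t extract2_memory(__m128i v)
{
    int32_t tmp[4];
    _mm_storeu_si128((__m128i *)tmp, v);
    return tmp[2];
}
```

Both functions return the same value for any input; the question raised above is only which one is cheaper on a given microarchitecture.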
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32661