This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/54349] _mm_cvtsi128_si64 unnecessary stores value at stack
- From: "neleai at seznam dot cz" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sat, 27 Apr 2013 01:06:45 +0000
- Subject: [Bug target/54349] _mm_cvtsi128_si64 unnecessary stores value at stack
- Auto-submitted: auto-generated
- References: <bug-54349-4 at http dot gcc dot gnu dot org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54349
--- Comment #4 from Ondrej Bilka <neleai at seznam dot cz> 2013-04-27 01:06:45 UTC ---
I found that the AMD Bulldozer optimization guide states that moves from an
XMM register to a GPR should be done directly: "
10.4 Moving Data Between General-Purpose and XMM/YMM Registers
When moving data from a GPR to an XMM register, use separate store and load
instructions to move the data first from the source register to a temporary
location in memory and then from memory into the destination register, taking
the memory latency into account when scheduling both stages of the load-store
sequence.
When moving data from an XMM register to a general-purpose register, use the
VMOVD instruction.
Whenever possible, use loads and stores of the same data length. (See 6.3,
"Store-to-Load Forwarding Restrictions" on page 98 for more information.)
"
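
For reference, here is a minimal sketch of the two move directions the guide
distinguishes, written with the SSE2 intrinsics. It assumes an x86-64 target.
The helper names xmm_to_gpr, gpr_to_xmm and check_round_trip are hypothetical
and only serve the illustration. The sketch makes no claim about which
instructions a given GCC version emits for these functions.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 intrinsics; x86-64 only */
#include <stdint.h>

/* XMM -> GPR: return the low 64 bits of v.
 * This is the direction the guide says should use a direct register move. */
int64_t xmm_to_gpr(__m128i v)
{
    return _mm_cvtsi128_si64(v);
}

/* GPR -> XMM: place x in the low 64 bits and zero the upper 64 bits.
 * This is the direction the guide says should go through memory. */
__m128i gpr_to_xmm(int64_t x)
{
    return _mm_cvtsi64_si128(x);
}

/* Hypothetical sanity check: a value survives a GPR -> XMM -> GPR round trip. */
void check_round_trip(int64_t x)
{
    assert(xmm_to_gpr(gpr_to_xmm(x)) == x);
}
```

Compiling this with something like "gcc -O2 -S" (optionally with a tuning
option such as -mtune=bdver1) and reading the assembly for xmm_to_gpr should
show whether the value is moved directly or stored to the stack first.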