This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
PATCH: PR target/49002: 128-bit AVX load incorrectly becomes 256-bit AVX load
- From: "H.J. Lu" <hongjiu dot lu at intel dot com>
- To: gcc-patches at gcc dot gnu dot org
- Cc: Uros Bizjak <ubizjak at gmail dot com>, Kirill Yukhin <kirill dot yukhin at intel dot com>
- Date: Wed, 18 May 2011 13:37:23 -0700
- Subject: PATCH: PR target/49002: 128-bit AVX load incorrectly becomes 256-bit AVX load
- Reply-to: "H.J. Lu" <hjl dot tools at gmail dot com>
Hi,
This patch fixes the splitter for the 128-bit to 256-bit AVX cast so
that a 128-bit load is no longer widened into a 256-bit load. OK for
trunk if there are no regressions? I will also prepare a patch for the
4.6 branch.
Thanks.
H.J.
--
gcc/
2011-05-18 H.J. Lu <hongjiu.lu@intel.com>
PR target/49002
* config/i386/sse.md (avx_<ssemodesuffix><avxsizesuffix>_<ssemodesuffix>):
Properly handle load cast.
gcc/testsuite/
2011-05-18 H.J. Lu <hongjiu.lu@intel.com>
PR target/49002
* gcc.target/i386/pr49002-1.c: New test.
* gcc.target/i386/pr49002-2.c: Likewise.
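For readers unfamiliar with the splitter, the sse.md change listed above can be modeled roughly in plain C. This is only a sketch: `struct operand`, `split_cast_old` and `split_cast_new` are hypothetical stand-ins, not GCC types or functions, and the model tracks nothing except whether each operand is a register or memory and how many bits the emitted move accesses.

```c
/* Hypothetical model of the cast splitter; none of these names exist in GCC. */
enum operand_kind { OPERAND_REG, OPERAND_MEM };

struct operand {
    enum operand_kind kind;
    int bits;                 /* width of the access the move will use */
};

struct move {
    struct operand dst;
    struct operand src;
};

/* Before the patch: the source is always viewed in the full 256-bit mode,
   whether it is a register (gen_rtx_REG) or memory (gen_lowpart).  */
static struct move split_cast_old(struct operand dst, struct operand src)
{
    src.bits = 256;
    struct move m = { dst, src };
    return m;
}

/* After the patch: a register destination is narrowed to the half-vector
   mode, so the source keeps its 128-bit width.  Otherwise the destination
   is memory and the source register is viewed in the full 256-bit mode.  */
static struct move split_cast_new(struct operand dst, struct operand src)
{
    if (dst.kind == OPERAND_REG)
        dst.bits = 128;
    else
        src.bits = 256;
    struct move m = { dst, src };
    return m;
}
```

In this model, a load from a 128-bit memory operand into a register becomes a 256-bit read under `split_cast_old` but stays 128 bits wide under `split_cast_new`, while a store from a register to 256-bit memory remains a 256-bit move.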
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 291bffb..cf12a6d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -10294,12 +10294,13 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  rtx op0 = operands[0];
   rtx op1 = operands[1];
-  if (REG_P (op1))
+  if (REG_P (op0))
+    op0 = gen_rtx_REG (<ssehalfvecmode>mode, REGNO (op0));
+  else
     op1 = gen_rtx_REG (<MODE>mode, REGNO (op1));
-  else
-    op1 = gen_lowpart (<MODE>mode, op1);
-  emit_move_insn (operands[0], op1);
+  emit_move_insn (op0, op1);
   DONE;
 })
diff --git a/gcc/testsuite/gcc.target/i386/pr49002-1.c b/gcc/testsuite/gcc.target/i386/pr49002-1.c
new file mode 100644
index 0000000..7553e82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr49002-1.c
@@ -0,0 +1,16 @@
+/* PR target/49002 */
+/* { dg-do compile } */
+/* { dg-options "-O -mavx" } */
+
+#include <immintrin.h>
+
+void foo(const __m128d *from, __m256d *to, int s)
+{
+  __m256d var = _mm256_castpd128_pd256(from[0]);
+  var = _mm256_insertf128_pd(var, from[s], 1);
+  to[0] = var;
+}
+
+/* Ensure we load into xmm, not ymm. */
+/* { dg-final { scan-assembler-not "vmovapd\[\t \]*\[^,\]*,\[\t \]*%ymm" } } */
+/* { dg-final { scan-assembler "vmovapd\[\t \]*\[^,\]*,\[\t \]*%xmm" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr49002-2.c b/gcc/testsuite/gcc.target/i386/pr49002-2.c
new file mode 100644
index 0000000..b0e1009
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr49002-2.c
@@ -0,0 +1,14 @@
+/* PR target/49002 */
+/* { dg-do compile } */
+/* { dg-options "-O -mavx" } */
+
+#include <immintrin.h>
+
+void foo(const __m128d from, __m256d *to)
+{
+  *to = _mm256_castpd128_pd256(from);
+}
+
+/* Ensure we store ymm, not xmm. */
+/* { dg-final { scan-assembler-not "vmovapd\[\t \]*%xmm\[0-9\]\+,\[^,\]*" } } */
+/* { dg-final { scan-assembler "vmovapd\[\t \]*%ymm\[0-9\]\+,\[^,\]*" } } */
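To see what the dg-final patterns in the two tests accept, the sketch below applies them to sample assembly lines using POSIX `regcomp`/`regexec`. It assumes the patterns behave the same as POSIX extended regular expressions once the Tcl backslash escapes are removed. The helper `asm_line_matches` and the pattern names are hypothetical and not part of the testsuite, and any assembly lines fed to it are illustrative rather than actual compiler output.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Patterns from pr49002-1.c with the Tcl escapes removed (assumed equivalent). */
static const char LOAD_TO_YMM[] = "vmovapd[\t ]*[^,]*,[\t ]*%ymm";
static const char LOAD_TO_XMM[] = "vmovapd[\t ]*[^,]*,[\t ]*%xmm";

/* Patterns from pr49002-2.c with the Tcl escapes removed (assumed equivalent). */
static const char STORE_FROM_XMM[] = "vmovapd[\t ]*%xmm[0-9]+,[^,]*";
static const char STORE_FROM_YMM[] = "vmovapd[\t ]*%ymm[0-9]+,[^,]*";

/* Hypothetical helper: true if `pattern` matches anywhere in `line`.
   Returns false if the pattern fails to compile.  */
static bool asm_line_matches(const char *pattern, const char *line)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    bool hit = regexec(&re, line, 0, NULL, 0) == 0;
    regfree(&re);
    return hit;
}
```

For example, a line such as `vmovapd (%rdi), %xmm0` matches `LOAD_TO_XMM` but not `LOAD_TO_YMM`, which is the combination the first test requires. A line such as `vmovapd %ymm0, (%rdi)` matches `STORE_FROM_YMM` but not `STORE_FROM_XMM`, as the second test requires.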