This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[PATCH] Fix PR84278
- From: Richard Biener <rguenther at suse dot de>
- To: gcc-patches at gcc dot gnu dot org
- Date: Thu, 8 Feb 2018 10:47:25 +0100 (CET)
- Subject: [PATCH] Fix PR84278
Noticed while (still...) working on PR84038. The vectorizer happily
tries to construct a V4SFmode vector from two V2SFmode vectors because
there's an optab handler for it. But it failed to check whether
V2SFmode is supported by the target, and RTL expansion later uses
TYPE_MODE to get at the element mode, which ends up as BLKmode, and
thus we go through the stack...
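To make the failure mode concrete, here is a toy model of the check, not the actual vectorizer code. All names (toy_mode, toy_have_vec_init, toy_vector_mode_supported_p, toy_type_mode, toy_can_compose) are hypothetical stand-ins, and the assumption that only V4SF is supported while a V4SF-from-V2SF handler exists is hard-coded to mirror the case described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few machine modes; not GCC's real enum.  */
enum toy_mode { TOY_SF, TOY_V2SF, TOY_V4SF, TOY_BLK };

/* Assumed for this sketch: a pattern exists that builds a V4SF vector
   from V2SF (or SF) pieces, so the optab-style query succeeds.  */
static bool
toy_have_vec_init (enum toy_mode vec, enum toy_mode piece)
{
  return vec == TOY_V4SF && (piece == TOY_V2SF || piece == TOY_SF);
}

/* Assumed for this sketch: only V4SF is a supported vector mode.  */
static bool
toy_vector_mode_supported_p (enum toy_mode mode)
{
  return mode == TOY_V4SF;
}

/* Mode a vector type ends up with in this model: an unsupported
   vector mode degrades to the block mode.  */
static enum toy_mode
toy_type_mode (enum toy_mode mode)
{
  bool is_vector = mode == TOY_V2SF || mode == TOY_V4SF;
  if (is_vector && !toy_vector_mode_supported_p (mode))
    return TOY_BLK;
  return mode;
}

/* Decide whether VEC may be composed from PIECE-mode vectors.
   CHECK_SUPPORTED selects whether target support is also required.  */
static bool
toy_can_compose (enum toy_mode vec, enum toy_mode piece, bool check_supported)
{
  if (check_supported && !toy_vector_mode_supported_p (piece))
    return false;
  return toy_have_vec_init (vec, piece);
}

/* Walk through the scenario; aborts if the model disagrees.  */
void
toy_demo (void)
{
  /* A handler-only check accepts V2SF pieces...  */
  assert (toy_can_compose (TOY_V4SF, TOY_V2SF, false));
  /* ...although the piece type would end up with block mode.  */
  assert (toy_type_mode (TOY_V2SF) == TOY_BLK);
  /* Requiring target support as well rejects the choice.  */
  assert (!toy_can_compose (TOY_V4SF, TOY_V2SF, true));
}
```

In this model the handler query alone is not enough: the piece mode passes it, yet its type mode is the block mode, which is the situation that leads to the stack round trip.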
So this makes the vectorizer test targetm.vector_mode_supported_p
as well before making use of such types. In the above case the
vectorizer then resorts to using two DImode scalars instead.
I've verified that's still faster than doing four SFmode scalar
loads despite whatever reformatting penalty that might occur.
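As a source-level illustration only (this is not what the vectorizer emits), the following sketch shows why moving each adjacent pair of floats as one 64-bit integer is equivalent to two separate float copies. The function names copy_pairs_sf and copy_pairs_di are made up for the example, and it assumes a 4-byte float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: two floats fill one 64-bit integer.  */
static_assert (sizeof (float) * 2 == sizeof (uint64_t),
               "sketch assumes 4-byte float");

/* Reference: copy each adjacent pair element by element, the
   source-level analogue of separate SFmode accesses.  */
void
copy_pairs_sf (float *dst, const float *src, int n, int stride)
{
  for (int i = 0; i < n; i++)
    {
      dst[i * 2 + 0] = src[i * stride + 0];
      dst[i * 2 + 1] = src[i * stride + 1];
    }
}

/* Same data movement, but each adjacent pair travels as a single
   64-bit integer, the source-level analogue of a DImode access.
   memcpy keeps the type punning well defined.  */
void
copy_pairs_di (float *dst, const float *src, int n, int stride)
{
  for (int i = 0; i < n; i++)
    {
      uint64_t pair;
      memcpy (&pair, &src[i * stride], sizeof pair);
      memcpy (&dst[i * 2], &pair, sizeof pair);
    }
}
```

Both functions leave identical bytes in dst for the same arguments; the second simply halves the number of memory accesses per iteration.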
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
For PR84038 this makes a difference when compiling with
-mprefer-avx128 -fno-vect-cost-model.
Richard.
2018-02-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/84278
* tree-vect-stmts.c (vectorizable_store): When looking for
smaller vector types to perform grouped strided loads/stores
make sure the mode is supported by the target.
(vectorizable_load): Likewise.
* gcc.target/i386/pr84278.c: New testcase.
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c (revision 257477)
+++ gcc/tree-vect-stmts.c (working copy)
@@ -6510,6 +6558,7 @@ vectorizable_store (gimple *stmt, gimple
machine_mode vmode;
if (!mode_for_vector (elmode, group_size).exists (&vmode)
|| !VECTOR_MODE_P (vmode)
+ || !targetm.vector_mode_supported_p (vmode)
|| (convert_optab_handler (vec_extract_optab,
TYPE_MODE (vectype), vmode)
== CODE_FOR_nothing))
@@ -6528,6 +6577,7 @@ vectorizable_store (gimple *stmt, gimple
element size stores. */
if (mode_for_vector (elmode, lnunits).exists (&vmode)
&& VECTOR_MODE_P (vmode)
+ && targetm.vector_mode_supported_p (vmode)
&& (convert_optab_handler (vec_extract_optab,
vmode, elmode)
!= CODE_FOR_nothing))
@@ -7573,6 +7633,7 @@ vectorizable_load (gimple *stmt, gimple_
machine_mode vmode;
if (mode_for_vector (elmode, group_size).exists (&vmode)
&& VECTOR_MODE_P (vmode)
+ && targetm.vector_mode_supported_p (vmode)
&& (convert_optab_handler (vec_init_optab,
TYPE_MODE (vectype), vmode)
!= CODE_FOR_nothing))
@@ -7598,6 +7659,7 @@ vectorizable_load (gimple *stmt, gimple_
element loads of the original vector type. */
if (mode_for_vector (elmode, lnunits).exists (&vmode)
&& VECTOR_MODE_P (vmode)
+ && targetm.vector_mode_supported_p (vmode)
&& (convert_optab_handler (vec_init_optab, vmode, elmode)
!= CODE_FOR_nothing))
{
Index: gcc/testsuite/gcc.target/i386/pr84278.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr84278.c (nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr84278.c (working copy)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -msse2" } */
+
+float A[1024];
+float B[1024];
+int s;
+
+void foo(void)
+{
+  int i;
+  for (i = 0; i < 128; i++)
+    {
+      B[i*2+0] = A[i*s+0];
+      B[i*2+1] = A[i*s+1];
+    }
+}
+
+/* { dg-final { scan-assembler-not "\(%.sp\)" } } */