This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] Fix PR61680
- From: Richard Biener <rguenther at suse dot de>
- To: gcc-patches at gcc dot gnu dot org
- Date: Mon, 7 Jul 2014 16:22:01 +0200 (CEST)
- Subject: [PATCH] Fix PR61680
- Authentication-results: sourceware.org; auth=none
The following fixes PR61680.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
Richard.
2014-07-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/61680
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence):
Handle properly all read-write dependences with group accesses.
* gcc.dg/vect/pr61680.c: New testcase.
Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c (revision 212323)
+++ gcc/tree-vect-data-refs.c (working copy)
@@ -375,11 +397,14 @@ vect_analyze_data_ref_dependence (struct
.. = a[i+1];
where we will end up loading { a[i], a[i+1] } once, make
sure that inserting group loads before the first load and
- stores after the last store will do the right thing. */
- if ((STMT_VINFO_GROUPED_ACCESS (stmtinfo_a)
- && GROUP_SAME_DR_STMT (stmtinfo_a))
- || (STMT_VINFO_GROUPED_ACCESS (stmtinfo_b)
- && GROUP_SAME_DR_STMT (stmtinfo_b)))
+ stores after the last store will do the right thing.
+ Similar for groups like
+ a[i] = ...;
+ ... = a[i];
+ a[i+1] = ...;
+ where loads from the group interleave with the store. */
+ if (STMT_VINFO_GROUPED_ACCESS (stmtinfo_a)
+ || STMT_VINFO_GROUPED_ACCESS (stmtinfo_b))
{
gimple earlier_stmt;
earlier_stmt = get_earlier_stmt (DR_STMT (dra), DR_STMT (drb));
Index: gcc/testsuite/gcc.dg/vect/pr61680.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/pr61680.c (revision 0)
+++ gcc/testsuite/gcc.dg/vect/pr61680.c (working copy)
@@ -0,0 +1,51 @@
+/* { dg-do run } */
+
+#include "tree-vect.h"
+
+double v[4096][4];
+
+__attribute__((noinline, noclone)) void
+bar (double p[][4])
+{
+ int i;
+ double d = 172.0;
+ for (i = 0; i < 4096; i++)
+ {
+ if (p[i][0] != 6.0 || p[i][1] != 6.0 || p[i][2] != 10.0)
+ __builtin_abort ();
+ if (__builtin_fabs (p[i][3] - d) > 0.25)
+ __builtin_abort ();
+ }
+}
+
+__attribute__((noinline, noclone)) void
+foo (void)
+{
+ int i;
+ double w[4096][4], t;
+ for (i = 0; i < 4096; i++)
+ {
+ w[i][0] = v[i][0] + 2.0;
+ w[i][1] = v[i][1] + 1.0;
+ w[i][2] = v[i][2] + 4.0;
+ w[i][3] = (w[i][0] * w[i][0] + w[i][1] * w[i][1] + w[i][2] * w[i][2]);
+ }
+ bar (w);
+}
+
+int
+main ()
+{
+ int i;
+
+ check_vect ();
+
+ for (i = 0; i < 4096; i++)
+ {
+ v[i][0] = 4.0;
+ v[i][1] = 5.0;
+ v[i][2] = 6.0;
+ }
+ foo ();
+ return 0;
+}