[hsa 7/12] Disabling the vectorizer for GPU kernels/functions
Martin Jambor
mjambor@suse.cz
Thu Nov 5 22:01:00 GMT 2015
Hi,
in the previous email I wrote that we need to "change the behavior" of
a few optimization passes. One change was the flattening of GPU
functions; the other two are in the patch below. What it comes down to
is that, at the moment, we need to switch off the vectorizer (only for
GPU functions, of course).
We are actually quite close to being able to handle gimple vector
input in the HSA back-end, but we are not all the way there yet. Before
allowing the vectorizer again, we will have to make sure it never
produces vectors bigger than 128 bits (in GPU functions).
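For illustration only, the effect of the gate change can be sketched outside of GCC. The snippet below is a minimal stand-alone model, not the real code: mock_function and its gpu_implementation field are hypothetical stand-ins for GCC's function structure and for the result of hsa_gpu_implementation_p (fun->decl), and the flag is passed as a parameter rather than read from the global flag_tree_loop_vectorize.

```cpp
// Hypothetical stand-ins for GCC internals; none of these are real GCC
// declarations.
struct mock_function
{
  bool has_force_vectorize_loops; // mirrors function::has_force_vectorize_loops
  bool gpu_implementation;        // stands in for hsa_gpu_implementation_p (fun->decl)
};

// Models the new pass_vectorize::gate: the pass runs only when it is
// requested AND the function is not a GPU implementation.
constexpr bool
loop_vectorize_gate (bool flag_loop_vectorize, const mock_function &fun)
{
  return (flag_loop_vectorize || fun.has_force_vectorize_loops)
         && !fun.gpu_implementation;
}

// Host function with the flag on: the vectorizer still runs.
static_assert (loop_vectorize_gate (true, mock_function{false, false}),
               "host function should be vectorized");
// GPU function: the vectorizer is skipped, even for force-vectorized loops.
static_assert (!loop_vectorize_gate (true, mock_function{true, true}),
               "GPU function should not be vectorized");
```

The point of the sketch is that the GPU check is and-ed with the whole existing condition, so it also overrides has_force_vectorize_loops.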
Thanks,
Martin
2015-11-05 Martin Jambor <mjambor@suse.cz>
* tree-ssa-loop.c: Include cgraph.h, symbol-summary.h and hsa.h.
(pass_vectorize::gate): Do not run on HSA functions.
* tree-vectorizer.c: Include symbol-summary.h and hsa.h.
(pass_slp_vectorize::gate): Do not run on HSA functions.
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 8ecd140..0d119e2 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -35,6 +35,9 @@ along with GCC; see the file COPYING3. If not see
#include "tree-inline.h"
#include "tree-scalar-evolution.h"
#include "tree-vectorizer.h"
+#include "cgraph.h"
+#include "symbol-summary.h"
+#include "hsa.h"
/* A pass making sure loops are fixed up. */
@@ -257,7 +260,8 @@ public:
/* opt_pass methods: */
virtual bool gate (function *fun)
{
- return flag_tree_loop_vectorize || fun->has_force_vectorize_loops;
+ return (flag_tree_loop_vectorize || fun->has_force_vectorize_loops)
+ && !hsa_gpu_implementation_p (fun->decl);
}
virtual unsigned int execute (function *);
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index b80a8dd..366138c 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -75,6 +75,8 @@ along with GCC; see the file COPYING3. If not see
#include "tree-ssa-propagate.h"
#include "dbgcnt.h"
#include "tree-scalar-evolution.h"
+#include "symbol-summary.h"
+#include "hsa.h"
/* Loop or bb location. */
@@ -675,7 +677,10 @@ public:
/* opt_pass methods: */
opt_pass * clone () { return new pass_slp_vectorize (m_ctxt); }
- virtual bool gate (function *) { return flag_tree_slp_vectorize != 0; }
+ virtual bool gate (function *fun)
+ {
+ return flag_tree_slp_vectorize && !hsa_gpu_implementation_p (fun->decl);
+ }
virtual unsigned int execute (function *);
}; // class pass_slp_vectorize