This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH][1/5] Add param parloops-chunk-size
- From: Tom de Vries <Tom_deVries at mentor dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Biener <rguenther at suse dot de>
- Date: Mon, 31 Aug 2015 13:45:25 +0200
- Subject: [PATCH][1/5] Add param parloops-chunk-size
- References: <552E6341 dot 4040401 at mentor dot com> <55E43D5D dot 5020300 at mentor dot com>
On 31/08/15 13:41, Tom de Vries wrote:
On 15/04/15 15:10, Tom de Vries wrote:
Hi,
This patch series fixes PR65637.
Currently, ssa-handling code in expand_omp_for_static_chunk is dead and
not exercised by testing.
Ssa-handling code in omp-low.c is only triggered by
pass_parallelize_loops, and that pass doesn't specify a chunk size on
the GIMPLE_OMP_FOR it constructs, so that only exercises the
expand_omp_for_static_nochunk path.
Using the attached trigger patch, we exercise the ssa-handling code in
expand_omp_for_static_chunk.
>
> 1. Fix gcc_assert in expand_omp_for_static_chunk
> 2. Fix inner loop phi in expand_omp_for_static_chunk
> 3. Handle 2 preds for fin_bb in expand_omp_for_static_chunk
I'm posting an updated series.
1. Add param parloops-chunk-size
2. Handle simple latch bb in expand_omp_for_static_chunk
3. Fix gcc_assert in expand_omp_for_static_chunk
4. Fix inner loop phi in expand_omp_for_static_chunk
5. Handle 2 preds for fin_bb in expand_omp_for_static_chunk
There are two new patches, (1) and (2) in the new numbering.
The first patch adds a param parloops-chunk-size, which means the
ssa-handling code in expand_omp_for_static_chunk is no longer dead.
The second patch handles simple latches in expand_omp_for_static_chunk,
similar to the fix for PR66846 in expand_omp_for_static_nochunk.
The rest of the patches are now updated to include the testcases (and
patch number 4 has been updated to handle simple latches).
The patch series has been bootstrapped and reg-tested on x86_64.
I'll post the patches from the patch series individually. The first two
in response to this email, the latter three in response to the earlier
submissions.
Hi,
this patch adds a param parloops-chunk-size.
The param is used to set the chunk-size of the schedule of omp-for loops
generated by parloops.
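
For illustration, here is a minimal sketch of how a static schedule maps iterations to threads with and without a chunk size. It is not the code from omp-low.c. The helper names owner_static_chunk and owner_static_nochunk are hypothetical, and the even split used in the no-chunk case is only one possible way of dividing the iterations.

```c
#include <assert.h>

/* Hypothetical helper: thread that runs iteration I under
   schedule(static, CHUNK) with NTHREADS threads.  Chunks of CHUNK
   consecutive iterations are handed out round-robin.  */
int
owner_static_chunk (long i, long chunk, int nthreads)
{
  assert (i >= 0 && chunk > 0 && nthreads > 0);
  return (int) ((i / chunk) % nthreads);
}

/* Hypothetical helper: thread that runs iteration I of N under
   schedule(static) without a chunk size.  Assumption: each thread gets
   one contiguous block of ceil (N / NTHREADS) iterations.  */
int
owner_static_nochunk (long i, long n, int nthreads)
{
  assert (i >= 0 && i < n && nthreads > 0);
  long block = (n + nthreads - 1) / nthreads;
  return (int) (i / block);
}
```

With the param left at its default of 0 no chunk size is set on the schedule clause, which corresponds to the second helper. A non-zero value corresponds to the first.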
Thanks,
- Tom
Add param parloops-chunk-size
2015-08-31 Tom de Vries <tom@codesourcery.com>
* doc/invoke.texi (parloops-chunk-size): Add item.
* params.def (PARAM_PARLOOPS_CHUNK_SIZE): Add DEFPARAM.
* tree-parloops.c: Include params.h.
(create_parallel_loop): Set chunk-size of schedule of omp-for loop, if
param parloops-chunk-size is used.
---
gcc/doc/invoke.texi | 4 ++++
gcc/params.def | 5 +++++
gcc/tree-parloops.c | 5 +++++
3 files changed, 14 insertions(+)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c0ec0fd..6dd144d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11000,6 +11000,10 @@ path. The default is 10.
Maximum number of new jump thread paths to create for a finite state
automaton. The default is 50.
+@item parloops-chunk-size
+Chunk size of omp schedule for loops parallelized by parloops. The default
+is 0.
+
@end table
@end table
diff --git a/gcc/params.def b/gcc/params.def
index c8b3a90..11238cb 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1135,6 +1135,11 @@ DEFPARAM (PARAM_MAX_FSM_THREAD_PATHS,
           "max-fsm-thread-paths",
           "Maximum number of new jump thread paths to create for a finite state automaton",
           50, 1, 999999)
+
+DEFPARAM (PARAM_PARLOOPS_CHUNK_SIZE,
+          "parloops-chunk-size",
+          "Chunk size of omp schedule for loops parallelized by parloops",
+          0, 0, 0)
/*
Local variables:
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index d017479..c164121 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3. If not see
#include "tree-nested.h"
#include "cgraph.h"
#include "tree-ssa.h"
+#include "params.h"
/* This pass tries to distribute iterations of loops into several threads.
The implementation is straightforward -- for each loop we test whether its
@@ -2092,6 +2093,10 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
   type = TREE_TYPE (cvar);
   t = build_omp_clause (loc, OMP_CLAUSE_SCHEDULE);
   OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC;
+  int chunk_size = PARAM_VALUE (PARAM_PARLOOPS_CHUNK_SIZE);
+  if (chunk_size != 0)
+    OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (t)
+      = build_int_cst (integer_type_node, chunk_size);
   for_stmt = gimple_build_omp_for (NULL, GF_OMP_FOR_KIND_FOR, t, 1, NULL);
   gimple_set_location (for_stmt, loc);
--
1.9.1
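
As a rough sketch of how the new param might be exercised, a loop of the following shape is the kind of candidate parloops could pick up. The array size, the function names and the chunk size of 100 in the comment are made up for illustration. This is not one of the testcases from the series, and whether the pass actually parallelizes the loop depends on its own analysis.

```c
#include <assert.h>

/* Possible invocation (an assumption, not taken from the series):
     gcc -O2 -ftree-parallelize-loops=2 --param parloops-chunk-size=100  */

#define N 1000

unsigned int a[N], b[N], c[N];

/* Fill the inputs with known values so the result can be checked.  */
void
init_arrays (void)
{
  for (int i = 0; i < N; i++)
    {
      a[i] = i;
      b[i] = 2 * i;
    }
}

/* Independent iterations with no loop-carried dependence: a candidate
   for auto-parallelization.  */
void
add_arrays (void)
{
  for (int i = 0; i < N; i++)
    c[i] = a[i] + b[i];
}
```

The result must be the same whichever schedule is chosen, so a run-time check of c after add_arrays is enough to catch wrong code from the chunked expansion.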