[PATCH, v2, OpenMP 5.0] Implement relaxation of implicit map vs. existing device mappings (for mainline trunk)
Chung-Lin Tang
cltang@codesourcery.com
Fri Nov 5 16:51:59 GMT 2021
Hi Jakub,
On 2021/6/24 11:55 PM, Jakub Jelinek wrote:
> On Fri, May 14, 2021 at 09:20:25PM +0800, Chung-Lin Tang wrote:
>> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
>> index e790f08b23f..69c4a8e0a0a 100644
>> --- a/gcc/gimplify.c
>> +++ b/gcc/gimplify.c
>> @@ -10374,6 +10374,7 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
>> gcc_unreachable ();
>> }
>> OMP_CLAUSE_SET_MAP_KIND (clause, kind);
>> + OMP_CLAUSE_MAP_IMPLICIT_P (clause) = 1;
>> if (DECL_SIZE (decl)
>> && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST)
>> {
>
> As Thomas mentioned, there is now also OMP_CLAUSE_MAP_IMPLICIT that means
> something different:
> /* Nonzero on map clauses added implicitly for reduction clauses on combined
> or composite constructs. They shall be removed if there is an explicit
> map clause. */
> Having OMP_CLAUSE_MAP_IMPLICIT and OMP_CLAUSE_MAP_IMPLICIT_P would be too
> confusing. So either we need to use just one flag for both purposes or
> have two different flags and find a better name for one of them.
> The former would be possible if no OMP_CLAUSE_MAP clauses added by the FEs
> are implicit - then you could clear OMP_CLAUSE_MAP_IMPLICIT in
> gimplify_scan_omp_clauses. I wonder if it is the case though, e.g. doesn't
> your "Improve OpenMP target support for C++ [PR92120 v4]" patch add a lot of
> such implicit map clauses (e.g. the this[:1] and various others)?
I have changed the name to OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P, to signal that
this bit is to be passed to the runtime. Right now its intended to be used by
clauses created by the middle-end, but front-end uses like that for C++ could
be clarified later.
> Also, gimplify_adjust_omp_clauses_1 sometimes doesn't add just one map
> clause, but several, shouldn't those be marked implicit too? And similarly
> it calls lang_hooks.decls.omp_finish_clause which can add even further map
> clauses implicitly, shouldn't those be implicit too (in that case copy
> the flag from the clause it is called on to the extra clauses it adds)?
>
> Also as Thomas mentioned, it should be restricted to non-OpenACC,
> it can check gimplify_omp_ctxp->region_type if it is OpenMP or OpenACC.
Agreed, I've adjusted the patch to only to this implicit setting for OpenMP.
This reduces a lot of the originally needed scan test adjustment for existing OpenACC testcases.
>> @@ -10971,9 +10972,15 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
>> list_p = &OMP_CLAUSE_CHAIN (c);
>> }
>>
>> - /* Add in any implicit data sharing. */
>> + /* Add in any implicit data sharing. Implicit clauses are added at the start
>
> Two spaces after dot in comments.
Done.
>> + of the clause list, but after any non-map clauses. */
>> struct gimplify_adjust_omp_clauses_data data;
>> - data.list_p = list_p;
>> + tree *implicit_add_list_p = orig_list_p;
>> + while (*implicit_add_list_p
>> + && OMP_CLAUSE_CODE (*implicit_add_list_p) != OMP_CLAUSE_MAP)
>> + implicit_add_list_p = &OMP_CLAUSE_CHAIN (*implicit_add_list_p);
>
> Why are the implicit map clauses added first and not last?
As I also explained in the first submission email, due to the processing order,
if implicit classes are added last (and processed last), for example:
#pragma omp target map(tofrom: var.ptr[:N]) map(tofrom: var[implicit])
{
// access of var.ptr[]
}
The explicit var.ptr[:N] will not find anything to map, because the (implicit) map(var) has not been seen yet,
and the assumed array section attachment behavior will fail.
Only an order like: map(tofrom: var[implicit]) map(tofrom: var.ptr[:N]) will the usual assumed behavior show.
And yes, this depends on the new behavior implemented by patch [1], which I still need you to review.
e.g. for map(var.ptr[:N]), the proper behavior should *only* map the array section but NOT the base-pointer.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571195.html
> There is also the OpenMP 5.1 [352:17-22] case which basically says that the
> implicit mappings should be ignored if there are explicit ones on the same
> construct (though, do we really create implicit clauses in that case?).
Implicit clauses do not appear to be created if there's an explicit clause already existing.
>> +#define GOMP_MAP_IMPLICIT (GOMP_MAP_FLAG_SPECIAL_3 \
>> + | GOMP_MAP_FLAG_SPECIAL_4)
>> +/* Mask for entire set of special map kind bits. */
>> +#define GOMP_MAP_FLAG_SPECIAL_BITS (GOMP_MAP_FLAG_SPECIAL_0 \
>> + | GOMP_MAP_FLAG_SPECIAL_1 \
>> + | GOMP_MAP_FLAG_SPECIAL_2 \
>> + | GOMP_MAP_FLAG_SPECIAL_3 \
>> + | GOMP_MAP_FLAG_SPECIAL_4)
...
>> +#define GOMP_MAP_IMPLICIT_P(X) \
>> + (((X) & GOMP_MAP_FLAG_SPECIAL_BITS) == GOMP_MAP_IMPLICIT)
>
> I think here we need to decide with which GOMP_MAP* kinds the implicit
> bit will need to be combined with, with looking forward into what features
> we still need to implement for OpenMP 5.0/5.1 (not aware of anything in 5.2
> that would need special care but perhaps I've missed it).
>
> E.g. for declare mapper those single OMP_CLAUSE_MAPs with the implicit
> bit might need to be split into various smaller ones, map this FIELD_DECL,
> map that other FIELD_DECL, what it points to, etc. Even without declare
> mapper, the spec now says that mapping a structure is as if each member is
> mapped separately and 5.2 is going to say there is a predefined declare
> mapper that does that (which is of course something we don't want under the
> hood, we don't want thousands of maps, but virtually split it on a field by
> field basis, do all the special stuff - e.g. how pointers are to be mapped
> as zero array sections with pointer attachment, how C++ references are to be
> handled etc. and then virtually coalesce all the adjacent fields that have
> the same treatment back). I think that actually means we should defer most
> of the map struct etc. handling we do in gimplify_scan_omp_clauses, instead
> note the explicit map clauses and what they refer to, add implicit ones (and
> set implicit bit on those), then go over all clauses (explicit and
> implicit), do the declare mapper processing, do the sorting of what that
> produces based on base expressions/pointers etc.
> At this point, I'm not sure if GOMP_MAP_IMPLICIT can or can't appear
> together e.g. with GOMP_MAP_STRUCT.
Currently with the place of setting OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P in
gimplify_adjust_omp_clauses_1, only things like GOMP_MAP_TO/FROM/FORCE_PRESENT/etc.
be set.
I already had some trouble designating how GOMP_MAP_IMPLICIT would be encoded
within the current bits. My guess is that with more sophisticated features like
declare mappers, the whole interface probably needs to be extended somewhat.
>> @@ -405,8 +422,24 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep,
>> static int
>> get_kind (bool short_mapkind, void *kinds, int idx)
>> {
>> - return short_mapkind ? ((unsigned short *) kinds)[idx]
>> - : ((unsigned char *) kinds)[idx];
>> + int val = (short_mapkind
>> + ? ((unsigned short *) kinds)[idx]
>> + : ((unsigned char *) kinds)[idx]);
>> +
>> + if (GOMP_MAP_IMPLICIT_P (val))
>> + val &= ~GOMP_MAP_IMPLICIT;
>
> As the particular bit isn't used for anything right now,
> perhaps just do val &= ~GOMP_MAP_IMPLICIT unconditionally?
GOMP_MAP_IMPLICIT is (GOMP_MAP_FLAG_SPECIAL_3 | GOMP_MAP_FLAG_SPECIAL_4),
and those two bits are also partially used for some other things.
So an specific check should be needed.
> But, on the other side, for !short_mapkind you do not want
> to mask that bit out, only the low 3 bits are the mapping
> type and the upper bits are log2 of alignment, so the
> above for !short_mapkind would in weird way change some
> alignments.
I see, I've added a short_mapkind check in get_kind.
>> + return val;
>> +}
>> +
>> +
>> +static bool
>> +get_implicit (bool short_mapkind, void *kinds, int idx)
>> +{
>> + int val = (short_mapkind
>> + ? ((unsigned short *) kinds)[idx]
>> + : ((unsigned char *) kinds)[idx]);
>> +
>> + return GOMP_MAP_IMPLICIT_P (val);
>
> Similarly can return 0 for !short_mapkind, the compatibility GOMP_target etc.
> APIs will never have those in.
This too added.
Re-tested for trunk without regressions, is this okay now?
Thanks,
Chung-Lin
2021-11-05 Chung-Lin Tang <cltang@codesourcery.com>
include/ChangeLog:
* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define special bit macro.
(GOMP_MAP_IMPLICIT): New special map kind bits value.
(GOMP_MAP_FLAG_SPECIAL_BITS): Define helper mask for whole set of
special map kind bits.
(GOMP_MAP_IMPLICIT_P): New predicate macro for implicit map kinds.
gcc/ChangeLog:
* tree.h (OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P): New access macro for
'implicit' bit, using 'base.deprecated_flag' field of tree_node.
* tree-pretty-print.c (dump_omp_clause): Add support for printing
implicit attribute in tree dumping.
* gimplify.c (gimplify_adjust_omp_clauses_1):
Set OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P to 1 if map clause is implicitly
created.
(gimplify_adjust_omp_clauses): Adjust place of adding implicitly created
clauses, from simple append, to starting of list, after non-map clauses.
* omp-low.c (lower_omp_target): Add GOMP_MAP_IMPLICIT bits into kind
values passed to libgomp for implicit maps.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/target-implicit-map-1.c: New test.
* c-c++-common/goacc/combined-reduction.c: Adjust scan test pattern.
* c-c++-common/goacc/firstprivate-mappings-1.c: Likewise.
* c-c++-common/goacc/mdc-1.c: Likewise.
* g++.dg/goacc/firstprivate-mappings-1.C: Likewise.
libgomp/ChangeLog:
* target.c (gomp_map_vars_existing): Add 'bool implicit' parameter, add
implicit map handling to allow a "superset" existing map as valid case.
(get_kind): Adjust to filter out GOMP_MAP_IMPLICIT bits in return value.
(get_implicit): New function to extract implicit status.
(gomp_map_fields_existing): Adjust arguments in calls to
gomp_map_vars_existing, and add uses of get_implicit.
(gomp_map_vars_internal): Likewise.
* testsuite/libgomp.c-c++-common/target-implicit-map-1.c: New test.
-------------- next part --------------
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index d8e4b139349..59e47bf2ade 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -10861,6 +10861,10 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
gcc_unreachable ();
}
OMP_CLAUSE_SET_MAP_KIND (clause, kind);
+ /* Setting of the implicit flag for the runtime is currently disabled for
+ OpenACC. */
+ if ((gimplify_omp_ctxp->region_type & ORT_ACC) == 0)
+ OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P (clause) = 1;
if (DECL_SIZE (decl)
&& TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST)
{
@@ -11476,9 +11480,15 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
list_p = &OMP_CLAUSE_CHAIN (c);
}
- /* Add in any implicit data sharing. */
+ /* Add in any implicit data sharing. Implicit clauses are added at the start
+ of the clause list, but after any non-map clauses. */
struct gimplify_adjust_omp_clauses_data data;
- data.list_p = list_p;
+ tree *implicit_add_list_p = orig_list_p;
+ while (*implicit_add_list_p
+ && OMP_CLAUSE_CODE (*implicit_add_list_p) != OMP_CLAUSE_MAP)
+ implicit_add_list_p = &OMP_CLAUSE_CHAIN (*implicit_add_list_p);
+
+ data.list_p = implicit_add_list_p;
data.pre_p = pre_p;
splay_tree_foreach (ctx->variables, gimplify_adjust_omp_clauses_1, &data);
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 15e4424b0bc..3d58a6d35e6 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -13153,6 +13153,19 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
else if (integer_nonzerop (s))
tkind_zero = tkind;
}
+ if (tkind_zero == tkind
+ && OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P (c)
+ && (((tkind & GOMP_MAP_FLAG_SPECIAL_BITS)
+ & ~GOMP_MAP_IMPLICIT)
+ == 0))
+ {
+ /* If this is an implicit map, and the GOMP_MAP_IMPLICIT
+ bits are not interfered by other special bit encodings,
+ then turn the GOMP_IMPLICIT_BIT flag on for the runtime
+ to see. */
+ tkind |= GOMP_MAP_IMPLICIT;
+ tkind_zero = tkind;
+ }
break;
case OMP_CLAUSE_FIRSTPRIVATE:
gcc_checking_assert (is_gimple_omp_oacc (ctx->stmt));
diff --git a/gcc/testsuite/c-c++-common/goacc/combined-reduction.c b/gcc/testsuite/c-c++-common/goacc/combined-reduction.c
index ecf23f59d66..74ab05bc856 100644
--- a/gcc/testsuite/c-c++-common/goacc/combined-reduction.c
+++ b/gcc/testsuite/c-c++-common/goacc/combined-reduction.c
@@ -23,7 +23,7 @@ main ()
return 0;
}
-/* { dg-final { scan-tree-dump-times "omp target oacc_parallel reduction.+:v1. map.tofrom:v1" 1 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "omp target oacc_parallel reduction.+:v1. firstprivate.n. map.tofrom:v1" 1 "gimple" } } */
/* { dg-final { scan-tree-dump-times "acc loop reduction.+:v1. private.i." 1 "gimple" } } */
/* { dg-final { scan-tree-dump-times "omp target oacc_kernels map.force_tofrom:n .len: 4.. map.force_tofrom:v1 .len: 4.." 1 "gimple" } } */
/* { dg-final { scan-tree-dump-times "acc loop reduction.+:v1. private.i." 1 "gimple" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/firstprivate-mappings-1.c b/gcc/testsuite/c-c++-common/goacc/firstprivate-mappings-1.c
index 7987beaed9a..5134ef6ed6c 100644
--- a/gcc/testsuite/c-c++-common/goacc/firstprivate-mappings-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/firstprivate-mappings-1.c
@@ -419,12 +419,7 @@ vla (int array_li)
copyout (array_so)
/* The gimplifier has created an implicit 'firstprivate' clause for the array
length.
- { dg-final { scan-tree-dump {(?n)#pragma omp target oacc_parallel map\(from:array_so \[len: 4\]\) firstprivate\(array_li.[0-9]+\)} omplower { target { ! c++ } } } }
- { dg-final { scan-tree-dump {(?n)#pragma omp target oacc_parallel map\(from:array_so \[len: 4\]\) firstprivate\(} omplower { target { c++ } } } }
- (C++ computes an intermediate value, so can't scan for 'firstprivate(array_li)'.) */
- /* For C, non-LP64, the gimplifier has also created a mapping for the array
- itself; PR90859.
- { dg-final { scan-tree-dump {(?n)#pragma omp target oacc_parallel map\(from:array_so \[len: 4\]\) firstprivate\(array_li.[0-9]+\) map\(tofrom:\(\*array.[0-9]+\) \[len: D\.[0-9]+\]\) map\(firstprivate:array \[pointer assign, bias: 0\]\) \[} omplower { target { c && { ! lp64 } } } } } */
+ { dg-final { scan-tree-dump {(?n)#pragma omp target oacc_parallel firstprivate\(array_li.[0-9]+\) map\(from:array_so \[len: 4\]\) \[} omplower } } */
{
array_so = sizeof array;
}
diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-1.c b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
index c2b8dc6c880..0a123bec58f 100644
--- a/gcc/testsuite/c-c++-common/goacc/mdc-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
@@ -45,7 +45,7 @@ t1 ()
/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_data map.to:s .len: 32.." 1 "omplower" } } */
/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.tofrom:.z .len: 40.. map.struct:s .len: 1.. map.alloc:s.a .len: 8.. map.tofrom:._1 .len: 40.. map.attach:s.a .bias: 0.." 1 "omplower" } } */
-/* { dg-final { scan-tree-dump-times "pragma omp target oacc_parallel map.attach:s.e .bias: 0.. map.tofrom:s .len: 32" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_parallel map.tofrom:s .len: 32.. map.attach:s.e .bias: 0.." 1 "omplower" } } */
/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_data map.attach:a .bias: 0.." 1 "omplower" } } */
/* { dg-final { scan-tree-dump-times "pragma omp target oacc_exit_data map.detach:a .bias: 0.." 1 "omplower" } } */
/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_data map.to:a .len: 8.." 1 "omplower" } } */
diff --git a/gcc/testsuite/c-c++-common/gomp/target-implicit-map-1.c b/gcc/testsuite/c-c++-common/gomp/target-implicit-map-1.c
new file mode 100644
index 00000000000..52944fdc65a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/target-implicit-map-1.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdump-tree-gimple" } */
+#ifdef __cplusplus
+extern "C"
+#else
+extern
+#endif
+void abort (void);
+
+int
+main (void)
+{
+ #define N 5
+ int array[N][N];
+
+ for (int i = 0; i < N; i++)
+ {
+ #pragma omp target enter data map(alloc: array[i:1][0:N])
+
+ #pragma omp target
+ for (int j = 0; j < N; j++)
+ array[i][j] = i * 10 + j;
+
+ #pragma omp target exit data map(from: array[i:1][0:N])
+ }
+
+ for (int i = 0; i < N; i++)
+ for (int j = 0; j < N; j++)
+ if (array[i][j] != i + j)
+ abort ();
+
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump {#pragma omp target enter data map\(alloc:array\[[^]]+\]\[0\] \[len: [0-9]+\]\)} "gimple" } } */
+
+/* { dg-final { scan-tree-dump {#pragma omp target num_teams.* firstprivate\(i\) map\(tofrom:array \[len: [0-9]+\]\[implicit\]\)} "gimple" } } */
+
+/* { dg-final { scan-tree-dump {#pragma omp target exit data map\(from:array\[[^]]+\]\[0\] \[len: [0-9]+\]\)} "gimple" } } */
diff --git a/gcc/testsuite/g++.dg/goacc/firstprivate-mappings-1.C b/gcc/testsuite/g++.dg/goacc/firstprivate-mappings-1.C
index 1b1badb1a90..99a3bd472f7 100644
--- a/gcc/testsuite/g++.dg/goacc/firstprivate-mappings-1.C
+++ b/gcc/testsuite/g++.dg/goacc/firstprivate-mappings-1.C
@@ -416,7 +416,7 @@ vla (int &array_li)
copyout (array_so)
/* The gimplifier has created an implicit 'firstprivate' clause for the array
length.
- { dg-final { scan-tree-dump {(?n)#pragma omp target oacc_parallel map\(from:array_so \[len: 4\]\) firstprivate\(} omplower } }
+ { dg-final { scan-tree-dump {(?n)#pragma omp target oacc_parallel firstprivate\([^)]+\) map\(from:array_so \[len: 4\]\)} omplower } }
(C++ computes an intermediate value, so can't scan for 'firstprivate(array_li)'.) */
{
array_so = sizeof array;
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 275dc7d8af7..0da85efc104 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -971,6 +971,9 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
spc, flags, false);
pp_right_bracket (pp);
}
+ if (OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_MAP
+ && OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P (clause))
+ pp_string (pp, "[implicit]");
pp_right_paren (pp);
break;
diff --git a/gcc/tree.h b/gcc/tree.h
index 7542d97ce12..ba974471339 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1689,6 +1689,11 @@ class auto_suppress_location_wrappers
map clause. */
#define OMP_CLAUSE_MAP_IMPLICIT(NODE) \
(OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.default_def_flag)
+/* Nonzero if this map clause is to be indicated to the runtime as 'implicit',
+ due to being created through implicit data-mapping rules in the middle-end.
+ NOTE: this is different than OMP_CLAUSE_MAP_IMPLICIT. */
+#define OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P(NODE) \
+ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.deprecated_flag)
/* True on an OMP_CLAUSE_USE_DEVICE_PTR with an OpenACC 'if_present'
clause. */
diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index ebd08013430..3e42d7123ae 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -40,11 +40,22 @@
#define GOMP_MAP_FLAG_SPECIAL_0 (1 << 2)
#define GOMP_MAP_FLAG_SPECIAL_1 (1 << 3)
#define GOMP_MAP_FLAG_SPECIAL_2 (1 << 4)
+#define GOMP_MAP_FLAG_SPECIAL_3 (1 << 5)
#define GOMP_MAP_FLAG_SPECIAL_4 (1 << 6)
#define GOMP_MAP_FLAG_SPECIAL (GOMP_MAP_FLAG_SPECIAL_1 \
| GOMP_MAP_FLAG_SPECIAL_0)
#define GOMP_MAP_DEEP_COPY (GOMP_MAP_FLAG_SPECIAL_4 \
| GOMP_MAP_FLAG_SPECIAL_2)
+/* This value indicates the map was created implicitly according to
+ OpenMP rules. */
+#define GOMP_MAP_IMPLICIT (GOMP_MAP_FLAG_SPECIAL_3 \
+ | GOMP_MAP_FLAG_SPECIAL_4)
+/* Mask for entire set of special map kind bits. */
+#define GOMP_MAP_FLAG_SPECIAL_BITS (GOMP_MAP_FLAG_SPECIAL_0 \
+ | GOMP_MAP_FLAG_SPECIAL_1 \
+ | GOMP_MAP_FLAG_SPECIAL_2 \
+ | GOMP_MAP_FLAG_SPECIAL_3 \
+ | GOMP_MAP_FLAG_SPECIAL_4)
/* Flag to force a specific behavior (or else, trigger a run-time error). */
#define GOMP_MAP_FLAG_FORCE (1 << 7)
@@ -186,6 +197,9 @@ enum gomp_map_kind
#define GOMP_MAP_ALWAYS_P(X) \
(GOMP_MAP_ALWAYS_TO_P (X) || ((X) == GOMP_MAP_ALWAYS_FROM))
+#define GOMP_MAP_IMPLICIT_P(X) \
+ (((X) & GOMP_MAP_FLAG_SPECIAL_BITS) == GOMP_MAP_IMPLICIT)
+
/* Asynchronous behavior. Keep in sync with
libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t. */
diff --git a/libgomp/target.c b/libgomp/target.c
index 196dba4f08c..dd7f573fea8 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -539,7 +539,7 @@ static inline void
gomp_map_vars_existing (struct gomp_device_descr *devicep,
struct goacc_asyncqueue *aq, splay_tree_key oldn,
splay_tree_key newn, struct target_var_desc *tgt_var,
- unsigned char kind, bool always_to_flag,
+ unsigned char kind, bool always_to_flag, bool implicit,
struct gomp_coalesce_buf *cbuf,
htab_t *refcount_set)
{
@@ -550,11 +550,22 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep,
tgt_var->always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind);
tgt_var->is_attach = false;
tgt_var->offset = newn->host_start - oldn->host_start;
- tgt_var->length = newn->host_end - newn->host_start;
+
+ /* For implicit maps, old contained in new is valid. */
+ bool implicit_subset = (implicit
+ && newn->host_start <= oldn->host_start
+ && oldn->host_end <= newn->host_end);
+ if (implicit_subset)
+ tgt_var->length = oldn->host_end - oldn->host_start;
+ else
+ tgt_var->length = newn->host_end - newn->host_start;
if ((kind & GOMP_MAP_FLAG_FORCE)
- || oldn->host_start > newn->host_start
- || oldn->host_end < newn->host_end)
+ /* For implicit maps, old contained in new is valid. */
+ || !(implicit_subset
+ /* Otherwise, new contained inside old is considered valid. */
+ || (oldn->host_start <= newn->host_start
+ && newn->host_end <= oldn->host_end)))
{
gomp_mutex_unlock (&devicep->lock);
gomp_fatal ("Trying to map into device [%p..%p) object when "
@@ -564,11 +575,17 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep,
}
if (GOMP_MAP_ALWAYS_TO_P (kind) || always_to_flag)
- gomp_copy_host2dev (devicep, aq,
- (void *) (oldn->tgt->tgt_start + oldn->tgt_offset
- + newn->host_start - oldn->host_start),
- (void *) newn->host_start,
- newn->host_end - newn->host_start, false, cbuf);
+ {
+ /* Implicit + always should not happen. If this does occur, below
+ address/length adjustment is a TODO. */
+ assert (!implicit_subset);
+
+ gomp_copy_host2dev (devicep, aq,
+ (void *) (oldn->tgt->tgt_start + oldn->tgt_offset
+ + newn->host_start - oldn->host_start),
+ (void *) newn->host_start,
+ newn->host_end - newn->host_start, false, cbuf);
+ }
gomp_increment_refcount (oldn, refcount_set);
}
@@ -576,8 +593,24 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep,
static int
get_kind (bool short_mapkind, void *kinds, int idx)
{
- return short_mapkind ? ((unsigned short *) kinds)[idx]
- : ((unsigned char *) kinds)[idx];
+ int val = (short_mapkind
+ ? ((unsigned short *) kinds)[idx]
+ : ((unsigned char *) kinds)[idx]);
+
+ if (short_mapkind && GOMP_MAP_IMPLICIT_P (val))
+ val &= ~GOMP_MAP_IMPLICIT;
+ return val;
+}
+
+
+static bool
+get_implicit (bool short_mapkind, void *kinds, int idx)
+{
+ int val = (short_mapkind
+ ? ((unsigned short *) kinds)[idx]
+ : ((unsigned char *) kinds)[idx]);
+
+ return short_mapkind && GOMP_MAP_IMPLICIT_P (val);
}
static void
@@ -631,6 +664,7 @@ gomp_map_fields_existing (struct target_mem_desc *tgt,
struct splay_tree_s *mem_map = &devicep->mem_map;
struct splay_tree_key_s cur_node;
int kind;
+ bool implicit;
const bool short_mapkind = true;
const int typemask = short_mapkind ? 0xff : 0x7;
@@ -638,12 +672,14 @@ gomp_map_fields_existing (struct target_mem_desc *tgt,
cur_node.host_end = cur_node.host_start + sizes[i];
splay_tree_key n2 = splay_tree_lookup (mem_map, &cur_node);
kind = get_kind (short_mapkind, kinds, i);
+ implicit = get_implicit (short_mapkind, kinds, i);
if (n2
&& n2->tgt == n->tgt
&& n2->host_start - n->host_start == n2->tgt_offset - n->tgt_offset)
{
gomp_map_vars_existing (devicep, aq, n2, &cur_node, &tgt->list[i],
- kind & typemask, false, cbuf, refcount_set);
+ kind & typemask, false, implicit, cbuf,
+ refcount_set);
return;
}
if (sizes[i] == 0)
@@ -659,7 +695,8 @@ gomp_map_fields_existing (struct target_mem_desc *tgt,
== n2->tgt_offset - n->tgt_offset)
{
gomp_map_vars_existing (devicep, aq, n2, &cur_node, &tgt->list[i],
- kind & typemask, false, cbuf, refcount_set);
+ kind & typemask, false, implicit, cbuf,
+ refcount_set);
return;
}
}
@@ -671,7 +708,8 @@ gomp_map_fields_existing (struct target_mem_desc *tgt,
&& n2->host_start - n->host_start == n2->tgt_offset - n->tgt_offset)
{
gomp_map_vars_existing (devicep, aq, n2, &cur_node, &tgt->list[i],
- kind & typemask, false, cbuf, refcount_set);
+ kind & typemask, false, implicit, cbuf,
+ refcount_set);
return;
}
}
@@ -903,6 +941,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
for (i = 0; i < mapnum; i++)
{
int kind = get_kind (short_mapkind, kinds, i);
+ bool implicit = get_implicit (short_mapkind, kinds, i);
if (hostaddrs[i] == NULL
|| (kind & typemask) == GOMP_MAP_FIRSTPRIVATE_INT)
{
@@ -1085,8 +1124,8 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
}
}
gomp_map_vars_existing (devicep, aq, n, &cur_node, &tgt->list[i],
- kind & typemask, always_to_cnt > 0, NULL,
- refcount_set);
+ kind & typemask, always_to_cnt > 0, implicit,
+ NULL, refcount_set);
i += always_to_cnt;
}
else
@@ -1256,6 +1295,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
else if (tgt->list[i].key == NULL)
{
int kind = get_kind (short_mapkind, kinds, i);
+ bool implicit = get_implicit (short_mapkind, kinds, i);
if (hostaddrs[i] == NULL)
continue;
switch (kind & typemask)
@@ -1415,7 +1455,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
splay_tree_key n = splay_tree_lookup (mem_map, k);
if (n && n->refcount != REFCOUNT_LINK)
gomp_map_vars_existing (devicep, aq, n, k, &tgt->list[i],
- kind & typemask, false, cbufp,
+ kind & typemask, false, implicit, cbufp,
refcount_set);
else
{
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-implicit-map-1.c b/libgomp/testsuite/libgomp.c-c++-common/target-implicit-map-1.c
new file mode 100644
index 00000000000..f2e72936862
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-implicit-map-1.c
@@ -0,0 +1,31 @@
+#ifdef __cplusplus
+extern "C"
+#else
+extern
+#endif
+void abort (void);
+
+int
+main (void)
+{
+ #define N 5
+ int array[N][N];
+
+ for (int i = 0; i < N; i++)
+ {
+ #pragma omp target enter data map(alloc: array[i:1][0:N])
+
+ #pragma omp target
+ for (int j = 0; j < N; j++)
+ array[i][j] = i + j;
+
+ #pragma omp target exit data map(from: array[i:1][0:N])
+ }
+
+ for (int i = 0; i < N; i++)
+ for (int j = 0; j < N; j++)
+ if (array[i][j] != i + j)
+ abort ();
+
+ return 0;
+}
More information about the Gcc-patches
mailing list