[PATCH] OpenACC 2.6 manual deep copy support (attach/detach)

Thomas Schwinge thomas@codesourcery.com
Sat Dec 22 13:37:00 GMT 2018


Hi Julian!

The following review has been done in a rush, so can be expected to
contain more errors than usually.


Unfortunately, you're posting here changes to: add new features (great!),
add independent other new features (great!), refactor code (great!), fix
bugs (great!) -- all great, but would be much easier to review if all
these were discussed separately.


I rebased your latest patch onto the current version of the async
re-work, as published in
<https://github.com/tschwinge/gcc/tree/wip-async_re-work>.

I had to resolve a number of (mostly easy) textual conflicts, but I also
had to apply additional changes.

As part of the async re-work review and following re-work of the patch,
the "gomp_unmap_vars_async" function is no more.  I have instead added
"bool finalize" to "gomp_unmap_vars_internal" etc.  For symmetry to
"gomp_unmap_vars", I moved "bool finalize" for "gomp_unmap_vars_async"
before "struct goacc_asyncqueue *aq".

Additionally, and a bit more invasive, "goacc_async_unmap_tgt" has been
removed, which you call from the "goacc_remove_var_async" function that
you're adding.  I have removed that function and instead adapted
"gomp_unref_tgt" to be usable for this case, too (added "bool
is_tgt_unmapped" return value), added a "gomp_unref_tgt_void" wrapper,
renamed "gomp_remove_var" to "gomp_remove_var_internal", and added
"gomp_remove_var" and "gomp_remove_var_async" wrappers, the latter
replacing "goacc_remove_var_async".

For easier review of the latter changes, I'm attaching an incremental
patch, "0001-WIP-OpenACC-2.6-manual-deep-copy-support-attach-deta.patch".
(Maybe Chung-Lin could help with that, too?)

The attached
"0001-OpenACC-2.6-manual-deep-copy-support-attach-detach.patch" then is
the complete patch, that I've now (somewhat) tested, and (somewhat)
reviewed.

I'll be OK with that going into trunk, with my following comments
addressed in some way, or disputed, should they not make a lot of sense,
after long weeks, now at 03:00 in the night.


On Fri, 14 Dec 2018 19:00:30 +0000, Julian Brown <julian@codesourcery.com> wrote:
> [...]

Thanks, Julian (and before, Cesar), for taking on this big item of work,
and Jakub for his earlier review comments.


> One small thing, additionally: I've introduced a function
> c_omp_map_clause_name in c-family/c-omp.c to return the name of OpenACC
> map clauses -- since many of these use OMP_CLAUSE_MAP and then use
> OMP_CLAUSE_MAP_KIND to distinguish between different mapping types,
> just emitting 'map' to refer to several different clauses won't be very
> meaningful to the user. I didn't see any existing way to get the right
> names.

ACK.  

Eventually (that is, later), this should move into generic code, next to
the other "clause printing".  Also to be shared with Fortran.

> --- a/gcc/c/c-typeck.c
> +++ b/gcc/c/c-typeck.c

> @@ -12901,6 +12900,15 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
>  		}
>  	      t = TREE_OPERAND (t, 0);
>  	    }
> +	  if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
> +	    {
> +	      if (maybe_ne (mem_ref_offset (t), 0))
> +	        error_at (OMP_CLAUSE_LOCATION (c),
> +			  "cannot dereference %qE in %qs clause", t,
> +			  c_omp_map_clause_name (c, true));
> +	      else
> +		t = TREE_OPERAND (t, 0);
> +	    }
>  	}
>        if (!VAR_P (t) && TREE_CODE (t) != PARM_DECL)
>  	{

No test cases exercising that?  (Can be addressed later.)

> +/* Ensure that pointers are used in OpenACC attach and detach clauses.
> +   Return true if an error has been detected.  */
> +
> +static bool
> +c_oacc_check_attachments (tree c)
> +{
> +  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP)
> +    return false;
> +
> +  /* OpenACC attach / detach clauses must be pointers.  */
> +  if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
> +      || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)
> +    {
> +      tree t = OMP_CLAUSE_DECL (c);
> +
> +      while (TREE_CODE (t) == TREE_LIST)
> +	t = TREE_CHAIN (t);
> +
> +      if (TREE_CODE (TREE_TYPE (t)) != POINTER_TYPE)
> +	{
> +	  error_at (OMP_CLAUSE_LOCATION (c), "expected pointer in %qs clause",
> +		    c_omp_map_clause_name (c, true));
> +	  return true;
> +	}
> +    }
> +
> +  return false;
> +}

Share that with the C++ "cp_oacc_check_attachments"?  (Can be addressed
later.)

> @@ -14432,6 +14487,15 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>  		}
>  	      if (remove)
>  		break;
> +	      if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
> +	        {
> +		  if (maybe_ne (mem_ref_offset (t), 0))
> +	            error_at (OMP_CLAUSE_LOCATION (c),
> +			      "cannot dereference %qE in %qs clause", t,
> +			      c_omp_map_clause_name (c, true));
> +		  else
> +		    t = TREE_OPERAND (t, 0);
> +		}

Same error as above; no test cases exercising that?  (Can be addressed
later.)

Also, that error is not relevant for C++?

> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c

> @@ -11819,8 +11866,9 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
>  	   && omp_find_clause (OMP_STANDALONE_CLAUSES (expr),
>  			       OMP_CLAUSE_FINALIZE))
>      {
> -      /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize"
> -	 semantics apply to all mappings of this OpenACC directive.  */
> +      /* Use GOMP_MAP_DELETE, GOMP_MAP_FORCE_DETACH, and
> +	 GOMP_MAP_FORCE_FROM to denote that "finalize" semantics apply
> +	 to all mappings of this OpenACC directive.  */
>        bool finalize_marked = false;
>        for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c))
>  	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP)
> @@ -11834,10 +11882,19 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
>  	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_DELETE);
>  	      finalize_marked = true;
>  	      break;
> +	    case GOMP_MAP_DETACH:
> +	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_DETACH);
> +	      finalize_marked = true;
> +	      break;
> +	    case GOMP_MAP_STRUCT:
> +	    case GOMP_MAP_FORCE_PRESENT:
> +	      /* Skip over an initial struct or force_present mapping.  */
> +	      break;
>  	    default:
> -	      /* Check consistency: libgomp relies on the very first data
> -		 mapping clause being marked, so make sure we did that before
> -		 any other mapping clauses.  */
> +	      /* Check consistency: libgomp relies on the very first
> +		 non-struct, non-force-present data mapping clause being
> +		 marked, so make sure we did that before any other mapping
> +		 clauses.  */
>  	      gcc_assert (finalize_marked);
>  	      break;
>  	    }

But libgomp hasn't been updated accordingly, to look for
"GOMP_MAP_FORCE_DETACH" ("GOACC_enter_exit_data", "finalize" detection)?

> --- a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
> +++ b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90

> --- a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
> +++ b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95

No Fortran test cases for at least testing valid/invalid syntax of the
"attach" and "detach" clauses?

> --- a/include/gomp-constants.h
> +++ b/include/gomp-constants.h
> @@ -42,6 +42,7 @@
>  #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
>  #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
>  					 | GOMP_MAP_FLAG_SPECIAL_0)
> +#define GOMP_MAP_DEEP_COPY		(1 << 5)
>  /* Flag to force a specific behavior (or else, trigger a run-time error).  */
>  #define GOMP_MAP_FLAG_FORCE		(1 << 7)

I wondered whether we really need to reserve a whole class (one bit out
of eight) here?

> @@ -128,6 +129,13 @@ enum gomp_map_kind
>      /* Decrement usage count and deallocate if zero.  */
>      GOMP_MAP_RELEASE =			(GOMP_MAP_FLAG_SPECIAL_2
>  					 | GOMP_MAP_DELETE),
> +    /* In OpenACC, attach a pointer to a mapped struct field.  */
> +    GOMP_MAP_ATTACH =			(GOMP_MAP_DEEP_COPY | 0),
> +    /* In OpenACC, detach a pointer to a mapped struct field.  */
> +    GOMP_MAP_DETACH =			(GOMP_MAP_DEEP_COPY | 1),
> +    /* In OpenACC, detach a pointer to a mapped struct field.  */
> +    GOMP_MAP_FORCE_DETACH =		(GOMP_MAP_DEEP_COPY
> +					 | GOMP_MAP_FLAG_FORCE | 1),

But anyway, we can still use for other things any other values with the
"GOMP_MAP_DEEP_COPY" bit set, should this become necessary later on.

Adding the "GOMP_MAP_FORCE_DETACH" mapping kind (that is,
"GOMP_MAP_DETACH | GOMP_MAP_FLAG_FORCE") could be avoided by using the
new mechanism 'For libgomp OpenACC entry points, redefine the "device"
argument to "flags"',
<https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01420.html>, that I plan
to get into trunk soon.  However, obviously, there's already precedence
to communicate via other "| GOMP_MAP_FLAG_FORCE" mapping variants to
libgomp the "finalize" semantics, so that'd be a general change now.
(... to be done later; just mentioning this now, for completeness.)

> --- a/libgomp/libgomp.h
> +++ b/libgomp/libgomp.h

> @@ -902,6 +904,10 @@ struct target_mem_desc {
>     artificial pointer to "omp declare target link" object.  */
>  #define REFCOUNT_LINK (~(uintptr_t) 1)
>  
> +/* A special tag value for "virtual_refcount" in the splay_tree_key_s structure
> +   below.  */
> +#define VREFCOUNT_LINK_KEY (~(uintptr_t) 0)
> +
>  /* Special offset values.  */
>  #define OFFSET_INLINED (~(uintptr_t) 0)
>  #define OFFSET_POINTER (~(uintptr_t) 1)
> @@ -918,10 +924,21 @@ struct splay_tree_key_s {
>    uintptr_t tgt_offset;
>    /* Reference count.  */
>    uintptr_t refcount;
> -  /* Dynamic reference count.  */
> -  uintptr_t dynamic_refcount;
> -  /* Pointer to the original mapping of "omp declare target link" object.  */
> -  splay_tree_key link_key;
> +  /* Reference counts beyond those that represent genuine references in the
> +     linked splay tree key/target memory structures, e.g. for multiple OpenACC
> +     "present increment" operations (via "acc enter data") referring to the same
> +     host-memory block.
> +     If set to VREFCOUNT_LINK_KEY (for OpenMP, where this field is not otherwise
> +     needed), the union below represents a link key.  */
> +  uintptr_t virtual_refcount;
> +  union {
> +    /* For a block with attached pointers, the attachment counters for each.
> +       Only used for OpenACC.  */
> +    uintptr_t *attach_count;
> +    /* Pointer to the original mapping of "omp declare target link" object.
> +       Only used for OpenMP.  */
> +    splay_tree_key link_key;
> +  } u;
>  };

(Will reference that one later.)

> @@ -943,13 +960,6 @@ splay_compare (splay_tree_key x, splay_tree_key y)
>  
>  typedef struct acc_dispatch_t
>  {
> -  /* This is a linked list of data mapped using the
> -     acc_map_data/acc_unmap_data or "acc enter data"/"acc exit data" pragmas.
> -     Unlike mapped_data in the goacc_thread struct, unmapping can
> -     happen out-of-order with respect to mapping.  */
> -  /* This is guarded by the lock in the "outer" struct gomp_device_descr.  */
> -  struct target_mem_desc *data_environ;

Also revise the comment in "libgomp.h" containing a "data_environ"
mention.

> +struct gomp_coalesce_buf;
> [...]
>  struct gomp_coalesce_buf;

Duplicate.  (I removed it, I think.)

> --- a/libgomp/oacc-init.c
> +++ b/libgomp/oacc-init.c
> @@ -302,9 +302,13 @@ acc_shutdown_1 (acc_device_t d)
>  
>        if (walk->dev)
>  	{
> -	  gomp_mutex_lock (&walk->dev->lock);
> -	  gomp_free_memmap (&walk->dev->mem_map);
> -	  gomp_mutex_unlock (&walk->dev->lock);
> +	  while (walk->dev->mem_map.root)
> +	    {
> +	      splay_tree_key k = &walk->dev->mem_map.root->key;
> +	      if (k->virtual_refcount == VREFCOUNT_LINK_KEY)
> +		k->u.link_key = NULL;
> +	      gomp_remove_var (walk->dev, k);
> +	    }

Surprising to see here the OpenMP-only "VREFCOUNT_LINK_KEY", "u.link_key"
mentions, in OpenACC-specific code?

> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c

> @@ -373,6 +364,7 @@ acc_unmap_data (void *h)

Hmm, just noting that "acc_unmap_data" looks rather similar to
"gomp_remove_var_internal" in what it's doing (but without the device
memory deallocation).  (Cleaning up any such things is for later,
obviously.)

> -  if (t->refcount == 2)
> +  if (tgt->refcount > 0)
> +    tgt->refcount--;
> +  else
>      {
> -[...]
> +      free (tgt->array);
> +      free (tgt);
>      }
>  
>    gomp_mutex_unlock (&acc_dev->lock);
> -
> -  gomp_unmap_vars (t, true);
>  }

But doesn't it need to duplicate the "free ([attach_count])" here,
though?

> +      if (aq)
> +	goacc_remove_var_async (acc_dev, n, aq);
> +      else
> +	gomp_remove_var (acc_dev, n);

I changed (two, I think) such things to just always call
"goacc_remove_var_async".

> +gomp_acc_remove_pointer (struct gomp_device_descr *acc_dev, void **hostaddrs,
> +			 size_t *sizes, unsigned short *kinds, int async,
> +			 bool finalize, int mapnum)

Should be "attribute_hidden"?

This one duplicates code from "delete_copyout", "GOACC_enter_exit_data"
(where the latter is also the only place it gets called from).  (Again,
that's been an issue before, for similar code, and certainly not the only
instance of such things, but we should sort this out eventually.)

This one has an unused "async" formal parameter.  It that meant to be
resolved to an asyncqueue, and pass that one to "gomp_copy_dev2host", and
call "gomp_remove_var_async" instead of "gomp_remove_var"?  That's here:

> +	  if (copyfrom)
> +	    gomp_copy_dev2host (acc_dev, NULL, (void *) cur_node.host_start,
> +				(void *) (n->tgt->tgt_start + n->tgt_offset
> +					  + cur_node.host_start
> +					  - n->host_start),
> +				cur_node.host_end - cur_node.host_start);
> +
> +	  if (n->refcount == 0)
> +	    gomp_remove_var (acc_dev, n);
> +	  break;

> --- a/libgomp/oacc-parallel.c
> +++ b/libgomp/oacc-parallel.c

> @@ -356,6 +375,10 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>    if (mapnum > 0)
>      {
>        unsigned char kind = kinds[0] & 0xff;
> +
> +      if (kind == GOMP_MAP_STRUCT || kind == GOMP_MAP_FORCE_PRESENT)
> +        kind = kinds[1] & 0xff;

Are we sure that "kinds[1]" is always valid to dereference?

I was generally confused by your "GOMP_MAP_FORCE_PRESENT" mapping kind
changes/usage in this whole patch, and in "GOACC_enter_exit_data"
specifically.  It seems as if you're changing the behavior of
"GOMP_MAP_FORCE_PRESENT" here, which -- at least initially -- might look
like a problem for ABI compatibility, for code compiled with previous
versions of GCC?  I see you're now during gimplification generating
"GOMP_MAP_FORCE_PRESENT" mappings for some "attach" cases.  This should
actually be OK, as no OpenACC directive that directly maps to
"GOACC_enter_exit_data" (that is, "update", "enter data", "exit data")
support the "present" clause ("GOMP_MAP_FORCE_PRESENT").  On the other
hand, "GOACC_enter_exit_data" is also called via "GOACC_declare", and the
"declare" directive does support "present" clauses -- but that's handled
separately in "GOACC_declare" itself, ha.  So, why did
"GOACC_enter_exit_data" then support "GOMP_MAP_FORCE_PRESENT" already?
If that was for no particular reason, and before, "GOACC_enter_exit_data"
never had to deal with "GOMP_MAP_FORCE_PRESENT" in practice, then we
don't have to worry, otherwise, do we have to better understand this?

In addition to the hunk above, I mean things like this:

> @@ -366,11 +389,14 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>      {
>        unsigned char kind = kinds[i] & 0xff;
>  
> -      if (kind == GOMP_MAP_POINTER || kind == GOMP_MAP_TO_PSET)
> +      if (kind == GOMP_MAP_POINTER
> +	  || kind == GOMP_MAP_TO_PSET
> +	  || kind == GOMP_MAP_STRUCT
> +	  || kind == GOMP_MAP_FORCE_PRESENT)
>  	continue;
>  
>        if (kind == GOMP_MAP_FORCE_ALLOC
> -	  || kind == GOMP_MAP_FORCE_PRESENT
> +	  || kind == GOMP_MAP_ATTACH
>  	  || kind == GOMP_MAP_FORCE_TO
>  	  || kind == GOMP_MAP_TO
>  	  || kind == GOMP_MAP_ALLOC)

> @@ -431,8 +472,14 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  	    }
>  	  else
>  	    {
> -	      gomp_acc_insert_pointer (pointer, &hostaddrs[i],
> -				       &sizes[i], &kinds[i], async);
> +	      goacc_aq aq = get_goacc_asyncqueue (async);
> +	      for (int j = 0; j < 2; j++)
> +		gomp_map_vars_async (acc_dev, aq,
> +				     (j == 0 || pointer == 2) ? 1 : 2,
> +				     &hostaddrs[i + j], NULL,
> +				     &sizes[i + j], &kinds[i + j], true,
> +				     GOMP_MAP_VARS_OPENACC_ENTER_DATA);

Can we get a comment added to such "magic", please?

> @@ -440,51 +487,145 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  	      i += pointer - 1;
>  	    }
>  	}
> +
> +      /* This loop only handles explicit "attach" clauses that are not an
> +	 implicit part of a copy{,in,out}, etc. mapping.  */
> +      for (i = 0; i < mapnum; i++)
> +        {
> +	  unsigned char kind = kinds[i] & 0xff;
> +
> +	  /* Scan for pointers and PSETs.  */
> +	  int pointer = find_pointer (i, mapnum, kinds);
> +
> +	  if (!pointer)
> +	    {
> +	      if (kind == GOMP_MAP_ATTACH)
> +		acc_attach (hostaddrs[i]);

Shouldn't this call the async variant?

> +	      else if (kind == GOMP_MAP_STRUCT)
> +	        i += sizes[i];
> +	    }
> +	  else
> +	    i += pointer - 1;
> +	}
>      }
>    else
> -    for (i = 0; i < mapnum; ++i)
> -      {
> -	unsigned char kind = kinds[i] & 0xff;
> +    {
> +      /* Handle "detach" before copyback/deletion of mapped data.  */
> +      for (i = 0; i < mapnum; i++)
> +        {
> +	  unsigned char kind = kinds[i] & 0xff;
>  
> -	int pointer = find_pointer (i, mapnum, kinds);
> +	  int pointer = find_pointer (i, mapnum, kinds);
>  
> -	if (!pointer)
> -	  {
> -	    switch (kind)
> -	      {
> -	      case GOMP_MAP_RELEASE:
> -	      case GOMP_MAP_DELETE:
> -		if (acc_is_present (hostaddrs[i], sizes[i]))
> +	  if (!pointer)
> +	    {
> +	      if (kind == GOMP_MAP_DETACH)
> +		acc_detach (hostaddrs[i]);
> +	      else if (kind == GOMP_MAP_FORCE_DETACH)
> +		acc_detach_finalize (hostaddrs[i]);
> +	      else if (kind == GOMP_MAP_STRUCT)
> +	        i += sizes[i];
> +	    }
> +	  else
> +	    {
> +	      unsigned char kind2 = kinds[i + pointer - 1] & 0xff;
> +
> +	      if (kind2 == GOMP_MAP_DETACH)
> +		acc_detach (hostaddrs[i + pointer - 1]);
> +	      else if (kind2 == GOMP_MAP_FORCE_DETACH)
> +	        acc_detach_finalize (hostaddrs[i + pointer - 1]);

Likewise for all these?

> --- a/libgomp/target.c
> +++ b/libgomp/target.c

> +void
> +gomp_attach_pointer (struct gomp_device_descr *devicep,
> +		     struct goacc_asyncqueue *aq, splay_tree mem_map,
> +		     splay_tree_key n, uintptr_t attach_to, size_t bias,
> +		     struct gomp_coalesce_buf *cbufp)

Should be "attribute_hidden"?

> +{
> +  struct splay_tree_key_s s;
> +  size_t size, idx;
> +
> +  if (n == NULL)
> +    {
> +      gomp_mutex_unlock (&devicep->lock);
> +      gomp_fatal ("enclosing struct not mapped for attach");
> +    }
> +
> +  size = (n->host_end - n->host_start + sizeof (void *) - 1) / sizeof (void *);
> +  /* We might have a pointer in a packed struct: however we cannot have more
> +     than one such pointer in each pointer-sized portion of the struct, so
> +     this is safe.  */
> +  idx = (attach_to - n->host_start) / sizeof (void *);
> +
> +  assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
> +
> +  if (!n->u.attach_count)
> +    n->u.attach_count
> +      = gomp_malloc_cleared (sizeof (*n->u.attach_count) * size);

OK, good enough in terms of memory usage, I suppose.

> +void
> +gomp_detach_pointer (struct gomp_device_descr *devicep,
> +		     struct goacc_asyncqueue *aq, splay_tree_key n,
> +		     uintptr_t detach_from, bool finalize,
> +		     struct gomp_coalesce_buf *cbufp)

Should be "attribute_hidden"?

"gomp_detach_pointer" can't make use of its "struct gomp_coalesce_buf
*cbufp" formal parameter, I think?  (And for "gomp_attach_pointer" I have
not thought through whether that actually helps, but it's at least
plausible, being called from "gomp_map_vars_internal".)

> -static inline uintptr_t
> +uintptr_t
>  gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i)

Should be "attribute_hidden"?

> @@ -899,13 +1064,15 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
>  				      kind & typemask, cbufp);
>  	    else
>  	      {
> -		k->link_key = NULL;
> +		if (k->virtual_refcount == VREFCOUNT_LINK_KEY)
> +		  k->u.link_key = NULL;
>  		if (n && n->refcount == REFCOUNT_LINK)
>  		  {
>  		    /* Replace target address of the pointer with target address
>  		       of mapped object in the splay tree.  */
>  		    splay_tree_remove (mem_map, n);
> -		    k->link_key = n;
> +		    k->u.link_key = n;
> +		    k->virtual_refcount = VREFCOUNT_LINK_KEY;
>  		  }
>  		size_t align = (size_t) 1 << (kind >> rshift);
>  		tgt->list[i].key = k;
> @@ -926,10 +1093,12 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
>  		tgt->list[i].copy_from = GOMP_MAP_COPY_FROM_P (kind & typemask);
>  		tgt->list[i].always_copy_from
>  		  = GOMP_MAP_ALWAYS_FROM_P (kind & typemask);
> +		tgt->list[i].do_detach = false;
>  		tgt->list[i].offset = 0;
>  		tgt->list[i].length = k->host_end - k->host_start;
>  		k->refcount = 1;
> -		k->dynamic_refcount = 0;
> +		k->virtual_refcount = 0;
> +		k->u.attach_count = NULL;
>  		tgt->refcount++;
>  		array->left = NULL;
>  		array->right = NULL;

Don't the latter two assignments unconditionally overwrite the earlier
ones ("k->u.link_key") quoted above?  (This doesn't seem to cause any
testsuite regressions, though.)  I must be missing something.

> @@ -1283,7 +1491,8 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
>        k->tgt = tgt;
>        k->tgt_offset = target_table[i].start;
>        k->refcount = REFCOUNT_INFINITY;
> -      k->link_key = NULL;
> +      k->virtual_refcount = 0;
> +      k->u.link_key = NULL;
>        array->left = NULL;
>        array->right = NULL;
>        splay_tree_insert (&devicep->mem_map, array);
> @@ -1315,7 +1524,8 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
>        k->tgt = tgt;
>        k->tgt_offset = target_var->start;
>        k->refcount = target_size & link_bit ? REFCOUNT_LINK : REFCOUNT_INFINITY;
> -      k->link_key = NULL;
> +      k->virtual_refcount = 0;
> +      k->u.link_key = NULL;
>        array->left = NULL;
>        array->right = NULL;
>        splay_tree_insert (&devicep->mem_map, array);

Surprising to see here "k->virtual_refcount = 0" (that is, OpenACC case),
but then "k->u.link_key = NULL" (that is, OpenMP case).

> -/* Free address mapping tables.  MM must be locked on entry, and remains locked
> -   on return.  */
> -
> -attribute_hidden void
> -gomp_free_memmap (struct splay_tree_s *mem_map)
> -{
> -  while (mem_map->root)
> -    {
> -      struct target_mem_desc *tgt = mem_map->root->key.tgt;
> -
> -      splay_tree_remove (mem_map, &mem_map->root->key);
> -      free (tgt->array);
> -      free (tgt);
> -    }
> -}

Also remove the prototype in "libgomp.h".

> @@ -2631,6 +2825,8 @@ omp_target_associate_ptr (const void *host_ptr, const void *device_ptr,
>        k->tgt = tgt;
>        k->tgt_offset = (uintptr_t) device_ptr + device_offset;
>        k->refcount = REFCOUNT_INFINITY;
> +      k->virtual_refcount = 0;
> +      k->u.link_key = NULL;
>        array->left = NULL;
>        array->right = NULL;
>        splay_tree_insert (&devicep->mem_map, array);

As above.

> --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
> @@ -182,13 +182,13 @@ main (int argc, char **argv)
>          exit (EXIT_FAILURE);
>      }
>  
> +    acc_delete (&h_X[0], N * sizeof (float));
> +    acc_delete (&h_Y1[0], N * sizeof (float));
> +
>      free (h_X);
>      free (h_Y1);
>      free (h_Y2);
>  
> -    acc_free (d_X);
> -    acc_free (d_Y);
> -
>      context_check (pctx);
>  
>      s = cublasDestroy (h);
> --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
> @@ -176,13 +176,13 @@ main (int argc, char **argv)
>          exit (EXIT_FAILURE);
>      }
>  
> +    acc_delete (&h_X[0], N * sizeof (float));
> +    acc_delete (&h_Y1[0], N * sizeof (float));
> +
>      free (h_X);
>      free (h_Y1);
>      free (h_Y2);
>  
> -    acc_free (d_X);
> -    acc_free (d_Y);
> -
>      context_check (pctx);
>  
>      s = cublasDestroy (h);

Hmm, why is this not also required for "context-1.c" and "context-3.c"?

> [...]

Thanks for adding at least a good number of libgomp test cases.

There are, however, no libgomp test cases for acc_detach_finalize, and
generally none for any of the several async variants (including using the
respective OpenACC directives).


As I said above, I'm fine with the changes generally, and some/most? of
my comments can be dealt with later on.


Grüße
 Thomas


-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-WIP-OpenACC-2.6-manual-deep-copy-support-attach-deta.patch
Type: text/x-diff
Size: 6717 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20181222/38b0785d/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-OpenACC-2.6-manual-deep-copy-support-attach-detach.patch
Type: text/x-diff
Size: 155710 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20181222/38b0785d/attachment-0001.bin>


More information about the Gcc-patches mailing list