This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/6, OpenACC, libgomp] Async re-work, oacc-* parts (revised, v3)


On 2018/12/19 5:03 AM, Thomas Schwinge wrote:
Hi Chung-Lin!

On Tue, 18 Dec 2018 23:06:38 +0800, Chung-Lin Tang <chunglin_tang@mentor.com> wrote:
this part includes some of the lookup_goacc_asyncqueue fixes we talked about.
I am still thinking about how the queue lock problem should really be solved, so regard
this patch as just fixing some of the problems.

Hi Thomas,
This is my solution to the queue lock stuff we talked about. I've only attached a diff to
be applied atop of the existing changes, though that may actually be easier to review.

Note that this is still in testing, which means it hasn't been tested :P, but I'm
posting to see if you have time to give it a look before the holidays.

Having the need for a lock on the async queues is quite irritating, especially
when the structure needed for managing them is quite simple.

Therefore, lets do away the need for locks entirely.

This patch makes the asyncqueue part of the device->openacc.async managed by lock-free
atomic operations; almost all of the complexity is contained in lookup_goacc_asyncqueue(),
so it should be not too complex. A descriptor and the queue array is allocated/exchanged
atomically when more storage is required, while in the common case a simple lookup is enough.
The fact that we manage asyncqueues by only expanding and never destroying asyncqueues
during the device lifetime also simplifies many things.

The current implementation may be not that optimized and clunky in some cases, but I think
this should be the way to implement what should be a fairly simple asyncqueue array and active
list. I'll update more on the status as testing proceeds.

(and about the other corners you noticed in the last mail, I'll get to that later...)

Thanks,
Chung-Lin







Sure, thanks.

Two comments, though:

--- libgomp/oacc-async.c	(revision 267226)
+++ libgomp/oacc-async.c	(working copy)

+attribute_hidden struct goacc_asyncqueue *
+lookup_goacc_asyncqueue (struct goacc_thread *thr, bool create, int async)
+{
+  /* The special value acc_async_noval (-1) maps to the thread-specific
+     default async stream.  */
+  if (async == acc_async_noval)
+    async = thr->default_async;
+
+  if (async == acc_async_sync)
+    return NULL;
+
+  if (async < 0)
+    gomp_fatal ("bad async %d", async);
+
+  struct gomp_device_descr *dev = thr->dev;
+
+  gomp_mutex_lock (&dev->openacc.async.lock);
+
+  if (!create
+      && (async >= dev->openacc.async.nasyncqueue
+	  || !dev->openacc.async.asyncqueue[async]))
+    {
+      gomp_mutex_unlock (&dev->openacc.async.lock);
+      return NULL;
+    }
+
+  if (async >= dev->openacc.async.nasyncqueue)
+    {
+      int diff = async + 1 - dev->openacc.async.nasyncqueue;
+      dev->openacc.async.asyncqueue
+	= gomp_realloc (dev->openacc.async.asyncqueue,
+			sizeof (goacc_aq) * (async + 1));
+      memset (dev->openacc.async.asyncqueue + dev->openacc.async.nasyncqueue,
+	      0, sizeof (goacc_aq) * diff);
+      dev->openacc.async.nasyncqueue = async + 1;
+    }
+
+  if (!dev->openacc.async.asyncqueue[async])
+    {
+      dev->openacc.async.asyncqueue[async] = dev->openacc.async.construct_func ();
+
+      if (!dev->openacc.async.asyncqueue[async])
+	{
+	  gomp_mutex_unlock (&dev->openacc.async.lock);
+	  gomp_fatal ("async %d creation failed", async);
+	}

That will now always fail for host fallback, where
"host_openacc_async_construct" just always does "return NULL".

Actually, if the device doesn't support asyncqueues, this whole function
should turn into some kind of no-op, so that we don't again and again try
to create a new one for every call to "lookup_goacc_asyncqueue".

I'm attaching one possible solution.  I think it's fine to assume that
the majority of devices will support asyncqueues, and for those that
don't, this is just a one-time overhead per async-argument.  So, no
special handling required in "lookup_goacc_asyncqueue".

+      /* Link new async queue into active list.  */
+      goacc_aq_list n = gomp_malloc (sizeof (struct goacc_asyncqueue_list));
+      n->aq = dev->openacc.async.asyncqueue[async];
+      n->next = dev->openacc.async.active;
+      dev->openacc.async.active = n;
+    }
+  gomp_mutex_unlock (&dev->openacc.async.lock);

You still need to keep "async" locked during...

+  return dev->openacc.async.asyncqueue[async];

... this dereference.

+}


Oh, and:

--- libgomp/oacc-plugin.c	(revision 267226)
+++ libgomp/oacc-plugin.c	(working copy)
@@ -31,14 +31,10 @@
  #include "oacc-int.h"
void
-GOMP_PLUGIN_async_unmap_vars (void *ptr, int async)
+GOMP_PLUGIN_async_unmap_vars (void *ptr __attribute__((unused)),
+			      int async __attribute__((unused)))
  {
-  struct target_mem_desc *tgt = ptr;
-  struct gomp_device_descr *devicep = tgt->device_descr;
-
-  devicep->openacc.async_set_async_func (async);
-  gomp_unmap_vars (tgt, true);
-  devicep->openacc.async_set_async_func (acc_async_sync);
+  gomp_fatal ("invalid plugin function");
  }

Please add a comment here, something like: "Obsolete entry point, no
longer used."


Grüße
  Thomas


Attachment: async-02.oacc-parts.v2-v3.diff
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]