This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PR libgomp/56073: benchmark regression due to PR libgomp/51376 fix
- From: Alan Modra <amodra at gmail dot com>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org, Jakub Jelinek <jakub at redhat dot com>, Richard Henderson <rth at redhat dot com>
- Date: Wed, 23 Jan 2013 11:41:52 +1030
- Subject: Re: PR libgomp/56073: benchmark regression due to PR libgomp/51376 fix
- References: <20130122110324.GL3244@bubble.grove.modra.org> <1358860823.3101.225.camel@triegel.csb>
On Tue, Jan 22, 2013 at 02:20:23PM +0100, Torvald Riegel wrote:
> > @@ -116,11 +116,10 @@ GOMP_task (void (*fn) (void *), void *data, void (
> > }
> > else
> > fn (data);
> > - if (team != NULL)
> > + if (task.children != NULL)
>
> Why does this not need an acquire barrier?
I explained that in my patch submission. Briefly, the only way this
particular task.children can be set is if this thread's task work
function creates children. So since the setter is *this* thread, we
need no barriers. We can have task.children set by the current
thread then changed by a child thread, but seeing a stale non-NULL
value is not a problem here. Once past the task_lock acquisition,
this thread will see the real value of task.children.
> > {
> > gomp_mutex_lock (&team->task_lock);
> > - if (task.children != NULL)
>
> Are you sure that you can remove this check to task.children here,
> especially if you don't use an acquire barrier previously?
>
> Can there be several threads trying to call gomp_clear_parent?
Yes and yes. gomp_clear_parent handles a NULL arg, and as you can
see, runs inside a task_lock mutex region here.
> > - gomp_clear_parent (task.children);
> > + gomp_clear_parent (task.children);
> > gomp_mutex_unlock (&team->task_lock);
>
> Please add comments or reference the comments in the other pieces of
> synchronizing code related to this.
Please note that all of the above is a reversion! Adding a comment
has the risk that the comment is *wrong*. That said, I appreciate the
need to document threaded code well..
> > - if (task == NULL || team == NULL)
> > + if (task == NULL
> > + || __atomic_load_n (&task->children, MEMMODEL_ACQUIRE) == NULL)
>
> Please mention with which stores this acquire load synchronizes.
How does this look?
* task.c (GOMP_task, GOMP_taskwait): Comment.
Index: libgomp/task.c
===================================================================
--- libgomp/task.c (revision 195370)
+++ libgomp/task.c (working copy)
@@ -116,6 +116,15 @@
}
else
fn (data);
+ /* Access to "children" is normally done inside a task_lock
+ mutex region, but the only way this particular task.children
+ can be set is if this thread's task work function (fn)
+ creates children. So since the setter is *this* thread, we
+ need no barriers here when testing for non-NULL. We can have
+ task.children set by the current thread then changed by a
+ child thread, but seeing a stale non-NULL value is not a
+ problem. Once past the task_lock acquisition, this thread
+ will see the real value of task.children. */
if (task.children != NULL)
{
gomp_mutex_lock (&team->task_lock);
@@ -296,6 +305,12 @@
struct gomp_task *child_task = NULL;
struct gomp_task *to_free = NULL;
+ /* The acquire barrier on load of task->children here synchronizes
+ with the write of a NULL in gomp_barrier_handle_tasks. It is
+ not necessary that we synchronize with other non-NULL writes at
+ this point, but we must ensure that all writes to memory by a
+ child thread task work function are seen before we exit from
+ GOMP_taskwait. */
if (task == NULL
|| __atomic_load_n (&task->children, MEMMODEL_ACQUIRE) == NULL)
return;
--
Alan Modra
Australia Development Lab, IBM