This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [gomp4] Library side of depend clause support

From: Jakub Jelinek <jakub at redhat dot com>
To: Richard Henderson <rth at redhat dot com>
Cc: Torvald Riegel <triegel at redhat dot com>, gcc-patches at gcc dot gnu dot org
Date: Fri, 27 Sep 2013 01:48:36 +0200
Subject: Re: [gomp4] Library side of depend clause support
Authentication-results: sourceware.org; auth=none
References: <20130926183624 dot GI30970 at tucnak dot zalov dot cz> <5244BB11 dot 2030708 at redhat dot com>
Reply-to: Jakub Jelinek <jakub at redhat dot com>

On Thu, Sep 26, 2013 at 03:54:09PM -0700, Richard Henderson wrote:
> On 09/26/2013 11:36 AM, Jakub Jelinek wrote:
> > +struct gomp_task;
> >  struct gomp_taskgroup;
> > +struct htab;
> > +
> > +struct gomp_task_depend_entry
> > +{
> > +  void *addr;
> > +  struct gomp_task_depend_entry *next;
> > +  struct gomp_task_depend_entry *prev;
> > +  struct gomp_task *task;
> > +  bool is_in;
> > +  bool redundant;
> > +};
> 
> I'm a bit confused about the combination of linked lists and reallocated
> arrays.  When did you decide to use what?

I initially wanted to use linked lists only, but while I can statically
preallocate the chains for the hash table, for the depender -> dependee
chains where a task may depend on many other tasks that would mean having to
allocate small structures (or pool allocate them, per team?).

> I wonder if we ought always defer tasks with dependencies and skip this lock
> and search?  Unless the taskgroup is truly weird, we *are* going to have
> dependencies between the tasks.  Dunno what exactly to do with final_tasks
> that have unfulfilled dependencies...

I think final tasks aren't a problem, if the parent is a final task, then
all the children are non-deferred, thus we never record any dependencies
and the test for that will be cheap too (because parent->depend_hash will be
NULL).  The problem is if (0) tasks, the spec says that they must be
non-deferred unless they depend on some earlier non-finished task.  But
the cost in that case is primarily in taking the lock/unlock; the search
will stop on the first dependency found, if there aren't any, nothing will
be recorded and we don't jump to the defer label, if there are some, as soon
as we discover first we jump there.

> I also think it would significantly clean up the code to declare a struct with
> a variable tail for the depend argument.  That would eliminate all of the
> casting to uintptr_t and give names to the first two entries.

I agree if we keep using realloced vectors that flexible array would make it
cleaner.

> Perhaps what we need are smaller linked list entries like
> 
>   struct gomp_task_depend_node {
>      struct gomp_task *task;
>      struct gomp_task_depend_node *next;
>      struct gomp_task_depend_node *prev;
>   };

The dependee -> depender vectors resp. linked lists are just pushed to first
(the only thing needed during insertion is to have a cheap check if the last
inserted task is the current one, to avoid having the same task multiple
times in the vector/linked list), and then just walked once when the
dependee finishes, so no removal is needed there, it can be freed at once;
thus, for linked list, it would be enough to use non-doubly linked list for
that.  For the hash table chains, unless we want to always lookup the hash
table and walk the chains for removal, we need doubly linked list.
> 
> and a different hash table entry like
> 
>   struct gomp_task_depend_head {
>     void *addr;
>     struct gomp_task_depend_node *outs;
>     struct gomp_task_depend_node *ins;
>     size_t n_ins;
>   };

You mean that the hash table instead would contain the structures, or
pointers to these structures?  If the latter (not sure what n_ins would be
for), then we'd again need to pool alloc them.
> 
> If we scan the ndepend entries twice, we can find out how many nodes we need,
> and allocate them with the task like you do now.  Scanning the ndepends array
> twice can be sped by only looking up the hash table entries once -- allocate a
> local array of size ndepend entries and cache the lookups.
> 
> I'd say we don't need a count of the n_outs because all of them on the list
> must be sequentially dependent.  Thus any new task simply depends on the
> previous task in the outs list.  Thus imo it makes sense to have ins/outs point
> to the tail of the list as opposed to the head.

Ah, haven't thought about it this way, yes, you're right that for out/inout
dependencies it is enough to remember in the hash table the last one,
because the dependencies will form a chain on the same address and serialize
the tasks.

> Is is really worthwhile to detect redundant dependencies?  It seems just as
> easy to add multiple dependencies and let them just fall out naturally.

I just didn't want to have duplicates in the hash table chains, the
redundant flag is just a sign that the entry doesn't need to be removed
from the hash table chains.

> OTOH, perhaps you should just go ahead with this patch and we can evolve it
> incrementally.  I don't see anything technically wrong with it.

Perhaps.  What if I do just minor cleanup (use flexible array members for
the reallocated vectors, and perhaps keep only the last out/inout task
in the hash table chains rather than all of them), retest, commit and then
we can discuss/incrementally improve it?

	Jakub

Follow-Ups:
- Re: [gomp4] Library side of depend clause support
  - From: Jakub Jelinek

References:
- [gomp4] Library side of depend clause support
  - From: Jakub Jelinek
- Re: [gomp4] Library side of depend clause support
  - From: Richard Henderson

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]