This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [libgomp, WIP, GSoC'19] Modification to a single queue, single execution path.


Hi!

Thanks for the patch.

On Mon, Jul 22, 2019 at 04:40:21AM +0900, 김규래 wrote:

Can you please try to tweak your mailer settings?  In the text version of
the patch the mailer ate tab characters, so the patch can't apply, and in
the html version which we generally don't want to see on the mailing list,
one has to open it in some web browser, copy from there and replace \n\n
with \n to get something that can apply.

> @@ -447,7 +451,7 @@ struct gomp_task
>    /* Parent of this task.  */
>    struct gomp_task *parent;
>    /* Children of this task.  */
> -  struct priority_queue children_queue;
> +  /* struct priority_queue children_queue; */

Generally, we don't want to have code commented out like this in the final
patch submission.  For this WIP, I think it is acceptable, as I think in the
end you don't want to use the team's queue, but actually either
children_queue (renamed), but only use it on the implicit tasks, or during
team creation allocate together with the team structure also memory that
would be used as a trailing array for an array of the implicit queues next
to the array of implicit tasks.

>    /* The priority node for this task in each of the different queues.
>       We put this here to avoid allocating space for each priority
>       node.  Then we play offsetof() games to convert between pnode[]
>       entries and the gomp_task in which they reside.  */
> -  struct priority_node pnode[3];
> +  struct priority_node pnode;

If there is just one priority_node, the above comment can go,

> @@ -1211,7 +1218,7 @@ extern int gomp_test_nest_lock_25 (omp_nest_lock_25_t *) __GOMP_NOTHROW;
>  static inline size_t
>  priority_queue_offset (enum priority_queue_type type)
>  {
> -  return offsetof (struct gomp_task, pnode[(int) type]);
> +  return offsetof (struct gomp_task, pnode);

this whole routine as well and we just don't need offsetof anymore, nor
priority_queue_type anywhere, PQ_*, etc., just use the single pnode.

> @@ -182,8 +128,8 @@ gomp_task_handle_depend (struct gomp_task *task, struct gomp_task *parent,
>      }
>    else
>      {
> -      ndepend = (uintptr_t) depend[1]; /* total # */
> -      size_t nout = (uintptr_t) depend[2]; /* # of out: and inout: */
> +      ndepend = (uintptr_t) depend[1];      /* total # */
> +      size_t nout = (uintptr_t) depend[2];    /* # of out: and inout: */

The patch contains a lot of changes like the above one, a small portion
improves formatting, but most of them make it worse or are unnecessary.
The above one seems unnecessary (what is wrong is just space followed by
tab, that should never happen).

> @@ -235,8 +181,8 @@ gomp_task_handle_depend (struct gomp_task *task, struct gomp_task *parent,
>        task->depend[i].redundant = false;
>        task->depend[i].redundant_out = false;
>  
> -      hash_entry_type *slot = htab_find_slot (&parent->depend_hash,
> -       &task->depend[i], INSERT);
> +      hash_entry_type *slot
> + = htab_find_slot (&parent->depend_hash, &task->depend[i], INSERT);
>        hash_entry_type out = NULL, last = NULL;
>        if (*slot)

This change is correct formatting-wise, but the old one was correct too, so
there is no reason to change it like that.

>   {
> @@ -282,13 +228,12 @@ gomp_task_handle_depend (struct gomp_task *task, struct gomp_task *parent,
>   continue;
>         else if (tsk->dependers->n_elem == tsk->dependers->allocated)
>   {
> -   tsk->dependers->allocated
> -     = tsk->dependers->allocated * 2 + 2;
> +   tsk->dependers->allocated = tsk->dependers->allocated * 2 + 2;

This one is wrong, at the line is already too long after the change by 1
character.

>     tsk->dependers
>       = gomp_realloc (tsk->dependers,
>       sizeof (struct gomp_dependers_vec)
> -     + (tsk->dependers->allocated
> -        * sizeof (struct gomp_task *)));
> +       + (tsk->dependers->allocated
> + * sizeof (struct gomp_task *)));

The old formatting was good, the new one is not.
>   }
>         tsk->dependers->elem[tsk->dependers->n_elem++] = task;
>         task->num_dependees++;
> @@ -371,8 +316,7 @@ GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
>   {
>     if (thr->task->taskgroup->cancelled)
>       return;
> -   if (thr->task->taskgroup->workshare
> -       && thr->task->taskgroup->prev
> +   if (thr->task->taskgroup->workshare && thr->task->taskgroup->prev
>         && thr->task->taskgroup->prev->cancelled)

Again.  The coding conventions have a rule that if the whole condition
fits on one line, then it should be on one line, but if it doesn't, each
&& or || operand should go on a separate line (I know, there are several
spots where the code doesn't honor it, but this one wasn't the case).

>       return;
>   }
> @@ -383,8 +327,7 @@ GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
>    else if (priority > gomp_max_task_priority_var)
>      priority = gomp_max_task_priority_var;
>  
> -  if (!if_clause || team == NULL
> -      || (thr->task && thr->task->final_task)
> +  if (!if_clause || team == NULL || (thr->task && thr->task->final_task)
>        || team->task_count > 64 * team->nthreads)

Like here.  So if you wanted to change the formatting, the right change
would be to move || team == NULL on a separate line.

> @@ -429,12 +372,6 @@ GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
>   child thread, but seeing a stale non-NULL value is not a
>   problem.  Once past the task_lock acquisition, this thread
>   will see the real value of task.children.  */

The comment will need to go or be adjusted with this change eventually.

> @@ -449,8 +386,8 @@ GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
>        if (flags & GOMP_TASK_FLAG_DEPEND)
>   depend_size = ((uintptr_t) (depend[0] ? depend[0] : depend[1])
>          * sizeof (struct gomp_task_depend_entry));
> -      task = gomp_malloc (sizeof (*task) + depend_size
> -   + arg_size + arg_align - 1);
> +      task
> + = gomp_malloc (sizeof (*task) + depend_size + arg_size + arg_align - 1);

Too long again.

> -ialias (GOMP_taskgroup_start)
> -ialias (GOMP_taskgroup_end)
> -ialias (GOMP_taskgroup_reduction_register)
> +ialias (GOMP_taskgroup_start) ialias (GOMP_taskgroup_end)
> +  ialias (GOMP_taskgroup_reduction_register)

You don't want multiple ialiases on the same line, nor the indentation.

> @@ -563,10 +486,9 @@ ialias (GOMP_taskgroup_reduction_register)
>  #undef UTYPE
>  #undef GOMP_taskloop
>  
> -static void inline
> -priority_queue_move_task_first (enum priority_queue_type type,
> - struct priority_queue *head,
> - struct gomp_task *task)
> +    static void inline priority_queue_move_task_first (
> +      enum priority_queue_type type, struct priority_queue *head,
> +      struct gomp_task *task)

The coding conventions say the previous formatting was correct, the return
type etc. go on one line, but the function name starts a new line,
( at the end of line is highly undesirable.

>    task->kind = GOMP_TASK_WAITING;
> -  if (parent && parent->taskwait)
> +  if (parent)
>      {
> -      if (parent->taskwait->in_taskwait)
> +      if (parent->taskwait)

Why this change?  parent && parent->taskwait needs smaller indentation,
and I don't see you adding say else code for the parent->taskwait
case.

> @@ -674,10 +590,10 @@ static void gomp_task_run_post_handle_depend_hash (struct gomp_task *);
>  /* Called for nowait target tasks.  */
>  
>  bool
> -gomp_create_target_task (struct gomp_device_descr *devicep,
> - void (*fn) (void *), size_t mapnum, void **hostaddrs,
> - size_t *sizes, unsigned short *kinds,
> - unsigned int flags, void **depend, void **args,
> +gomp_create_target_task (struct gomp_device_descr *devicep, void (*fn) (void *),

Too long line.

> -      && --parent->taskwait->n_depend == 0
> -      && parent->taskwait->in_depend_wait)
> +      && --parent->taskwait->n_depend == 0 && parent->taskwait->in_depend_wait)
>      {
>        parent->taskwait->in_depend_wait = false;
>        gomp_sem_post (&parent->taskwait->taskwait_sem);
>      }

> + /*  Previously, the dependencies were upgraded their priorities.
> +     I'm not sure if not upgrading the depedencies will not lead
> +     to a possible deadlock in a single queue situation. */

For comment formatting, we do not want to start with two spaces after /*,
but want to have two spaces after full stop, even at the end, so
/* Previously ...
   I'm not ...
   ... situation.  */
>  
> - finish:
> +finish:

Labels are to be indented by one fewer columns compared to what follows,
so the old one was correct.
> @@ -2095,7 +1618,7 @@ gomp_reduction_register (uintptr_t *data, uintptr_t *old, uintptr_t *orig,
>        to hash also on the first sizeof (uintptr_t) bytes which contain
>        a pointer.  Hide the cast from the compiler.  */
>     hash_entry_type n;
> -   __asm ("" : "=g" (n) : "0" (p));
> +   __asm("" : "=g"(n) : "0"(p));

The spaces before ( are correct in all 3 spots.

> @@ -2192,14 +1715,13 @@ GOMP_taskgroup_reduction_unregister (uintptr_t *data)
>  }
>  ialias (GOMP_taskgroup_reduction_unregister)
>  
> -/* For i = 0 to cnt-1, remap ptrs[i] which is either address of the
> -   original list item or address of previously remapped original list
> -   item to address of the private copy, store that to ptrs[i].
> -   For i < cntorig, additionally set ptrs[cnt+i] to the address of
> -   the original list item.  */
> +  /* For i = 0 to cnt-1, remap ptrs[i] which is either address of the
> +     original list item or address of previously remapped original list
> +     item to address of the private copy, store that to ptrs[i].
> +     For i < cntorig, additionally set ptrs[cnt+i] to the address of
> +     the original list item.  */

Function comments should have /* at the start of the line, not indented
further.

>   gomp_fatal ("couldn't find matching task_reduction or reduction with "
> -     "task modifier for %p", ptrs[i]);
> +     "task modifier for %p",
> +     ptrs[i]);

No need to put the argument on a separate line.

As mentioned earlier, I think the dependency handling could be guarded by a
lock in the task whose parent_hash etc. is going to be used, for the
per-implicit task queues see above and it could again use a lock in the
implicit task whose queue will be used.

	Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]