This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug libgomp/83457] New: Add fhost-simt-vf


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83457

            Bug ID: 83457
           Summary: Add fhost-simt-vf
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Created attachment 42898
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42898&action=edit
Add fhost-simt-{vf,lane} and fassume-omp-{nthreads,threadid}

The SIMT code path for -fopenmp is only exercised in the offloading context, and
only for the nvptx target.

This demonstrator patch enables it partially on the host.

Using the patch, I managed to reproduce PR81778 on x86_64.

Consider test.c:
...
#include <stdio.h>

extern void abort ();

#define N 4

int
main ()
{
#pragma omp target parallel for simd schedule(static, 2) num_threads(1) 
  for (unsigned int i = N; i > 0 ; i -= 1)
    {
      printf ("%d\n", i);
      if (!(0 < i  && i <= N))
        abort ();
    }

  return 0;
}
...

The normal -fno-openmp execution is:
...
$ gcc test.c && ./a.out 
4
3
2
1
...

When executing with SIMT vectorization factor 2 on lane 0, all goes well:
...
$ gcc test.c -O2 -fopenmp -fhost-simt-vf=2 -fhost-simt-lane=0 && ./a.out 
4
2
...

But when executing for lane 1, we run into the error:
...
$ gcc test.c -O2 -fopenmp -fhost-simt-vf=2 -fhost-simt-lane=1 && ./a.out 
3
1
-1
Aborted (core dumped)
...

With two additional options to simplify the code:
...
$ gcc test.c -O2 -fopenmp \
  -fhost-simt-vf=2 \
  -fhost-simt-lane=1 \
  -fassume-omp-threadid=0 \
  -fassume-omp-nthreads=1 \
  -fdump-tree-all
...

in the optimized dump we have an unconditional abort:
...
;; Function main._omp_fn.1 (main._omp_fn.1, funcdef_no=13, decl_uid=2510, cgraph_uid=13, symbol_order=13) (executed once)

__attribute__((omp declare target))
main._omp_fn.1 (void * .omp_data_i)
{
  <bb 2> [local count: 403773193]:
  printf ("%d\n", 3);
  printf ("%d\n", 1);
  printf ("%d\n", 4294967295);
  abort ();

}
...

Or, without the abort, an infinite loop:
...
__attribute__((omp declare target))
main._omp_fn.1 (void * .omp_data_i)
{
  unsigned int i;

  <bb 2> [local count: 67098799]:
  printf ("%d\n", 3);

  <bb 3> [local count: 469691593]:
  # i_29 = PHI <1(2), i_36(3)>
  printf ("%d\n", i_29);
  i_36 = i_29 + 4294967294;
  goto <bb 3>; [100.00%]

}
...

So the error can be analyzed and fixed entirely on the host and in the pre-RTL
domain.
