This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [WIP] [PR fortran/72741] Rework Fortran OpenACC routine clause handling


On 08/11/2016 08:18 AM, Thomas Schwinge wrote:

>> --- a/gcc/fortran/module.c
>> +++ b/gcc/fortran/module.c
> 
>>  [...]
>> +DECL_MIO_NAME (oacc_function)
>>  [...]
> 
> As discussed between Cesar and Tobias, these module.c/symbol.c changes
> introduce an incompatibility in the Fortran module file format, which
> we'll want to avoid.  Reverting that to use individual bit flags instead
> of the "enum oacc_function", I think that we're safe (but I have not
> verified that).  On the other hand, given that I'm not at all handling in
> module.c/symbol.c the new "locus omp_clauses_locus" and "struct
> symbol_attribute *next" members that I'm adding to "symbol_attribute",
> I'm not sure whether I'm actually testing this properly.  ;-) I guess I'm
> not.

How are you testing it? Basically, what you need to do is create two
source files, one containing a module and another with the program unit.
Then compile one of those files with the old, say gcc6 fortran, and the
other with trunk gfortran and try to link the .o files together.

I've attached some test cases so that you can experiment with. Each
driver file corresponds to a test file, with the exception of
test-driver which uses both test-interface.f90 and test-module.f90.

>> --- a/gcc/fortran/openmp.c
>> +++ b/gcc/fortran/openmp.c
> 
>> @@ -1814,7 +1824,10 @@ gfc_match_oacc_routine (void)
>>  	  != MATCH_YES))
>>      return MATCH_ERROR;
>>  
>> -  if (sym != NULL)
>> +  if (isym != NULL)
>> +    /* There is nothing to do for intrinsic procedures.  */
>> +    ;
> 
> We will want to check that no incompatible clauses are being specified,
> for example (but, low priority).  I'm adding a hacky implementation of
> that.

So this is what I was overlooking in PR72741. For some reason I was only
considering invalid clauses of the form

  !$acc routine gang worker

and not actually checking for compatible parallelism at the call sites.
The title "Fortran OpenACC routine directive doesn't properly handle
clauses specifying the level of parallelism" was kind of misleading.

Shouldn't the oaccdevlow pass already catch these types of errors already?

>> --- /dev/null
>> +++ b/gcc/testsuite/gfortran.dg/goacc/routine-7.f90
>> @@ -0,0 +1,69 @@
>> +! Test acc routines inside modules.
>> +
>> +! { dg-additional-options "-O0" }
> 
> -O0 to prevent inlining of functions tagged with OpenACC routine
> directives, or another reason?

I'm not sure why, but that's probably it.

>> --- a/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
>> @@ -1,121 +1,95 @@
>> +! Test acc routines inside modules.
>>  
>>  ! { dg-do run }
>> -! { dg-additional-options "-cpp" }
>>  
>> -#define M 8
>> -#define N 32
>> +module routines
>> +  integer, parameter :: N = 32
>>  
>> -program main
>> -  integer :: i
>> -  integer :: a(N)
>> -  integer :: b(M * N)
>> -
>> -  do i = 1, N
>> -    a(i) = 0
>> -  end do
>> +contains
>> +  subroutine vector (a)
>> +    implicit none
>> +    !$acc routine vector
>> +    integer, intent (inout) :: a(N)
>> +    integer :: i
>> [...]
> 
> This seems to completely rewrite the test case.  Is that intentional, or
> should the original test case be preserved, and the changed/new/rewritten
> one be added as a new test case?

The original test was completely bogus because it had a lot of race
conditions when writing to variables. I could restore it, but then you'd
need to remove all of the gang, worker, vector clauses and force it to
run in seq. But that would defeat the intent behind the patch.

> Now, my hacky WIP patch.
> 
> One big chunk of the gcc/fortran/gfortran.h changes is just to move some
> stuff around, without any changes, so that I can use "locus" in
> "symbol_attribute".

I agree with Jakub about creating a new gfc_acc_routine struct to
contain the locus and clauses for acc routines. That way, you can also
link them together for device_type.

But at the same time, since device_type isn't a priority for in the near
term, we might be better off using the existing oacc_function and nohost
attribute bits instead of introducing a new struct.

> I very much "cargo cult"ed all that "oacc_routine*" bit flag stuff in
> module.c/symbol.c, replicating what's being done for "omp target
> declare", without really knowing what I'm doing there.  I will appreciate
> test cases actually exercising this code -- which doesn't currently at
> all handle the new "locus omp_clauses_locus" and "struct symbol_attribute
> *next" members that I'm adding to "symbol_attribute", as I've mentioned
> before.  (But I suppose it should?)
> 
> We're not implementing the OpenACC device_type clause at the moment, so
> the "TODO: handle device_type clauses" comment in
> gcc/fortran/openmp.c:gfc_match_oacc_routine is not a concern right now.
> 
> With these changes, we're now actually also paying attention the clauses
> specified with the OpenACC routine directive with a name -- one of the
> things mentioned as missing in <http://gcc.gnu.org/PR72741> "Fortran
> OpenACC routine directive doesn't properly handle clauses specifying the
> level of parallelism".
> 
> To handle several "pending" OpenACC routine directives, I had to add the
> "struct symbol_attribute *next" member to "symbol_attribute" -- I hope
> that doesn't disqualify the proposed changes as too ugly.  (Several other
> structs already contain such "next" pointers, and the use is very much
> confined to only the OpenACC routine directive.)  I will of course be
> happy to learn about a better/different way to do this.
> 
> commit ca4a098dab72f27c6e1121aa7e5e49764921974e
> Author: Thomas Schwinge <thomas@codesourcery.com>
> Date:   Thu Aug 11 16:34:22 2016 +0200
> 
>     [WIP] [PR fortran/72741] Rework Fortran OpenACC routine clause handling

>  /* Structure and list of supported extension attributes.  */
>  typedef enum
>  {
> @@ -729,7 +839,7 @@ ext_attr_t;
>  extern const ext_attr_t ext_attr_list[];
>  
>  /* Symbol attribute structure.  */
> -typedef struct
> +typedef struct symbol_attribute
>  {
>    /* Variable attributes.  */
>    unsigned allocatable:1, dimension:1, codimension:1, external:1, intrinsic:1,
> @@ -864,6 +974,13 @@ typedef struct
>    /* Mentioned in OMP DECLARE TARGET.  */
>    unsigned omp_declare_target:1;
>  
> +  /* OpenACC routine.  */
> +  unsigned oacc_routine:1;
> +  unsigned oacc_routine_gang:1;
> +  unsigned oacc_routine_worker:1;
> +  unsigned oacc_routine_vector:1;
> +  unsigned oacc_routine_seq:1;
> +
>    /* Mentioned in OACC DECLARE.  */
>    unsigned oacc_declare_create:1;
>    unsigned oacc_declare_copyin:1;
> @@ -871,137 +988,24 @@ typedef struct
>    unsigned oacc_declare_device_resident:1;
>    unsigned oacc_declare_link:1;
>  
> -  /* This is an OpenACC acclerator function at level N - 1  */
> -  ENUM_BITFIELD (oacc_function) oacc_function:3;
> -

I'm not sure what's better from a stylistic standpoint. Personally, I'd
prefer if all of these extra bits were coalesced into an oacc_routine
and oacc_declare enums. At least for acc routines, gang, worker, vector
and seq are all mutually exclusive.

> +++ gcc/fortran/module.c
> @@ -1986,6 +1986,7 @@ enum ab_attribute
>    AB_IS_CLASS, AB_PROCEDURE, AB_PROC_POINTER, AB_ASYNCHRONOUS, AB_CODIMENSION,
>    AB_COARRAY_COMP, AB_VTYPE, AB_VTAB, AB_CONTIGUOUS, AB_CLASS_POINTER,
>    AB_IMPLICIT_PURE, AB_ARTIFICIAL, AB_UNLIMITED_POLY, AB_OMP_DECLARE_TARGET,
> +  AB_OACC_ROUTINE, AB_OACC_ROUTINE_GANG, AB_OACC_ROUTINE_WORKER, AB_OACC_ROUTINE_VECTOR, AB_OACC_ROUTINE_SEQ,
>    AB_ARRAY_OUTER_DEPENDENCY, AB_MODULE_PROCEDURE, AB_OACC_DECLARE_CREATE,
>    AB_OACC_DECLARE_COPYIN, AB_OACC_DECLARE_DEVICEPTR,
>    AB_OACC_DECLARE_DEVICE_RESIDENT, AB_OACC_DECLARE_LINK
> @@ -2044,6 +2045,11 @@ static const mstring attr_bits[] =
>      minit ("IMPLICIT_PURE", AB_IMPLICIT_PURE),
>      minit ("UNLIMITED_POLY", AB_UNLIMITED_POLY),
>      minit ("OMP_DECLARE_TARGET", AB_OMP_DECLARE_TARGET),
> +    minit ("OACC_ROUTINE", AB_OACC_ROUTINE),
> +    minit ("OACC_ROUTINE_GANG", AB_OACC_ROUTINE_GANG),
> +    minit ("OACC_ROUTINE_WORKER", AB_OACC_ROUTINE_WORKER),
> +    minit ("OACC_ROUTINE_VECTOR", AB_OACC_ROUTINE_VECTOR),
> +    minit ("OACC_ROUTINE_SEQ", AB_OACC_ROUTINE_SEQ),
>      minit ("ARRAY_OUTER_DEPENDENCY", AB_ARRAY_OUTER_DEPENDENCY),
>      minit ("MODULE_PROCEDURE", AB_MODULE_PROCEDURE),
>      minit ("OACC_DECLARE_CREATE", AB_OACC_DECLARE_CREATE),
> @@ -2095,7 +2101,6 @@ DECL_MIO_NAME (procedure_type)
>  DECL_MIO_NAME (ref_type)
>  DECL_MIO_NAME (sym_flavor)
>  DECL_MIO_NAME (sym_intent)
> -DECL_MIO_NAME (oacc_function)
>  #undef DECL_MIO_NAME
>  
>  /* Symbol attributes are stored in list with the first three elements
> @@ -2117,8 +2122,6 @@ mio_symbol_attribute (symbol_attribute *attr)
>    attr->proc = MIO_NAME (procedure_type) (attr->proc, procedures);
>    attr->if_source = MIO_NAME (ifsrc) (attr->if_source, ifsrc_types);
>    attr->save = MIO_NAME (save_state) (attr->save, save_status);
> -  attr->oacc_function = MIO_NAME (oacc_function) (attr->oacc_function,
> -						  oacc_function_types);
>  
>    ext_attr = attr->ext_attr;
>    mio_integer ((int *) &ext_attr);
> @@ -2236,6 +2239,16 @@ mio_symbol_attribute (symbol_attribute *attr)
>  	MIO_NAME (ab_attribute) (AB_VTAB, attr_bits);
>        if (attr->omp_declare_target)
>  	MIO_NAME (ab_attribute) (AB_OMP_DECLARE_TARGET, attr_bits);
> +      if (attr->oacc_routine)
> +	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE, attr_bits);
> +      if (attr->oacc_routine_gang)
> +	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_GANG, attr_bits);
> +      if (attr->oacc_routine_worker)
> +	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_WORKER, attr_bits);
> +      if (attr->oacc_routine_vector)
> +	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_VECTOR, attr_bits);
> +      if (attr->oacc_routine_seq)
> +	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_SEQ, attr_bits);
>        if (attr->array_outer_dependency)
>  	MIO_NAME (ab_attribute) (AB_ARRAY_OUTER_DEPENDENCY, attr_bits);
>        if (attr->module_procedure)
> @@ -2422,6 +2435,21 @@ mio_symbol_attribute (symbol_attribute *attr)
>  	    case AB_OMP_DECLARE_TARGET:
>  	      attr->omp_declare_target = 1;
>  	      break;
> +	    case AB_OACC_ROUTINE:
> +	      attr->oacc_routine = 1;
> +	      break;
> +	    case AB_OACC_ROUTINE_GANG:
> +	      attr->oacc_routine_gang = 1;
> +	      break;
> +	    case AB_OACC_ROUTINE_WORKER:
> +	      attr->oacc_routine_worker = 1;
> +	      break;
> +	    case AB_OACC_ROUTINE_VECTOR:
> +	      attr->oacc_routine_vector = 1;
> +	      break;
> +	    case AB_OACC_ROUTINE_SEQ:
> +	      attr->oacc_routine_seq = 1;
> +	      break;
>  	    case AB_ARRAY_OUTER_DEPENDENCY:
>  	      attr->array_outer_dependency =1;
>  	      break;

That seems similar to what my patch is did, albeit with some checking
deferred. I don't think this would maintain backwards compatibility with
object files generated by older versions of gcc.

Regarding backwards compatibility, maybe we should teach gfortran to
default to seq parallelism if an oacc_function attribute is missing in
an older version of the .mod file? I'm not sure if there's anything we
can do about forwards compatibility, i.e., linking a module generated by
gcc7 with gcc6.

> diff --git gcc/fortran/openmp.c gcc/fortran/openmp.c
> index 05e4661..5a69e38 100644
> --- gcc/fortran/openmp.c
> +++ gcc/fortran/openmp.c
> @@ -1714,44 +1714,6 @@ gfc_match_oacc_cache (void)
>    return MATCH_YES;
>  }
>  
> -/* Determine the loop level for a routine.  Returns OACC_FUNCTION_NONE if
> -   any error is detected.  */
> -
> -static oacc_function
> -gfc_oacc_routine_dims (gfc_omp_clauses *clauses)
> -{
> -  int level = -1;
> -  oacc_function ret = OACC_FUNCTION_SEQ;
> -
> -  if (clauses)
> -    {
> -      unsigned mask = 0;
> -
> -      if (clauses->gang)
> -	{
> -	  level = GOMP_DIM_GANG, mask |= GOMP_DIM_MASK (level);
> -	  ret = OACC_FUNCTION_GANG;
> -	}
> -      if (clauses->worker)
> -	{
> -	  level = GOMP_DIM_WORKER, mask |= GOMP_DIM_MASK (level);
> -	  ret = OACC_FUNCTION_WORKER;
> -	}
> -      if (clauses->vector)
> -	{
> -	  level = GOMP_DIM_VECTOR, mask |= GOMP_DIM_MASK (level);
> -	  ret = OACC_FUNCTION_VECTOR;
> -	}
> -      if (clauses->seq)
> -	level = GOMP_DIM_MAX, mask |= GOMP_DIM_MASK (level);
> -
> -      if (mask != (mask & -mask))
> -	ret = OACC_FUNCTION_NONE;
> -    }
> -
> -  return ret;
> -}
> -
>  match
>  gfc_match_oacc_routine (void)
>  {
> @@ -1761,7 +1723,8 @@ gfc_match_oacc_routine (void)
>    gfc_omp_clauses *c = NULL;
>    gfc_oacc_routine_name *n = NULL;
>    gfc_intrinsic_sym *isym = NULL;
> -  oacc_function dims = OACC_FUNCTION_NONE;
> +  symbol_attribute *add_attr = NULL;
> +  const char *add_attr_name = NULL;
>  
>    old_loc = gfc_current_locus;
>  
> @@ -1828,19 +1791,26 @@ gfc_match_oacc_routine (void)
>  	  != MATCH_YES))
>      return MATCH_ERROR;
>  
> -  dims = gfc_oacc_routine_dims (c);
> -  if (dims == OACC_FUNCTION_NONE)
> -    {
> -      gfc_error ("Multiple loop axes specified for routine %C");
> -      gfc_current_locus = old_loc;
> -      return MATCH_ERROR;
> -    }
> -
>    if (isym != NULL)
> -    /* There is nothing to do for intrinsic procedures.  */
> -    ;
> +    {
> +      //TODO gfc_intrinsic_sym doesn't have symbol_attribute?
> +      //add_attr = &isym->attr;
> +      //add_attr_name = NULL; //TODO
> +      /* Fake it.  TODO: handle device_type clauses...  */
> +      if (c->gang || c->worker || c->vector)
> +	{
> +	  gfc_error ("Intrinsic symbol specified in !$ACC ROUTINE ( NAME )"
> +		     " at %C, with incompatible clauses specifying the level"
> +		     " of parallelism");
> +	  gfc_current_locus = old_loc;
> +	  return MATCH_ERROR;
> +	}
> +    }
>    else if (sym != NULL)
>      {
> +      add_attr = &sym->attr;
> +      add_attr_name = NULL; //TODO
> +
>        n = gfc_get_oacc_routine_name ();
>        n->sym = sym;
>        n->clauses = NULL;
> @@ -1852,11 +1822,41 @@ gfc_match_oacc_routine (void)
>      }
>    else if (gfc_current_ns->proc_name)
>      {
> -      if (!gfc_add_omp_declare_target (&gfc_current_ns->proc_name->attr,
> -				       gfc_current_ns->proc_name->name,
> -				       &old_loc))
> +      add_attr = &gfc_current_ns->proc_name->attr;
> +      add_attr_name = gfc_current_ns->proc_name->name;
> +    }
> +  else
> +    gcc_unreachable ();
> +
> +  if (add_attr != NULL)
> +    {
> +      if (!gfc_add_omp_declare_target (add_attr, add_attr_name, &old_loc))
>  	goto cleanup;
> -      gfc_current_ns->proc_name->attr.oacc_function = dims;
> +      /* Skip over any existing symbol attributes capturing OpenACC routine
> +	 directives.  */
> +      while (add_attr->next != NULL)
> +	add_attr = add_attr->next;
> +      if (add_attr->oacc_routine)
> +	{
> +	  add_attr->next = XCNEW (symbol_attribute);
> +	  gfc_clear_attr (add_attr->next);
> +	  add_attr = add_attr->next;
> +	}
> +      if (!gfc_add_oacc_routine (add_attr, add_attr_name, &old_loc))
> +	goto cleanup;
> +      if (c && c->gang
> +	  && !gfc_add_oacc_routine_gang (add_attr, add_attr_name, &old_loc))
> +	goto cleanup;
> +      if (c && c->worker
> +	  && !gfc_add_oacc_routine_worker (add_attr, add_attr_name, &old_loc))
> +	goto cleanup;
> +      if (c && c->vector
> +	  && !gfc_add_oacc_routine_vector (add_attr, add_attr_name, &old_loc))
> +	goto cleanup;
> +      if (c && c->seq
> +	  && !gfc_add_oacc_routine_seq (add_attr, add_attr_name, &old_loc))
> +	goto cleanup;
> +      add_attr->omp_clauses_locus = old_loc; //TODO OK to just assign that?
>      }

This is another stylistic thing I don't like. Instead of having a single
function for mutually exclusive attributes, you need five. And each of
those functions are extremely similar, and I copy and paste issues when
dealing with such functions in the past.

With that in mind, I do see some value in preserving the routine clauses
and location information for device_type. But I thought that device_type
was more of a future project.

Cesar

Attachment: module-test.tar
Description: Unix tar archive


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]