This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Next set of OpenACC changes: C family


On 05/05/2015 07:18 AM, Jakub Jelinek wrote:
> On Tue, May 05, 2015 at 10:57:28AM +0200, Thomas Schwinge wrote:
>> --- gcc/c-family/c-common.c
>> +++ gcc/c-family/c-common.c
>> @@ -809,7 +809,7 @@ const struct attribute_spec c_common_attribute_table[] =
>>  			      handle_omp_declare_simd_attribute, false },
>>    { "cilk simd function",     0, -1, true,  false, false,
>>  			      handle_omp_declare_simd_attribute, false },
>> -  { "omp declare target",     0, 0, true, false, false,
>> +  { "omp declare target",     0, -1, true, false, false,
>>  			      handle_omp_declare_target_attribute, false },
>>    { "alloc_align",	      1, 1, false, true, true,
>>  			      handle_alloc_align_attribute, false },
> 
> Can you explain this change?  "omp declare target" doesn't take any
> arguments, so "0, 0," looks right to me.

Because we are using that attribute for oacc routines, and routines may
have contain clauses.

Thinking about this some more, we could probably revert this change. I
have another patch to disable exception handling inside openacc
accelerated regions because the nvptx target doesn't support them. In
that patch I introduced a new "oacc function" attribute. Maybe we should
attach the acc routine clauses on that "oacc function" attribute.

>> @@ -823,6 +823,7 @@ const struct attribute_spec c_common_attribute_table[] =
>>  			      handle_bnd_legacy, false },
>>    { "bnd_instrument",         0, 0, true, false, false,
>>  			      handle_bnd_instrument, false },
>> +  { "oacc declare",           0, -1, true,  false, false, NULL, false },
>>    { NULL,                     0, 0, false, false, false, NULL, false }
> 
> If "oacc declare" is different, then supposedly you shouldn't reuse
> "omp declare target" attribute for the OpenACC thingie.

I'm not sure about this one. Oacc has enough quirks where it may be
justifiable though. I'll find out who wrote this patch.

>> --- gcc/c-family/c-omp.c
>> +++ gcc/c-family/c-omp.c
>> @@ -1087,3 +1087,108 @@ c_omp_predetermined_sharing (tree decl)
>>  
>>    return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
>>  }
>> +
>> +/* Return a numerical code representing the device_type.  Currently,
>> +   only device_type(nvidia) is supported.  All device_type parameters
>> +   are treated as case-insensitive keywords.  */
>> +
>> +int
>> +oacc_extract_device_id (const char *device)
>> +{
>> +  if (!strcasecmp (device, "nvidia"))
>> +    return GOMP_DEVICE_NVIDIA_PTX;
>> +  return GOMP_DEVICE_NONE;
>> +}
> 
> Why do you support just one particular device_type?  That sounds broken.
> You should just have some table with names <-> GOMP_DEVICE_* mappings.

I kind of wanted to keep this patch local in gomp-4_0-branch until it
was a little more functional. Adding proper support for device_type is
going to be more involved. For instance, the the tile clause changes the
shape of a loop, so if you have

  #pragma acc loop tile (2, 4) device_type (nvidia) tile (5, 5) \
     device_type (something_else) tile (1, 4)

we're going to have to generate three different versions of that
parallel region. Then we'd have to teach the compiler to the offload
regions with the proper number of gangs, workers and vectors, etc.

For our initial implementation, we just decided to support device_type
(nvidia), since openacc is really only working on nvptx and host
devices. And the runtime is rigged up to ignore num_gangs, num_workers
and vector_length for the host anyway. So that's why I filtered out the
device_type clauses in the front end.

Also, for full disclosure, we're parsing the tile clause, but we're not
actually tiling the loops yet. We're still in the process of getting the
oacc execution model working on the nvptx target. Things which are
"easy" to do in cpu threads (barriers and synchronization, global
memory, etc.) are not as straightforward on gpus, unfortunately.

>> +	  if (code & (1 << GOMP_DEVICE_NVIDIA_PTX))
>> +	    {
>> +	      if (seen_nvidia)
>> +		{
>> +		  seen_nvidia = NULL_TREE;
>> +		  error_at (OMP_CLAUSE_LOCATION (c),
>> +			    "duplicate device_type (nvidia)");
>> +		  goto filter_error;
>> +		}
>> +	      else
>> +		seen_nvidia = OMP_CLAUSE_DEVICE_TYPE_CLAUSES (c);
> 
> Again, I must say I don't like the hardcoding of one particular
> device type here.
> Doesn't Intel want to support OpenACC for XeonPhi?  What about HSA
> eventually, etc.?
> 
>> @@ -4624,7 +4657,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
>>  	  last_label = false;
>>  	  mark_valid_location_for_stdc_pragma (false);
>>  	  c_parser_declaration_or_fndef (parser, true, true, true, true,
>> -					 true, NULL, vNULL);
>> +					 true, NULL, vNULL, NULL_TREE, false);
> 
> Wouldn't default arguments be in order here?  Though, even those will mean
> compile time cost of passing all the zeros almost all the time.

I'll check who did this.

>> -/* OpenMP 2.5:
>> +/* OpenACC:
>> +   num_gangs ( expression )
>> +   num_workers ( expression )
>> +   vector_length ( expression )
>> +
>> +   OpenMP 2.5:
>>     num_threads ( expression ) */
>>  
>>  static tree
>> -c_parser_omp_clause_num_threads (c_parser *parser, tree list)
>> +c_parser_omp_positive_int_clause (c_parser *parser, pragma_omp_clause c_kind,
>> +				  const char *str, tree list)
>>  {
>> -  location_t num_threads_loc = c_parser_peek_token (parser)->location;
>> -  if (c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
>> +  omp_clause_code kind;
>> +  switch (c_kind)
> 
> This is undesirable, to add new clauses to the same handler you'd need
> to add them both in the caller and to this switch.  Perhaps pass
> omp_clause_code kind argument instead of pragma_omp_clause c_kind?
> 
>>  static tree
>> -c_parser_omp_clause_num_workers (c_parser *parser, tree list)
>> +c_parser_oacc_shape_clause (c_parser *parser, pragma_omp_clause c_kind,
>> +			    const char *str, tree list)
>>  {
>> -  location_t num_workers_loc = c_parser_peek_token (parser)->location;
>> -  if (c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
>> +  omp_clause_code kind;
>> +  const char *id = "num";
>> +
>> +  switch (c_kind)
> 
> Likewise.
> 
>> +/* Split the 'clauses' into a set of 'loop' clauses and a set of
>> +   'not-loop' clauses.  */
>>  
>>  static tree
>> -c_parser_oacc_kernels (location_t loc, c_parser *parser, char *p_name)
>> +oacc_split_loop_clauses (tree clauses, tree *not_loop_clauses)
> 
> Is this really C specific?  I mean, for OpenMP I'm sharing the clause
> splitting code between C and C++ FEs in c-omp.c.

Probably not. C++ support was added late. We'll clean up this patch.

Cesar


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]