This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFC: simd enabled functions (omp declare simd / elementals)


Hi,

On Thu, Oct 31, 2013 at 10:04:45PM -0500, Aldy Hernandez wrote:
> Hello gentlemen.  I'm CCing all of you, because each of you can
> provide valuable feedback to various parts of the compiler which I
> touch.  I have sprinkled love notes with your names throughout the
> post :).

sorry it took me so long, for various reasons I out of my control I've
accumulated quite a backlog of email and tasks last week and it took
me a lot of time to chew through it all.

...

> [Martin] I have adapted the ipa_parm_adjustment infrastructure to
> allow adding new arguments out of the blue like you mentioned was
> missing in ipa-prop.h.  I have also added support for creating
> vectors of arguments.  Could you take a look at my changes to
> ipa-prop.[ch]?

Sure, though I have only looked at ipa-* and tree-sra.c stuff.  I do
not have any real objections but would suggest a few amendments.  

I am glad this is becoming a useful infrastructure rather than just a
part of IPA-SRA.  Note that while ipa_combine_adjustments is not used
from anywhere and thus probably buggy anyway, it should in theory be
able to process new_param adjustments too.  Can you please at least
put a "not implemented" assert there?  (The reason is that the plan
still is to replace args_to_skip bitmaps in cgraphclones.c by
adjustments one day and we do need to combine clones.)

> 
> [Martin] I need to add new arguments in the case of inbranch clones,
> which add an additional vector with a mask as the last argument:
> For the following:
> 
> #pragma omp declare simd simdlen(4) inbranch
> int foo (int a)
> {
>   return a + 1234;
> }
> 
> ...we would generate a clone with:
> 
> vector(4) int
> foo.simdclone.0 (vector(4) int simd.4, vector(4) int mask.5)
> 
> I thought it best to enhance ipa_modify_formal_parameters() and
> associated machinery than to add the new argument ad-hoc.  We
> already have enough ways of doing tree and cgraph versioning in the
> compiler ;-).
> 

...

> gcc/ChangeLog.elementals
> 
> 	* Makefile.in (omp-low.o): Depend on PRETTY_PRINT_H and IPA_PROP_H.
> 	* tree-vect-stmts.c (vectorizable_call): Allow > 3 arguments when
> 	a SIMD clone may be available.
> 	(vectorizable_function): Use SIMD clone if available.
> 	* ipa-cp.c (determine_versionability): Nodes with SIMD clones are
> 	not versionable.
> 	* ggc.h (ggc_alloc_cleared_simd_clone_stat): New.
> 	* cgraph.h (enum linear_stride_type): New.
> 	(struct simd_clone_arg): New.
> 	(struct simd_clone): New.
> 	(struct cgraph_node): Add simdclone and simdclone_of fields.
> 	(get_simd_clone): Protoize.
> 	* cgraph.c (get_simd_clone): New.
> 	Add `has_simd_clones' field.
> 	* ipa-cp.c (determine_versionability): Disallow functions with
> 	simd clones.

(This looks like a repeated entry.)

> 	* ipa-prop.h (ipa_sra_modify_function_body): Protoize.
> 	(sra_ipa_modify_expr): Same.
> 	(struct ipa_parm_adjustment): Add new_arg_prefix and new_param
> 	fields.  Document their use.
> 	* ipa-prop.c (ipa_modify_formal_parameters): Handle creating brand
> 	new parameters and minor cleanups.
> 	* omp-low.c: Add new pass_omp_simd_clone support code.
> 	(make_pass_omp_simd_clone): New.
> 	(pass_data_omp_simd_clone): Declare.
> 	(class pass_omp_simd_clone): Declare.
> 	(vecsize_mangle): New.
> 	(ipa_omp_simd_clone): New.
> 	(simd_clone_clauses_extract): New.
> 	(simd_clone_compute_base_data_type): New.
> 	(simd_clone_compute_vecsize_and_simdlen): New.
> 	(simd_clone_create): New.
> 	(simd_clone_adjust_return_type): New.
> 	(simd_clone_adjust_return_types): New.
> 	(simd_clone_adjust): New.
> 	(simd_clone_init_simd_arrays): New.
> 	(ipa_simd_modify_function_body): New.
> 	(simd_clone_mangle): New.
> 	(simd_clone_struct_alloc): New.
> 	(simd_clone_struct_copy): New.
> 	(class argno_map): New.
> 	(argno_map::argno_map(tree)): New.
> 	(argno_map::~argno_map): New.
> 	(argno_map::operator []): New.
> 	(argno_map::length): New.
> 	(expand_simd_clones): New.
> 	(create_tmp_simd_array): New.
> 	* tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): New.
> 	* tree-core.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Document.
> 	* tree-pass.h (make_pass_omp_simd_clone): New.
> 	* passes.def (pass_omp_simd_clone): New.
> 	* target.def: Define new hook prefix "TARGET_CILKPLUS_".
> 	(default_vecsize_mangle): New.
> 	(vecsize_for_mangle): New.
> 	* doc/tm.texi.in: Add placeholder for
> 	TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE and
> 	TARGET_CILKPLUS_VECSIZE_FOR_MANGLE.
> 	* tree-sra.c (sra_ipa_modify_expr): Remove static modifier.
> 	(ipa_sra_modify_function_body): Same.
> 	* tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Define.
> 	* doc/tm.texi: Regenerate.
> 	* config/i386/i386.c (ix86_cilkplus_default_vecsize_mangle): New.
> 	(ix86_cilkplus_vecsize_for_mangle): New.
> 	(TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE): New.
> 	(TARGET_CILKPLUS_VECSIZE_FOR_MANGLE): New.
> 

...

> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index c38ba82..faae080 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -446,6 +446,13 @@ determine_versionability (struct cgraph_node *node)
>      reason = "not a tree_versionable_function";
>    else if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE)
>      reason = "insufficient body availability";
> +  else if (node->has_simd_clones)
> +    {
> +      /* Ideally we should clone the SIMD clones themselves and create
> +	 vector copies of them, so IPA-cp and SIMD clones can happily
> +	 coexist, but that may not be worth the effort.  */
> +      reason = "function has SIMD clones";
> +    }

Lets hope we will eventually fix this in some followup :-)


>  
>    if (reason && dump_file && !node->symbol.alias && !node->thunk.thunk_p)
>      fprintf (dump_file, "Function %s/%i is not versionable, reason: %s.\n",
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index 2fbc9d4..0c20dc6 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -3361,24 +3361,18 @@ void
>  ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>  			      const char *synth_parm_prefix)
>  {
> -  vec<tree> oparms, otypes;
> -  tree orig_type, new_type = NULL;
> -  tree old_arg_types, t, new_arg_types = NULL;
> -  tree parm, *link = &DECL_ARGUMENTS (fndecl);
> -  int i, len = adjustments.length ();
> -  tree new_reversed = NULL;
> -  bool care_for_types, last_parm_void;
> -
>    if (!synth_parm_prefix)
>      synth_parm_prefix = "SYNTH";
>  
> -  oparms = ipa_get_vector_of_formal_parms (fndecl);
> -  orig_type = TREE_TYPE (fndecl);
> -  old_arg_types = TYPE_ARG_TYPES (orig_type);
> +  vec<tree> oparms = ipa_get_vector_of_formal_parms (fndecl);
> +  tree orig_type = TREE_TYPE (fndecl);
> +  tree old_arg_types = TYPE_ARG_TYPES (orig_type);
>  
>    /* The following test is an ugly hack, some functions simply don't have any
>       arguments in their type.  This is probably a bug but well... */
> -  care_for_types = (old_arg_types != NULL_TREE);
> +  bool care_for_types = (old_arg_types != NULL_TREE);
> +  bool last_parm_void;
> +  vec<tree> otypes;
>    if (care_for_types)
>      {
>        last_parm_void = (TREE_VALUE (tree_last (old_arg_types))
> @@ -3395,13 +3389,20 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>        otypes.create (0);
>      }
>  
> -  for (i = 0; i < len; i++)
> +  int len = adjustments.length ();
> +  tree *link = &DECL_ARGUMENTS (fndecl);
> +  tree new_arg_types = NULL;
> +  for (int i = 0; i < len; i++)
>      {
>        struct ipa_parm_adjustment *adj;
>        gcc_assert (link);
>  
>        adj = &adjustments[i];
> -      parm = oparms[adj->base_index];
> +      tree parm;
> +      if (adj->new_param)

I don't know what I was thinking when I invented copy_param and
remove_param as multiple flags rather than a single enum, I probably
wasn't thinking at all.  I can change it myself as a followup if you
have more pressing tasks now.  Meanwhile, can you gcc_checking_assert
that at most one flag is set at appropriate places?

> +	parm = NULL;
> +      else
> +	parm = oparms[adj->base_index];
>        adj->base = parm;

I do not think it makes sense for new parameters to have a base which
is basically the old decl.  Do you have any reasons for not setting it
to NULL?

>  
>        if (adj->copy_param)
> @@ -3417,8 +3418,18 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>  	  tree new_parm;
>  	  tree ptype;
>  

> -	  if (adj->by_ref)
> -	    ptype = build_pointer_type (adj->type);

Please add gcc_checking_assert (!adj->by_ref || adj->simdlen == 0)
here...

> +	  if (adj->simdlen)
> +	    {
> +	      /* If we have a non-null simdlen but by_ref is true, we
> +		 want a vector of pointers.  Build the vector of
> +		 pointers here, not a pointer to a vector in the
> +		 adj->by_ref case below.  */
> +	      ptype = build_vector_type (adj->type, adj->simdlen);
> +	    }
> +	  else if (adj->by_ref)

...or remove this else and be able to build a pointer to the vector
if by_ref is true.

> +	    {
> +	      ptype = build_pointer_type (adj->type);
> +	    }
>  	  else
>  	    ptype = adj->type;
>  
> @@ -3427,8 +3438,9 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>  
>  	  new_parm = build_decl (UNKNOWN_LOCATION, PARM_DECL, NULL_TREE,
>  				 ptype);
> -	  DECL_NAME (new_parm) = create_tmp_var_name (synth_parm_prefix);
> -
> +	  const char *prefix
> +	    = adj->new_param ? adj->new_arg_prefix : synth_parm_prefix;

Can we perhaps get rid of synth_parm_prefix then and just have
adj->new_arg_prefix?  It's not particularly important but this is
weird.


> +	  DECL_NAME (new_parm) = create_tmp_var_name (prefix);
>  	  DECL_ARTIFICIAL (new_parm) = 1;
>  	  DECL_ARG_TYPE (new_parm) = ptype;
>  	  DECL_CONTEXT (new_parm) = fndecl;
> @@ -3436,17 +3448,20 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>  	  DECL_IGNORED_P (new_parm) = 1;
>  	  layout_decl (new_parm, 0);
>  
> -	  adj->base = parm;
> +	  if (adj->new_param)
> +	    adj->base = new_parm;

Again, shouldn't this be NULL?

> +	  else
> +	    adj->base = parm;
>  	  adj->reduction = new_parm;
>  
>  	  *link = new_parm;
> -
>  	  link = &DECL_CHAIN (new_parm);
>  	}
>      }
>  
>    *link = NULL_TREE;
>  
> +  tree new_reversed = NULL;
>    if (care_for_types)
>      {
>        new_reversed = nreverse (new_arg_types);
> @@ -3464,6 +3479,7 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>       Exception is METHOD_TYPEs must have THIS argument.
>       When we are asked to remove it, we need to build new FUNCTION_TYPE
>       instead.  */
> +  tree new_type = NULL;
>    if (TREE_CODE (orig_type) != METHOD_TYPE
>         || (adjustments[0].copy_param
>  	  && adjustments[0].base_index == 0))
> @@ -3489,7 +3505,7 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments,
>  
>    /* This is a new type, not a copy of an old type.  Need to reassociate
>       variants.  We can handle everything except the main variant lazily.  */
> -  t = TYPE_MAIN_VARIANT (orig_type);
> +  tree t = TYPE_MAIN_VARIANT (orig_type);
>    if (orig_type != t)
>      {
>        TYPE_MAIN_VARIANT (new_type) = t;
> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
> index 48634d2..8d7d9b9 100644
> --- a/gcc/ipa-prop.h
> +++ b/gcc/ipa-prop.h
> @@ -634,9 +634,10 @@ struct ipa_parm_adjustment
>       arguments.  */
>    tree alias_ptr_type;
>  
> -  /* The new declaration when creating/replacing a parameter.  Created by
> -     ipa_modify_formal_parameters, useful for functions modifying the body
> -     accordingly. */
> +  /* The new declaration when creating/replacing a parameter.  Created
> +     by ipa_modify_formal_parameters, useful for functions modifying
> +     the body accordingly.  For brand new arguments, this is the newly
> +     created argument.  */
>    tree reduction;

We should eventually rename this to new_decl or something, given that
this is not an SRA thing any more.  But that can be done later.

>  
>    /* New declaration of a substitute variable that we may use to replace all
> @@ -647,15 +648,36 @@ struct ipa_parm_adjustment
>       is NULL), this is going to be its nonlocalized vars value.  */
>    tree nonlocal_value;
>  
> +  /* If this is a brand new argument, this holds the prefix to be used
> +     for the DECL_NAME.  */
> +  const char *new_arg_prefix;
> +
>    /* Offset into the original parameter (for the cases when the new parameter
>       is a component of an original one).  */
>    HOST_WIDE_INT offset;
>  
> -  /* Zero based index of the original parameter this one is based on.  (ATM
> -     there is no way to insert a new parameter out of the blue because there is
> -     no need but if it arises the code can be easily exteded to do so.)  */
> +  /* Zero based index of the original parameter this one is based on.  */
>    int base_index;
>  
> +  /* If non-null, the parameter is a vector of `type' with this many
> +     elements.  */
> +  int simdlen;
> +
> +  /* This is a brand new parameter.
> +
> +     For new parameters, base_index must be >= the number of
> +     DECL_ARGUMENTS in the function.  That is, new arguments will be
> +     the last arguments in the adjusted function.
> +
> +     ?? Perhaps we could redesign ipa_modify_formal_parameters() to
> +     reorganize argument position, thus allowing inserting of brand
> +     new arguments anywhere, but there is no use for this now.

Where does this requirement come from?  At least at the moment I
cannot see why ipa_modify_formal_parameters wouldn't be able to
reorder parameters as it is?  What breaks if base_index of adjustments
for new parameters has zero or a nonsensical value?

> +
> +     Also, `type' should be set to the new type, `new_arg_prefix'
> +     should be set to the string prefix for the new DECL_NAME, and
> +     `reduction' will ultimately hold the newly created argument.  */
> +  unsigned new_param : 1;
> +
>    /* This new parameter is an unmodified parameter at index base_index. */
>    unsigned copy_param : 1;
>  
> @@ -697,5 +719,7 @@ void ipa_dump_param (FILE *, struct ipa_node_params *info, int i);
>  /* From tree-sra.c:  */
>  tree build_ref_for_offset (location_t, tree, HOST_WIDE_INT, tree,
>  			   gimple_stmt_iterator *, bool);
> +bool ipa_sra_modify_function_body (ipa_parm_adjustment_vec);
> +bool sra_ipa_modify_expr (tree *, bool, ipa_parm_adjustment_vec);
>  

Hm, if you can directly use these, I really think you should rename
them somehow so that their names do not contain SRA and move them to
ipa-prop.c.

Thanks for reviving this slightly moribund infrastructure and sorry
again for the delay,

Martin


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]