This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PING v2][PATCH] Make function clone name numbering independent.


Hi Martin, Richard,

Thanks for your responses.

On 2018-09-03 09:15 AM, Martin Jambor wrote:
> Hi,
> 
> On Mon, Sep 03 2018, Richard Biener wrote:
>> On Mon, Sep 3, 2018 at 12:02 PM Martin Jambor <mjambor@suse.cz> wrote:
>>>
>>> Hi,
>>>
>>> On Fri, Aug 31 2018, Michael Ploujnikov wrote:
>>>> I've done some more digging into the current uses of
>>>> numbered_clone_function_name and checked if any tests fail if I change
>>>> it to suffixed_function_name:
>>>>
>>>>   - gcc/cgraphclones.c:  DECL_NAME (new_decl) = numbered_clone_function_name (thunk->decl, "artificial_thunk");
>>>>     - no new tests fail, inconclusive
>>>>     - and despite the comment on redirect_callee_duplicating_thunks
>>>>       about "one or more" duplicates it doesn't seem like
>>>>       duplicate_thunk_for_node would be called more than once for each
>>>>       node, assuming each node is named uniquely, but I'm far from an
>>>>       expert in this area
>>>
>>> The comment means that if there is a chain of thunks, the method clones
>>> all of them.  Nevertheless, you need name numbering here for the same
>>> reason why you need them for constprop.
>>
>> The remaining question I had with the patch was if maybe all callers
>> can handle assigning
>> the numbering themselves, thus do sth like
>>
>>    for-each-clone-of (i, fn)
>>       DECL_NAME (...) = numbered_clone_function_name (...,
>> "artificial_thunk", i);
>>
>> which would make the map of name -> number unnecessary.

Please see my comment below.

>>
> 
> That is what I would prefer too but it also has downsides.  Callers
> would have to do their book keeping and the various cloning interfaces
> would get another parameter when already they have quite a few
> (introducing some cloning context classes seems like an overkill to me
> :-).

I'm fine with doing it the way you guys are suggesting, but just to
explain my original thinking: The way I see it is that there will
always be some kind of name -> number mapping, even if it's managed by
each individual user and even if it's actually cgraph_node ->
number. Needless to say I'm far from an expert in all of the involved
areas (this is my first contribution to GCC) so the most effective
approach I could come up is doing it all in one place. Now with your
expert advice and as I get a better understanding of how things like
sra and other clones are created, I'm starting to see that a more
tailored approach is possible and probably the way it should have been
done in the first place.

> 
> If we want to get rid of .constprop discrepancies,

Could you please clarify to me what exactly you mean by the
".constprop discrepancies"?

> something like the
> following (only very mildly tested) patch should be enough (note that
> IPA-CP currently does not really need the new has because it creates

What do you mean "it doesn't need the new hash"? My
independent-cloneids-1.c test shows that a function can have more than
one .constprop clone, but I'm probably just not understanding
something.

> clones in only one pass through the call graph, but that is something
> likely to change).  We could then adjust other cloners that we care
> about.
> 
> However, if clones of thunks are one of them, then the optional
> parameter will also proliferate to cgraph_node::create_clone(),
> cgraph_edge::redirect_callee_duplicating_thunks() and
> duplicate_thunk_for_node().  create_version_clone_with_body would almost
> certainly need it because of IPA-SRA too (IPA-SRA can of course always
> pass zero).

I'll try to do this and to convert the other users in the next version
of the patch, but I might need some help with understanding how some
of those passes work!

> 
> Martin
> 
> 
> 
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index 2b00f0165fa..703c3cfea7b 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -964,11 +964,15 @@ public:
>  			     cgraph_node *new_inlined_to,
>  			     bitmap args_to_skip, const char *suffix = NULL);
>  
> -  /* Create callgraph node clone with new declaration.  The actual body will
> -     be copied later at compilation stage.  */
> +  /* Create callgraph node clone with new declaration.  The actual body will be
> +     copied later at compilation stage.  The name of the new clone will be
> +     constructed from the name of the original name, SUFFIX and a number which
> +     can either be NUM_SUFFIX if non-negative or a unique incremented integer
> +     otherwise.  */
>    cgraph_node *create_virtual_clone (vec<cgraph_edge *> redirect_callers,
>  				     vec<ipa_replace_map *, va_gc> *tree_map,
> -				     bitmap args_to_skip, const char * suffix);
> +				     bitmap args_to_skip, const char * suffix,
> +				     int num_suffix = -1);
>  
>    /* cgraph node being removed from symbol table; see if its entry can be
>     replaced by other inline clone.  */
> @@ -2376,8 +2380,9 @@ basic_block init_lowered_empty_function (tree, bool, profile_count);
>  tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree);
>  /* In cgraphclones.c  */
>  
> -tree clone_function_name_1 (const char *, const char *);
> -tree clone_function_name (tree decl, const char *);
> +tree clone_function_name_1 (const char *name, const char *suffix,
> +			    int num_suffix = -1);
> +tree clone_function_name (tree decl, const char *suffix, int num_suffix = -1);

I can see how using the default -1 helped you keep this patch very
small and simple, but for my version of the patch I would actually
prefer to avoid the implicit fallback to the original global counter
and instead make the patch as big as necessary to convert all the
users who need it. The motivation is that implicit behaviour can
sometimes be missed, which I believe is related to why some existing
calls to clone_function_name are unnecessary.

On this note, what do you think about my idea of renaming the function
to numbered_clone_function_name and adding a suffixed_function_name to
make a distinction for users who don't care about numbering?

>  
>  void tree_function_versioning (tree, tree, vec<ipa_replace_map *, va_gc> *,
>  			       bool, bitmap, bool, bitmap, basic_block);
> diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
> index 0c0a94b04a3..865c3fa1ad0 100644
> --- a/gcc/cgraphclones.c
> +++ b/gcc/cgraphclones.c
> @@ -513,11 +513,13 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
>  
>  static GTY(()) unsigned int clone_fn_id_num;
>  
> -/* Return a new assembler name for a clone with SUFFIX of a decl named
> -   NAME.  */
> +/* Return a new assembler name for a clone with SUFFIX of a decl named NAME
> +   Apart from, string SUFFIX, the new name will end with either NUM_SIFFIX, if
> +   non-negative, or a unique unspecified number otherwise.  .  */
>  
>  tree
> -clone_function_name_1 (const char *name, const char *suffix)
> +clone_function_name_1 (const char *name, const char *suffix,
> +		       int num_suffix)
>  {
>    size_t len = strlen (name);
>    char *tmp_name, *prefix;
> @@ -526,22 +528,32 @@ clone_function_name_1 (const char *name, const char *suffix)
>    memcpy (prefix, name, len);
>    strcpy (prefix + len + 1, suffix);
>    prefix[len] = symbol_table::symbol_suffix_separator ();
> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
> +  unsigned int num;
> +  if (num_suffix < 0)
> +    num = clone_fn_id_num++;
> +  else
> +    num = (unsigned) num_suffix;
> +  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, num);
>    return get_identifier (tmp_name);
>  }
>  
> -/* Return a new assembler name for a clone of DECL with SUFFIX.  */
> +/* Return a new assembler name for a clone of DECL with SUFFIX. Apart from,
> +   string SUFFIX, the new name will end with either NUM_SIFFIX, if
> +   non-negative, or a unique unspecified number otherwise.  */
>  
>  tree
> -clone_function_name (tree decl, const char *suffix)
> +clone_function_name (tree decl, const char *suffix, int num_suffix)
>  {
>    tree name = DECL_ASSEMBLER_NAME (decl);
> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
> +  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix, num_suffix);
>  }
>  
>  
> -/* Create callgraph node clone with new declaration.  The actual body will
> -   be copied later at compilation stage.
> +/* Create callgraph node clone with new declaration.  The actual body will be
> +   copied later at compilation stage.  The name of the new clone will be
> +   constructed from the name of the original name, SUFFIX and a number which
> +   can either be NUM_SUFFIX if non-negative or a unique incremented integer
> +   otherwise.
>  
>     TODO: after merging in ipa-sra use function call notes instead of args_to_skip
>     bitmap interface.
> @@ -549,7 +561,8 @@ clone_function_name (tree decl, const char *suffix)
>  cgraph_node *
>  cgraph_node::create_virtual_clone (vec<cgraph_edge *> redirect_callers,
>  				   vec<ipa_replace_map *, va_gc> *tree_map,
> -				   bitmap args_to_skip, const char * suffix)
> +				   bitmap args_to_skip, const char * suffix,
> +				   int num_suffix)
>  {
>    tree old_decl = decl;
>    cgraph_node *new_node = NULL;
> @@ -584,7 +597,8 @@ cgraph_node::create_virtual_clone (vec<cgraph_edge *> redirect_callers,
>    strcpy (name + len + 1, suffix);
>    name[len] = '.';
>    DECL_NAME (new_decl) = get_identifier (name);
> -  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name (old_decl, suffix));
> +  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name (old_decl, suffix,
> +							  num_suffix));
>    SET_DECL_RTL (new_decl, NULL);
>  
>    new_node = create_clone (new_decl, count, false,
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index afc45969558..52690996419 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -376,6 +376,9 @@ static profile_count max_count;
>  
>  static long overall_size, max_new_size;
>  
> +/* Hash to give different UID suffixes */
> +static hash_map<cgraph_node *, int> *clone_num_suffixes;
> +
>  /* Return the param lattices structure corresponding to the Ith formal
>     parameter of the function described by INFO.  */
>  static inline struct ipcp_param_lattices *
> @@ -3850,8 +3853,11 @@ create_specialized_node (struct cgraph_node *node,
>  	}
>      }
>  
> +  int &suffix_counter = clone_num_suffixes->get_or_insert(node);
>    new_node = node->create_virtual_clone (callers, replace_trees,
> -					 args_to_skip, "constprop");
> +					 args_to_skip, "constprop",
> +					 suffix_counter);
> +  suffix_counter++;
>  
>    bool have_self_recursive_calls = !self_recursive_calls.is_empty ();
>    for (unsigned j = 0; j < self_recursive_calls.length (); j++)
> @@ -5065,6 +5071,7 @@ ipcp_driver (void)
>  
>    ipa_check_create_node_params ();
>    ipa_check_create_edge_args ();
> +  clone_num_suffixes = new hash_map<cgraph_node *, int>;
>  
>    if (dump_file)
>      {
> @@ -5086,6 +5093,7 @@ ipcp_driver (void)
>    ipcp_store_vr_results ();
>  
>    /* Free all IPCP structures.  */
> +  delete clone_num_suffixes;

I like how this allows the reduction of the lifetime of the mapping!
Richard: that helps with your concerns about the extra memory usage,
right?


>    free_toporder_info (&topo);
>    delete edge_clone_summaries;
>    edge_clone_summaries = NULL;
> 


Thanks again,
- Michael

Attachment: signature.asc
Description: OpenPGP digital signature


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]