RFA (cgraph): C++ 'structor decloning patch, Mark III

Jan Hubicka hubicka@ucw.cz
Tue Dec 10 09:48:00 GMT 2013


> This all started with Stuart Hastings' original decloning patch way
> back in 2002:
>   http://gcc.gnu.org/ml/gcc-patches/2002-08/msg00354.html
> Bill Maddox tried to revive it in 2007:
>   http://gcc.gnu.org/ml/gcc-patches/2007-11/msg01147.html
> 
> I'm embarrassed that it has taken so long to get this in.  The main
> source of concern in the past was ABI issues.  But now that we have
> COMDAT groups and precedent for putting multiple [cd]tors into the
> same group for aliasing, I see a solution: make the unified 'tor
> internal to the comdat, so the ABI of the comdat is no different
> from the cloned case.
> 
> I had to change various things in cgraph/ipa in order to support the
> notion of a comdat-local symbol which can only be referenced from
> within that comdat, which is what I'm looking for feedback/approval
> for.  The change to can_refer_decl_in_current_unit_p is not actually
> necessary, but seemed worth adding for possible future use of this
> functionality.
> 
> When decloning is turned on it is done regardless of the size of the
> 'tor; we don't need to consider the size because the inliner will
> clean everything up for us.  If the unified function is small enough
> to inline into the thunks, it will do so and then the thunks can
> themselves be inlined into their callers; otherwise, inlining of the
> thunks is prevented.
> 
> The patch enables decloning with -Os, or in the rare case when
> cloning breaks semantics (i.e. in the presence of the GNU label
> address extension, as in the testcase).
> 
> OK for trunk?

> commit 631d568491a3f4581e7067885a07a6096d06bf02
> Author: Jason Merrill <jason@redhat.com>
> Date:   Thu Jan 12 14:04:42 2012 -0500
> 
>     	PR c++/41090
>     	Add -fdeclone-ctor-dtor.
>     gcc/cp/
>     	* optimize.c (can_alias_cdtor, populate_clone_array): Split out
>     	from maybe_clone_body.
>     	(maybe_thunk_body): New function.
>     	(maybe_clone_body): Call it.
>     	* mangle.c (write_mangled_name): Remove code to suppress
>     	writing of mangled name for cloned constructor or destructor.
>     	(write_special_name_constructor): Handle decloned constructor.
>     	(write_special_name_destructor): Handle decloned destructor.
>     	* method.c (trivial_fn_p): Handle decloning.
>     	* semantics.c (expand_or_defer_fn_1): Clone after setting linkage.
>     gcc/c-family/
>     	* c-common.c, c-common.h (flag_declone_ctor_dtor): New variable.
>     	* c.opt: Add -fdeclone-ctor-dtor.
>     	* c-opts.c (c_common_handle_option): Likewise.
>     	(c_common_post_options): Default to on iff -Os.
>     gcc/
>     	* cgraph.h (comdat_local_p): New.
>     	* cgraph.c (verify_cgraph_node): Make sure we don't call a
>     	comdat-local function from outside its comdat.
>     	* gimple-fold.c (can_refer_decl_in_current_unit_p): Check
>     	references to comdat-local symbols.
>     	* ipa-cp.c (decide_about_value): Don't clone comdat-local fns.
>     	* ipa-inline.c (calls_comdat_local_p): New.
>     	(can_inline_edge_p): Check it.
>     	* ipa.c (symtab_remove_unreachable_nodes): Ignore comdat-local
>     	symbols when marking everything in the group as reachable.
>     	(function_and_variable_visibility): Handle comdat-local fns.
>     include/
>     	* demangle.h (enum gnu_v3_ctor_kinds):
>     	Added literal gnu_v3_unified_ctor.
>     	(enum gnu_v3_ctor_kinds):
>     	Added literal gnu_v3_unified_dtor.
>     libiberty/
>     	* cp-demangle.c (cplus_demangle_fill_ctor,cplus_demangle_fill_dtor):
>     	Handle unified ctor/dtor.
>     	(d_ctor_dtor_name): Handle unified ctor/dtor.
> 

> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> index 009a165..0854b95 100644
> --- a/gcc/cgraph.c
> +++ b/gcc/cgraph.c
> @@ -2664,10 +2664,20 @@ verify_cgraph_node (struct cgraph_node *node)
>  	  error_found = true;
>  	}
>      }
> +  tree required_group = NULL_TREE;
> +  if (comdat_local_p (node))
> +    required_group = DECL_COMDAT_GROUP (node->decl);
>    for (e = node->callers; e; e = e->next_caller)
>      {
>        if (verify_edge_count_and_frequency (e))
>  	error_found = true;
> +      if (required_group
> +	  && DECL_COMDAT_GROUP (e->caller->decl) != required_group)
> +	{
> +	  error ("comdat-local function called by %s outside its comdat",
> +		 identifier_to_locale (e->caller->name ()));
> +	  error_found = true;
> +	}

You probably want to add same check into verify_symtab_base for referneces
(i.e. walk all references to function/variable and verify that these are in the same
group).
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index 15719fb..b791811 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -1390,4 +1390,14 @@ symtab_can_be_discarded (symtab_node *node)
>  	      && node->resolution != LDPR_PREVAILING_DEF_IRONLY
>  	      && node->resolution != LDPR_PREVAILING_DEF_IRONLY_EXP));
>  }
> +
> +/* Return true if NODE is local to a particular COMDAT group, and must not
> +   be named from outside the COMDAT.  This is used for C++ decloned
> +   constructors.  */
> +
> +static inline bool
> +comdat_local_p (symtab_node *node)
> +{
> +  return (node->same_comdat_group && !TREE_PUBLIC (node->decl));
> +}

Until we turn it into methods, lets stick with current practice of having
symtab_ prefixes for the API...

The case where I played with local comdats was the following.
I made cgraph_body_availability to get context argument (i.e. instead of saying
if something is available in current unit, it was saying if it is available
in current function/variable).

If two symbols are in the same comdat group and refering each other, they are
available even though they may seem overwritable to others. I then started to
produce local symbols for those local references that are equivalent to your comdat
local.

We probably want to get in this extension too. (I did not get data on how often
it fires, but it does i.e. for recursive calls of C++ inlines).
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index d0fa8db..96c889e 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -3318,6 +3318,10 @@ decide_about_value (struct cgraph_node *node, int index, HOST_WIDE_INT offset,
>  					    &caller_count))
>      return false;
>  
> +  /* Don't clone decls local to a comdat group.  */
> +  if (comdat_local_p (node))
> +    return false;

Why? It seems this should work if the clone becomes another comdat local?

On the other hand, I think you want to prevent ipa-cp propagating addresses of comdat
ocals out of the function. (i.e. make it to check can_refer predicate for its subtitutions)
> +
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
>        fprintf (dump_file, " - considering value ");
> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
> index fbb63da..aa49bfe 100644
> --- a/gcc/ipa-inline.c
> +++ b/gcc/ipa-inline.c
> @@ -230,6 +230,25 @@ report_inline_failed_reason (struct cgraph_edge *e)
>      }
>  }
>  
> +/* True iff NODE calls another function which is local to its comdat
> +   (i.e. C++ decloned constructor); in that case, calls to NODE cannot be
> +   inlined, as that would cause a reference from outside the comdat to the
> +   local symbol.  If we inline the call from NODE to the local function,
> +   then inlining NODE can succeed.  */
> +
> +static bool
> +calls_comdat_local_p (struct cgraph_node *node)
> +{
> +  if (!node->same_comdat_group)
> +    return false;
> +  for (struct cgraph_edge *e = node->callees; e; e = e->next_callee)
> +    {
> +      if (comdat_local_p (e->callee))
> +	return true;
> +    }
> +  return false;
> +}
> +
>  /* Decide if we can inline the edge and possibly update
>     inline_failed reason.  
>     We check whether inlining is possible at all and whether
> @@ -237,7 +256,7 @@ report_inline_failed_reason (struct cgraph_edge *e)
>  
>     if REPORT is true, output reason to the dump file.  
>  
> -   if DISREGARD_LIMITES is true, ignore size limits.*/
> +   if DISREGARD_LIMITS is true, ignore size limits.*/
>  
>  static bool
>  can_inline_edge_p (struct cgraph_edge *e, bool report,
> @@ -267,6 +286,11 @@ can_inline_edge_p (struct cgraph_edge *e, bool report,
>        e->inline_failed = CIF_BODY_NOT_AVAILABLE;
>        inlinable = false;
>      }
> +  else if (calls_comdat_local_p (callee))

So we should check here if both caller and callee are in the same group and allow
inlining then?
> @@ -949,6 +952,8 @@ function_and_variable_visibility (bool whole_program)
>  	  node->forced_by_abi = false;
>  	}
>        if (!node->externally_visible
> +	  /* Don't mess with a function local to a comdat group.  */
> +	  && !comdat_local_p (node)
>  	  && node->definition && !node->weakref
>  	  && !DECL_EXTERNAL (node->decl))
>  	{

Probably you want to get make_decl_local to preserve comdat group; then you won't need
to care about it here and clonning could work.

Probably also when declaring a comdat symbol local, we want to turn all associated
comdats to loca, right? (i.e. with LTO and hidden comdat)

Thanks, this looks like interesting stuff!
Honza



More information about the Gcc-patches mailing list