This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Stream ODR types


On Thu, 11 Sep 2014, Jan Hubicka wrote:

> Hi,
> this patch adds computation and streaming of mangled type names.  As suggested by Jason,
> it simply calls DECL_ASSEMBLER_NAME on all named types and lets C++ supply them.
> This makes it possible to establish precise ODR type equivalency at LTO (until now we could
> do that only for complete class types with virtual methods attached to them).
> LTO type merging is then updated to register all types into the ODR type hash.  This
> causes warnings to be output for ODR violations.  Here are the ones output for Firefox:
> http://kam.mff.cuni.cz/~hubicka/odr-warnings-firefox.txt
> 
> As discussed earlier, in addition to the ODR warnings, which seem useful, I would
> like to use it for TBAA analysis for ODR types that are not structurally
> equivalent to non-ODR types, so C++ programs will get better alias analysis, and
> for other tricks, such as more aggressively merging ODR types.
> 
> I believe this makes sense with (is orthogonal to) early debug info (for warnings, TBAA
> and devirtualization).  It can also be used to merge debug information more aggressively,
> as done by LLVM.
> 
> The change increases LTO object files by about 2% (uncompressed by 6%) and also
> increases WPA memory use and streaming times by about the same percentage.  This is
> not small and thus I made it optional (enabled by default for now).  We can see
> how the benefits relate to this cost once the other three parts are implemented.
> 
> Bootstrapped/regtested x86_64-linux, seems sane?

It looks sane, but when early debug is completed we likely will drop
all the elaborated types from decls.  Thus to keep the ODR type you'd
have to keep (and compute early as well) their DECL_ASSEMBLER_NAME?

Can't we just store a hash of the assembler name?  From alias analysis
perspective false aliasing due to a hash collision is harmless, no?
Maybe not for ODR warnings though.  At least a hash would be way
cheaper than those usually very large strings....
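
To make the trade-off concrete, here is a minimal standalone sketch of the
hash-only idea, written outside GCC.  Everything in it is an assumption made
for illustration: OdrKey, make_odr_key and same_odr_key are hypothetical
names, and FNV-1a is used only because it is short, not because GCC hashes
identifiers that way.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a streamed ODR key: only a 64-bit hash of the
// mangled name is kept, not the string itself.
struct OdrKey {
  std::uint64_t name_hash;
};

// FNV-1a, chosen here only for illustration; GCC's identifier hashing differs.
inline std::uint64_t fnv1a64(const std::string &s) {
  std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ull;                    // FNV prime
  }
  return h;
}

inline OdrKey make_odr_key(const std::string &mangled_name) {
  return OdrKey{fnv1a64(mangled_name)};
}

// Equal names always give equal keys, so two types that are the same by ODR
// are never reported as different.  Different names may collide, which can
// only make the answer "same" too often.
inline bool same_odr_key(const OdrKey &a, const OdrKey &b) {
  return a.name_hash == b.name_hash;
}
```

With keys like these a collision makes two unrelated types look identical.
For alias analysis that only costs precision; for ODR warnings it could
compare two types that were never meant to match, which is the concern
raised above.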

You probably want to restrict ODR types to aggregates?

Richard.

> Honza
> 
> 	* common.opt (flto-odr-type-merging): New flag.
> 	* ipa-devirt.c (hash_type_name): Use ODR names for hashing if available.
> 	(types_same_for_odr): Likewise.
> 	(odr_subtypes_equivalent_p): Likewise.
> 	(add_type_duplicate): Do not walk type variants.
> 	(register_odr_type): New function.
> 	* ipa-utils.h (register_odr_type): Declare.
> 	(odr_type_p): New function.
> 	* langhooks.c (lhd_set_decl_assembler_name): Do not compute
> 	TYPE_DECLs
> 	* doc/invoke.texi (-flto-odr-type-merging): Document.
> 	* tree.c (need_assembler_name_p): Compute ODR names when asked
> 	for it.
> 	* tree.h (DECL_ASSEMBLER_NAME): Update comment.
> 
> 	* lto.c (lto_read_decls): Register ODR types.
> 
> Index: common.opt
> ===================================================================
> --- common.opt	(revision 215103)
> +++ common.opt	(working copy)
> @@ -1560,6 +1560,10 @@ flto-compression-level=
>  Common Joined RejectNegative UInteger Var(flag_lto_compression_level) Init(-1)
>  -flto-compression-level=<number>	Use zlib compression level <number> for IL
>  
> +flto-odr-type-merging
> +Common Report Var(flag_lto_odr_type_mering) Init(1)
> +Merge C++ types using One Definition Rule
> +
>  flto-report
>  Common Report Var(flag_lto_report) Init(0)
>  Report various link-time optimization statistics
> Index: ipa-devirt.c
> ===================================================================
> --- ipa-devirt.c	(revision 215103)
> +++ ipa-devirt.c	(working copy)
> @@ -287,7 +287,13 @@ hash_type_name (tree t)
>    if (type_in_anonymous_namespace_p (t))
>      return htab_hash_pointer (t);
>  
> -  /* For polymorphic types, we can simply hash the virtual table.  */
> +  /* ODR types have name specified.  */
> +  if (TYPE_NAME (t)
> +      && DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (t)))
> +    return IDENTIFIER_HASH_VALUE (DECL_ASSEMBLER_NAME (TYPE_NAME (t)));
> +
> +  /* For polymorphic types that were compiled with -fno-lto-odr-type-merging
> +     we can simply hash the virtual table.  */
>    if (TREE_CODE (t) == RECORD_TYPE
>        && TYPE_BINFO (t) && BINFO_VTABLE (TYPE_BINFO (t)))
>      {
> @@ -305,8 +311,14 @@ hash_type_name (tree t)
>        return hash;
>      }
>  
> -  /* Rest is not implemented yet.  */
> -  gcc_unreachable ();
> +  /* Builtin types may appear as main variants of ODR types and are unique.
> +     Sanity check we do not get anything that looks non-builtin.  */
> +  gcc_checking_assert (TREE_CODE (t) == INTEGER_TYPE
> +		       || TREE_CODE (t) == VOID_TYPE
> +		       || TREE_CODE (t) == COMPLEX_TYPE
> +		       || TREE_CODE (t) == REAL_TYPE
> +		       || TREE_CODE (t) == POINTER_TYPE);
> +  return htab_hash_pointer (t);
>  }
>  
>  /* Return the computed hashcode for ODR_TYPE.  */
> @@ -347,42 +359,61 @@ types_same_for_odr (const_tree type1, co
>        || type_in_anonymous_namespace_p (type2))
>      return false;
>  
> -  /* See if types are obvoiusly different (i.e. different codes
> -     or polymorphis wrt non-polymorphic).  This is not strictly correct
> -     for ODR violating programs, but we can't do better without streaming
> -     ODR names.  */
> -  if (TREE_CODE (type1) != TREE_CODE (type2))
> -    return false;
> -  if (TREE_CODE (type1) == RECORD_TYPE
> -      && (TYPE_BINFO (type1) == NULL_TREE) != (TYPE_BINFO (type1) == NULL_TREE))
> -    return false;
> -  if (TREE_CODE (type1) == RECORD_TYPE && TYPE_BINFO (type1)
> -      && (BINFO_VTABLE (TYPE_BINFO (type1)) == NULL_TREE)
> -	 != (BINFO_VTABLE (TYPE_BINFO (type2)) == NULL_TREE))
> -    return false;
>  
> -  /* At the moment we have no way to establish ODR equivlaence at LTO
> -     other than comparing virtual table pointrs of polymorphic types.
> -     Eventually we should start saving mangled names in TYPE_NAME.
> -     Then this condition will become non-trivial.  */
> -
> -  if (TREE_CODE (type1) == RECORD_TYPE
> -      && TYPE_BINFO (type1) && TYPE_BINFO (type2)
> -      && BINFO_VTABLE (TYPE_BINFO (type1))
> -      && BINFO_VTABLE (TYPE_BINFO (type2)))
> -    {
> -      tree v1 = BINFO_VTABLE (TYPE_BINFO (type1));
> -      tree v2 = BINFO_VTABLE (TYPE_BINFO (type2));
> -      gcc_assert (TREE_CODE (v1) == POINTER_PLUS_EXPR
> -		  && TREE_CODE (v2) == POINTER_PLUS_EXPR);
> -      return (operand_equal_p (TREE_OPERAND (v1, 1),
> -			       TREE_OPERAND (v2, 1), 0)
> -	      && DECL_ASSEMBLER_NAME
> -		     (TREE_OPERAND (TREE_OPERAND (v1, 0), 0))
> -		 == DECL_ASSEMBLER_NAME
> -		     (TREE_OPERAND (TREE_OPERAND (v2, 0), 0)));
> +  /* ODR name of the type is set in DECL_ASSEMBLER_NAME of its TYPE_NAME.
> +
> +     Ideally we should never meet types without ODR names here.  It can however
> +     happen in two cases:
> +
> +       1) for builtin types that are not streamed but rebuilt in lto/lto-lang.c
> +          Here testing for equivalence is safe, since their MAIN_VARIANTs are
> +          unique.
> +       2) for units streamed with -fno-lto-odr-type-merging.  Here we can't
> +	  establish precise ODR equivalency, but for correctness we care only
> +	  about equivalency on complete polymorphic types.  For these we can
> +	  compare assembler names of their virtual tables.  */
> +  if ((!TYPE_NAME (type1) || !DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (type1)))
> +      || (!TYPE_NAME (type2) || !DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (type2))))
> +    {
> +      /* See if types are obviously different (i.e. different codes
> +	 or polymorphic wrt non-polymorphic).  This is not strictly correct
> +	 for ODR violating programs, but we can't do better without streaming
> +	 ODR names.  */
> +      if (TREE_CODE (type1) != TREE_CODE (type2))
> +	return false;
> +      if (TREE_CODE (type1) == RECORD_TYPE
> +	  && (TYPE_BINFO (type1) == NULL_TREE) != (TYPE_BINFO (type2) == NULL_TREE))
> +	return false;
> +      if (TREE_CODE (type1) == RECORD_TYPE && TYPE_BINFO (type1)
> +	  && (BINFO_VTABLE (TYPE_BINFO (type1)) == NULL_TREE)
> +	     != (BINFO_VTABLE (TYPE_BINFO (type2)) == NULL_TREE))
> +	return false;
> +
> +      /* At the moment we have no way to establish ODR equivalence at LTO
> +	 other than comparing virtual table pointers of polymorphic types.
> +	 Eventually we should start saving mangled names in TYPE_NAME.
> +	 Then this condition will become non-trivial.  */
> +
> +      if (TREE_CODE (type1) == RECORD_TYPE
> +	  && TYPE_BINFO (type1) && TYPE_BINFO (type2)
> +	  && BINFO_VTABLE (TYPE_BINFO (type1))
> +	  && BINFO_VTABLE (TYPE_BINFO (type2)))
> +	{
> +	  tree v1 = BINFO_VTABLE (TYPE_BINFO (type1));
> +	  tree v2 = BINFO_VTABLE (TYPE_BINFO (type2));
> +	  gcc_assert (TREE_CODE (v1) == POINTER_PLUS_EXPR
> +		      && TREE_CODE (v2) == POINTER_PLUS_EXPR);
> +	  return (operand_equal_p (TREE_OPERAND (v1, 1),
> +				   TREE_OPERAND (v2, 1), 0)
> +		  && DECL_ASSEMBLER_NAME
> +			 (TREE_OPERAND (TREE_OPERAND (v1, 0), 0))
> +		     == DECL_ASSEMBLER_NAME
> +			 (TREE_OPERAND (TREE_OPERAND (v2, 0), 0)));
> +	}
> +      gcc_unreachable ();
>      }
> -  gcc_unreachable ();
> +  return (DECL_ASSEMBLER_NAME (TYPE_NAME (type1))
> +	  == DECL_ASSEMBLER_NAME (TYPE_NAME (type2)));
>  }
>  
>  
> @@ -451,12 +482,6 @@ odr_subtypes_equivalent_p (tree t1, tree
>    t2 = main_odr_variant (t2);
>    if (t1 == t2)
>      return true;
> -  if (TREE_CODE (t1) != TREE_CODE (t2))
> -    return false;
> -  if ((TYPE_NAME (t1) == NULL_TREE) != (TYPE_NAME (t2) == NULL_TREE))
> -    return false;
> -  if (TYPE_NAME (t1) && DECL_NAME (TYPE_NAME (t1)) != DECL_NAME (TYPE_NAME (t2)))
> -    return false;
>  
>    /* Anonymous namespace types must match exactly.  */
>    an1 = type_in_anonymous_namespace_p (t1);
> @@ -464,13 +489,20 @@ odr_subtypes_equivalent_p (tree t1, tree
>    if (an1 != an2 || an1)
>      return false;
>  
> -  /* For types where we can not establish ODR equivalency, recurse and deeply
> -     compare.  */
> -  if (TREE_CODE (t1) != RECORD_TYPE
> -      || !TYPE_BINFO (t1) || !TYPE_BINFO (t2)
> -      || !polymorphic_type_binfo_p (TYPE_BINFO (t1))
> -      || !polymorphic_type_binfo_p (TYPE_BINFO (t2)))
> +  /* For types where we can not establish ODR equivalency (either by ODR names
> +     or by virtual tables), recurse and deeply compare.  */
> +  if ((!odr_type_p (t1) || !odr_type_p (t2))
> +      && (TREE_CODE (t1) != RECORD_TYPE || TREE_CODE (t2) != RECORD_TYPE
> +          || !TYPE_BINFO (t1) || !TYPE_BINFO (t2)
> +          || !polymorphic_type_binfo_p (TYPE_BINFO (t1))
> +          || !polymorphic_type_binfo_p (TYPE_BINFO (t2))))
>      {
> +      if (TREE_CODE (t1) != TREE_CODE (t2))
> +	return false;
> +      if ((TYPE_NAME (t1) == NULL_TREE) != (TYPE_NAME (t2) == NULL_TREE))
> +	return false;
> +      if (TYPE_NAME (t1) && DECL_NAME (TYPE_NAME (t1)) != DECL_NAME (TYPE_NAME (t2)))
> +	return false;
>        /* This should really be a pair hash, but for the moment we do not need
>  	 100% reliability and it would be better to compare all ODR types so
>  	 recursion here is needed only for component types.  */
> @@ -478,6 +510,7 @@ odr_subtypes_equivalent_p (tree t1, tree
>  	return true;
>        return odr_types_equivalent_p (t1, t2, false, NULL, visited);
>      }
> +
>    return types_same_for_odr (t1, t2);
>  }
>  
> @@ -1148,8 +1218,14 @@ add_type_duplicate (odr_type val, tree t
>  	 to external declarations of methods that may be defined in the
>  	 merged LTO unit.  For this reason we absolutely need to remove
>  	 them and replace by internal variants. Not doing so will lead
> -         to incomplete answers from possible_polymorphic_call_targets.  */
> +         to incomplete answers from possible_polymorphic_call_targets.
> +
> +	 FIXME: disabled for now; because ODR types are now built during
> +	 streaming in, the variants do not need to be linked to the type,
> +	 yet.  We need to do the merging in cleanup pass to be implemented
> +	 soon.  */
>        if (!flag_ltrans && merge
> +	  && 0
>  	  && TREE_CODE (val->type) == RECORD_TYPE
>  	  && TREE_CODE (type) == RECORD_TYPE
>  	  && TYPE_BINFO (val->type) && TYPE_BINFO (type)
> @@ -1281,6 +1356,20 @@ get_odr_type (tree type, bool insert)
>    return val;
>  }
>  
> +/* Add TYPE to ODR type hash.  */
> +
> +void
> +register_odr_type (tree type)
> +{
> +  if (!odr_hash)
> +    odr_hash = new odr_hash_type (23);
> +  /* Arrange things to be nicer and insert main variants first.  */
> +  if (odr_type_p (TYPE_MAIN_VARIANT (type)))
> +    get_odr_type (TYPE_MAIN_VARIANT (type), true);
> +  if (TYPE_MAIN_VARIANT (type) != type)
> +    get_odr_type (type, true);
> +}
> +
>  /* Dump ODR type T and all its derrived type.  INDENT specify indentation for
>     recusive printing.  */
>  
> Index: ipa-utils.h
> ===================================================================
> --- ipa-utils.h	(revision 215103)
> +++ ipa-utils.h	(working copy)
> @@ -152,6 +152,7 @@ tree vtable_pointer_value_to_binfo (cons
>  bool vtable_pointer_value_to_vtable (const_tree, tree *, unsigned HOST_WIDE_INT *);
>  void compare_virtual_tables (varpool_node *, varpool_node *);
>  bool contains_polymorphic_type_p (const_tree);
> +void register_odr_type (tree);
>  
>  /* Return vector containing possible targets of polymorphic call E.
>     If FINALP is non-NULL, store true if the list is complette. 
> @@ -239,6 +240,23 @@ possible_polymorphic_call_target_p (tree
>  					     context,
>  					     n);
>  }
> +
> +/* Return true if T is a type with One Definition Rule info attached.
> +   It means that either it is anonymous type or it has assembler name
> +   set.  */
> +
> +static inline bool
> +odr_type_p (const_tree t)
> +{
> +  if (type_in_anonymous_namespace_p (t))
> +    return true;
> +  /* We do not have this information when not in LTO, but we do not need
> +     to care, since it is used only for type merging.  */
> +  gcc_assert (in_lto_p || flag_lto);
> +
> +  return (TYPE_NAME (t)
> +          && (DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (t))));
> +}
>  #endif  /* GCC_IPA_UTILS_H  */
>  
>  
> Index: lto/lto.c
> ===================================================================
> --- lto/lto.c	(revision 215103)
> +++ lto/lto.c	(working copy)
> @@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.
>  #include "pass_manager.h"
>  #include "ipa-inline.h"
>  #include "params.h"
> +#include "ipa-utils.h"
>  
>  
>  /* Number of parallel tasks to run, -1 if we want to use GNU Make jobserver.  */
> @@ -1911,7 +1911,11 @@ lto_read_decls (struct lto_file_decl_dat
>  	      /* Compute the canonical type of all types.
>  		 ???  Should be able to assert that !TYPE_CANONICAL.  */
>  	      if (TYPE_P (t) && !TYPE_CANONICAL (t))
> -		gimple_register_canonical_type (t);
> +		{
> +		  gimple_register_canonical_type (t);
> +		  if (odr_type_p (t))
> +		    register_odr_type (t);
> +		}
>  	      /* Link shared INTEGER_CSTs into TYPE_CACHED_VALUEs of its
>  		 type which is also member of this SCC.  */
>  	      if (TREE_CODE (t) == INTEGER_CST
> Index: langhooks.c
> ===================================================================
> --- langhooks.c	(revision 215103)
> +++ langhooks.c	(working copy)
> @@ -147,6 +147,11 @@ lhd_set_decl_assembler_name (tree decl)
>  {
>    tree id;
>  
> +  /* set_decl_assembler_name may be called on TYPE_DECL to record ODR
> +     name for C++ types.  By default types have no ODR names.  */
> +  if (TREE_CODE (decl) == TYPE_DECL)
> +    return;
> +
>    /* The language-independent code should never use the
>       DECL_ASSEMBLER_NAME for lots of DECLs.  Only FUNCTION_DECLs and
>       VAR_DECLs for variables with static storage duration need a real
> Index: doc/invoke.texi
> ===================================================================
> --- doc/invoke.texi	(revision 215103)
> +++ doc/invoke.texi	(working copy)
> @@ -8997,6 +8997,12 @@ The value @code{one} specifies that exac
>  used while the value @code{none} bypasses partitioning and executes
>  the link-time optimization step directly from the WPA phase.
>  
> +@item -flto-odr-type-merging
> +@opindex flto-odr-type-merging
> +Enable streaming of mangled type names of C++ types and their unification
> +at link time.  This increases the size of LTO object files, but enables
> +diagnostics about One Definition Rule violations.
> +
>  @item -flto-compression-level=@var{n}
>  This option specifies the level of compression used for intermediate
>  language written to LTO object files, and is only meaningful in
> Index: tree.c
> ===================================================================
> --- tree.c	(revision 215103)
> +++ tree.c	(working copy)
> @@ -4980,6 +4981,15 @@ free_lang_data_in_type (tree type)
>  static inline bool
>  need_assembler_name_p (tree decl)
>  {
> +  /* We use DECL_ASSEMBLER_NAME to hold mangled type names for One Definition Rule
> +     merging.  */
> +  if (flag_lto_odr_type_mering
> +      && TREE_CODE (decl) == TYPE_DECL
> +      && DECL_NAME (decl)
> +      && decl == TYPE_NAME (TREE_TYPE (decl))
> +      && !is_lang_specific (TREE_TYPE (decl))
> +      && !type_in_anonymous_namespace_p (TREE_TYPE (decl)))
> +    return !DECL_ASSEMBLER_NAME_SET_P (decl);
>    /* Only FUNCTION_DECLs and VAR_DECLs are considered.  */
>    if (TREE_CODE (decl) != FUNCTION_DECL
>        && TREE_CODE (decl) != VAR_DECL)
> Index: tree.h
> ===================================================================
> --- tree.h	(revision 215103)
> +++ tree.h	(working copy)
> @@ -2344,7 +2344,11 @@ extern void decl_value_expr_insert (tree
>  
>  /* The name of the object as the assembler will see it (but before any
>     translations made by ASM_OUTPUT_LABELREF).  Often this is the same
> -   as DECL_NAME.  It is an IDENTIFIER_NODE.  */
> +   as DECL_NAME.  It is an IDENTIFIER_NODE.
> +
> +   ASSEMBLER_NAME of TYPE_DECLS may store global name of type used for
> +   One Definition Rule based type merging at LTO.  It is computed only for
> +   LTO compilation and C++.  */
>  #define DECL_ASSEMBLER_NAME(NODE) decl_assembler_name (NODE)
>  
>  /* Return true if NODE is a NODE that can contain a DECL_ASSEMBLER_NAME.
> 
> 
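
For readers following the ipa-devirt.c hunks, the order of checks that the
patch gives types_same_for_odr can be modelled in a few lines outside GCC.
This is only a sketch under stated assumptions: TypeModel, OdrAnswer and
model_types_same_for_odr are hypothetical names, an empty string stands for
"not set", names are compared as strings rather than as IDENTIFIER_NODE
pointers, and the tree-code and vtable-offset checks are left out.

```cpp
#include <string>

// Hypothetical, simplified stand-in for a GCC type node; these fields do not
// exist under these names in GCC.  An empty string means "not set".
struct TypeModel {
  bool in_anonymous_namespace = false;
  std::string odr_name;     // models DECL_ASSEMBLER_NAME of the TYPE_NAME
  std::string vtable_name;  // models the assembler name of the vtable
};

enum class OdrAnswer { Same, Different, Unknown };

inline OdrAnswer model_types_same_for_odr(const TypeModel &a,
                                          const TypeModel &b) {
  // Anonymous-namespace types are local to their unit and never merge.
  if (a.in_anonymous_namespace || b.in_anonymous_namespace)
    return OdrAnswer::Different;
  // Preferred path: both units streamed mangled names.
  if (!a.odr_name.empty() && !b.odr_name.empty())
    return a.odr_name == b.odr_name ? OdrAnswer::Same : OdrAnswer::Different;
  // Fallback: polymorphic versus non-polymorphic is an obvious mismatch.
  if (a.vtable_name.empty() != b.vtable_name.empty())
    return OdrAnswer::Different;
  // Fallback: two polymorphic types are compared by their vtable names.
  if (!a.vtable_name.empty())
    return a.vtable_name == b.vtable_name ? OdrAnswer::Same
                                          : OdrAnswer::Different;
  // The patch reaches gcc_unreachable () here; the model reports "unknown".
  return OdrAnswer::Unknown;
}
```

The model makes one property of the patch easy to see: the vtable
comparison is used only when at least one side lacks an ODR name, which is
the -fno-lto-odr-type-merging case described in the comment.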

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

