This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Stream ODR types


Hi,
this patch adds computation and streaming of mangled type names.  As suggested by Jason,
it simply calls DECL_ASSEMBLER_NAME on all named types and lets C++ supply them.
This makes it possible to establish precise ODR type equivalency at LTO (until now we could
do that only for complete class types with virtual methods attached to them).
LTO type merging is then updated to register all types in the ODR type hash.  This
causes warnings to be output for ODR violations.  Here are the ones output for Firefox:
http://kam.mff.cuni.cz/~hubicka/odr-warnings-firefox.txt
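
To illustrate the idea of an ODR type hash keyed by mangled names, here is a
minimal standalone sketch.  It is not the ipa-devirt.c implementation:
StreamedType, OdrTable and demo_violation_count are hypothetical names, the
layout summary (size plus field type list) is a simplification, and "1S" is
used only as an example of how a global `struct S` would be mangled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a type read from one LTO unit: its mangled
// (ODR) name plus a crude layout summary.  Not GCC's tree representation.
struct StreamedType {
  std::string odr_name;             // empty when the type has no ODR name
  std::size_t size;                 // size in bytes
  std::vector<std::string> fields;  // field types, in declaration order
};

// Toy ODR table: the first type seen for a given name becomes the reference;
// later types with the same name must have the same layout summary.
class OdrTable {
 public:
  // Returns false and records the name if T conflicts with an earlier type.
  bool register_type(const StreamedType &t) {
    if (t.odr_name.empty())
      return true;  // no ODR name, nothing to compare by name
    auto res = table_.emplace(t.odr_name, t);
    if (res.second)
      return true;  // first occurrence of this name
    const StreamedType &prev = res.first->second;
    if (prev.size == t.size && prev.fields == t.fields)
      return true;  // same name, same layout
    violations_.push_back(t.odr_name);
    return false;
  }

  const std::vector<std::string> &violations() const { return violations_; }

 private:
  std::unordered_map<std::string, StreamedType> table_;
  std::vector<std::string> violations_;
};

// Registers the "same" type from three imaginary units; the third disagrees.
inline std::size_t demo_violation_count() {
  OdrTable table;
  table.register_type({"1S", 8, {"int", "int"}});      // unit A
  table.register_type({"1S", 8, {"int", "int"}});      // unit B, consistent
  table.register_type({"1S", 16, {"int", "double"}});  // unit C, conflicting
  table.register_type({"", 16, {"int", "double"}});    // unnamed, ignored
  return table.violations().size();
}
```

In this sketch the name alone decides which types are compared, which is the
property the streamed mangled names are meant to provide; the real comparison
in the patch is done on trees and is considerably more involved.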

As discussed earlier, in addition to the ODR warnings, which seem useful, I would
like to use it for TBAA analysis for ODR types that are not structurally
equivalent to non-ODR types, so C++ programs will get better alias analysis, and
for other tricks, such as more aggressively merging ODR types.
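
A small sketch of the kind of code that name-based TBAA could help, under my
reading of the paragraph above: Meters and Seconds (hypothetical names) are
structurally identical, so a purely structural type merge would treat them as
one type, while their ODR names differ.  Whether a given GCC version actually
performs this optimization at LTO is not something this example shows.

```cpp
#include <cassert>

// Two distinct C++ types with identical layout.
struct Meters { double value; };
struct Seconds { double value; };

// If M and S are known to point to different types by name, the store
// through S cannot change m->value, so the final load could be replaced
// by the constant 1.0.  With structural merging only, the compiler has to
// assume the two pointers may refer to the same object.
double store_both(Meters *m, Seconds *s) {
  m->value = 1.0;
  s->value = 2.0;
  return m->value;
}
```

Called with two distinct objects, the function returns 1.0 either way; the
difference is only in whether the compiler is allowed to prove that without
reloading.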

I believe this makes sense alongside (is orthogonal to) early debug info (for warnings, TBAA
and devirtualization).  It can also be used to merge debug information more aggressively,
as done by LLVM.

The change increases LTO object file size by about 2% (uncompressed by 6%) and also
increases WPA memory use and streaming times by about the same percentage.  This is
not small, and thus I made it optional (enabled by default for now).  We can see
how the benefits relate to this cost once the other three parts are implemented.

Bootstrapped/regtested x86_64-linux, seems sane?

Honza

	* common.opt (flto-odr-type-merging): New flag.
	* ipa-devirt.c (hash_type_name): Use ODR names for hashing if available.
	(types_same_for_odr): Likewise.
	(odr_subtypes_equivalent_p): Likewise.
	(add_type_duplicate): Do not walk type variants.
	(register_odr_type): New function.
	* ipa-utils.h (register_odr_type): Declare.
	(odr_type_p): New function.
	* langhooks.c (lhd_set_decl_assembler_name): Do not compute
	assembler names for TYPE_DECLs.
	* doc/invoke.texi (-flto-odr-type-merging): Document.
	* tree.c (need_assembler_name_p): Compute ODR names when asked
	for it.
	* tree.h (DECL_ASSEMBLER_NAME): Update comment.

	* lto.c (lto_read_decls): Register ODR types.

Index: common.opt
===================================================================
--- common.opt	(revision 215103)
+++ common.opt	(working copy)
@@ -1560,6 +1560,10 @@ flto-compression-level=
 Common Joined RejectNegative UInteger Var(flag_lto_compression_level) Init(-1)
 -flto-compression-level=<number>	Use zlib compression level <number> for IL
 
+flto-odr-type-merging
+Common Report Var(flag_lto_odr_type_mering) Init(1)
+Merge C++ types using One Definition Rule
+
 flto-report
 Common Report Var(flag_lto_report) Init(0)
 Report various link-time optimization statistics
Index: ipa-devirt.c
===================================================================
--- ipa-devirt.c	(revision 215103)
+++ ipa-devirt.c	(working copy)
@@ -287,7 +287,13 @@ hash_type_name (tree t)
   if (type_in_anonymous_namespace_p (t))
     return htab_hash_pointer (t);
 
-  /* For polymorphic types, we can simply hash the virtual table.  */
+  /* ODR types have name specified.  */
+  if (TYPE_NAME (t)
+      && DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (t)))
+    return IDENTIFIER_HASH_VALUE (DECL_ASSEMBLER_NAME (TYPE_NAME (t)));
+
+  /* For polymorphic types that were compiled with -fno-lto-odr-type-merging
+     we can simply hash the virtual table.  */
   if (TREE_CODE (t) == RECORD_TYPE
       && TYPE_BINFO (t) && BINFO_VTABLE (TYPE_BINFO (t)))
     {
@@ -305,8 +311,14 @@ hash_type_name (tree t)
       return hash;
     }
 
-  /* Rest is not implemented yet.  */
-  gcc_unreachable ();
+  /* Builtin types may appear as main variants of ODR types and are unique.
+     Sanity check we do not get anything that looks non-builtin.  */
+  gcc_checking_assert (TREE_CODE (t) == INTEGER_TYPE
+		       || TREE_CODE (t) == VOID_TYPE
+		       || TREE_CODE (t) == COMPLEX_TYPE
+		       || TREE_CODE (t) == REAL_TYPE
+		       || TREE_CODE (t) == POINTER_TYPE);
+  return htab_hash_pointer (t);
 }
 
 /* Return the computed hashcode for ODR_TYPE.  */
@@ -347,42 +359,61 @@ types_same_for_odr (const_tree type1, co
       || type_in_anonymous_namespace_p (type2))
     return false;
 
-  /* See if types are obvoiusly different (i.e. different codes
-     or polymorphis wrt non-polymorphic).  This is not strictly correct
-     for ODR violating programs, but we can't do better without streaming
-     ODR names.  */
-  if (TREE_CODE (type1) != TREE_CODE (type2))
-    return false;
-  if (TREE_CODE (type1) == RECORD_TYPE
-      && (TYPE_BINFO (type1) == NULL_TREE) != (TYPE_BINFO (type1) == NULL_TREE))
-    return false;
-  if (TREE_CODE (type1) == RECORD_TYPE && TYPE_BINFO (type1)
-      && (BINFO_VTABLE (TYPE_BINFO (type1)) == NULL_TREE)
-	 != (BINFO_VTABLE (TYPE_BINFO (type2)) == NULL_TREE))
-    return false;
 
-  /* At the moment we have no way to establish ODR equivlaence at LTO
-     other than comparing virtual table pointrs of polymorphic types.
-     Eventually we should start saving mangled names in TYPE_NAME.
-     Then this condition will become non-trivial.  */
-
-  if (TREE_CODE (type1) == RECORD_TYPE
-      && TYPE_BINFO (type1) && TYPE_BINFO (type2)
-      && BINFO_VTABLE (TYPE_BINFO (type1))
-      && BINFO_VTABLE (TYPE_BINFO (type2)))
-    {
-      tree v1 = BINFO_VTABLE (TYPE_BINFO (type1));
-      tree v2 = BINFO_VTABLE (TYPE_BINFO (type2));
-      gcc_assert (TREE_CODE (v1) == POINTER_PLUS_EXPR
-		  && TREE_CODE (v2) == POINTER_PLUS_EXPR);
-      return (operand_equal_p (TREE_OPERAND (v1, 1),
-			       TREE_OPERAND (v2, 1), 0)
-	      && DECL_ASSEMBLER_NAME
-		     (TREE_OPERAND (TREE_OPERAND (v1, 0), 0))
-		 == DECL_ASSEMBLER_NAME
-		     (TREE_OPERAND (TREE_OPERAND (v2, 0), 0)));
+  /* ODR name of the type is set in DECL_ASSEMBLER_NAME of its TYPE_NAME.
+
+     Ideally we should never meet types without ODR names here.  It can however
+     happen in two cases:
+
+       1) for builtin types that are not streamed but rebuilt in lto/lto-lang.c
+          Here testing for equivalence is safe, since their MAIN_VARIANTs are
+          unique.
+       2) for units streamed with -fno-lto-odr-type-merging.  Here we can't
+	  establish precise ODR equivalency, but for correctness we care only
+	  about equivalency on complete polymorphic types.  For these we can
+	  compare assembler names of their virtual tables.  */
+  if ((!TYPE_NAME (type1) || !DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (type1)))
+      || (!TYPE_NAME (type2) || !DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (type2))))
+    {
+      /* See if types are obviously different (i.e. different codes
+	 or polymorphic wrt non-polymorphic).  This is not strictly correct
+	 for ODR violating programs, but we can't do better without streaming
+	 ODR names.  */
+      if (TREE_CODE (type1) != TREE_CODE (type2))
+	return false;
+      if (TREE_CODE (type1) == RECORD_TYPE
+	  && (TYPE_BINFO (type1) == NULL_TREE) != (TYPE_BINFO (type2) == NULL_TREE))
+	return false;
+      if (TREE_CODE (type1) == RECORD_TYPE && TYPE_BINFO (type1)
+	  && (BINFO_VTABLE (TYPE_BINFO (type1)) == NULL_TREE)
+	     != (BINFO_VTABLE (TYPE_BINFO (type2)) == NULL_TREE))
+	return false;
+
+      /* At the moment we have no way to establish ODR equivalence at LTO
+	 other than comparing virtual table pointers of polymorphic types.
+	 Eventually we should start saving mangled names in TYPE_NAME.
+	 Then this condition will become non-trivial.  */
+
+      if (TREE_CODE (type1) == RECORD_TYPE
+	  && TYPE_BINFO (type1) && TYPE_BINFO (type2)
+	  && BINFO_VTABLE (TYPE_BINFO (type1))
+	  && BINFO_VTABLE (TYPE_BINFO (type2)))
+	{
+	  tree v1 = BINFO_VTABLE (TYPE_BINFO (type1));
+	  tree v2 = BINFO_VTABLE (TYPE_BINFO (type2));
+	  gcc_assert (TREE_CODE (v1) == POINTER_PLUS_EXPR
+		      && TREE_CODE (v2) == POINTER_PLUS_EXPR);
+	  return (operand_equal_p (TREE_OPERAND (v1, 1),
+				   TREE_OPERAND (v2, 1), 0)
+		  && DECL_ASSEMBLER_NAME
+			 (TREE_OPERAND (TREE_OPERAND (v1, 0), 0))
+		     == DECL_ASSEMBLER_NAME
+			 (TREE_OPERAND (TREE_OPERAND (v2, 0), 0)));
+	}
+      gcc_unreachable ();
     }
-  gcc_unreachable ();
+  return (DECL_ASSEMBLER_NAME (TYPE_NAME (type1))
+	  == DECL_ASSEMBLER_NAME (TYPE_NAME (type2)));
 }
 
 
@@ -451,12 +482,6 @@ odr_subtypes_equivalent_p (tree t1, tree
   t2 = main_odr_variant (t2);
   if (t1 == t2)
     return true;
-  if (TREE_CODE (t1) != TREE_CODE (t2))
-    return false;
-  if ((TYPE_NAME (t1) == NULL_TREE) != (TYPE_NAME (t2) == NULL_TREE))
-    return false;
-  if (TYPE_NAME (t1) && DECL_NAME (TYPE_NAME (t1)) != DECL_NAME (TYPE_NAME (t2)))
-    return false;
 
   /* Anonymous namespace types must match exactly.  */
   an1 = type_in_anonymous_namespace_p (t1);
@@ -464,13 +489,20 @@ odr_subtypes_equivalent_p (tree t1, tree
   if (an1 != an2 || an1)
     return false;
 
-  /* For types where we can not establish ODR equivalency, recurse and deeply
-     compare.  */
-  if (TREE_CODE (t1) != RECORD_TYPE
-      || !TYPE_BINFO (t1) || !TYPE_BINFO (t2)
-      || !polymorphic_type_binfo_p (TYPE_BINFO (t1))
-      || !polymorphic_type_binfo_p (TYPE_BINFO (t2)))
+  /* For types where we can not establish ODR equivalency (either by ODR names
+     or by virtual tables), recurse and deeply compare.  */
+  if ((!odr_type_p (t1) || !odr_type_p (t2))
+      && (TREE_CODE (t1) != RECORD_TYPE || TREE_CODE (t2) != RECORD_TYPE
+          || !TYPE_BINFO (t1) || !TYPE_BINFO (t2)
+          || !polymorphic_type_binfo_p (TYPE_BINFO (t1))
+          || !polymorphic_type_binfo_p (TYPE_BINFO (t2))))
     {
+      if (TREE_CODE (t1) != TREE_CODE (t2))
+	return false;
+      if ((TYPE_NAME (t1) == NULL_TREE) != (TYPE_NAME (t2) == NULL_TREE))
+	return false;
+      if (TYPE_NAME (t1) && DECL_NAME (TYPE_NAME (t1)) != DECL_NAME (TYPE_NAME (t2)))
+	return false;
       /* This should really be a pair hash, but for the moment we do not need
 	 100% reliability and it would be better to compare all ODR types so
 	 recursion here is needed only for component types.  */
@@ -478,6 +510,7 @@ odr_subtypes_equivalent_p (tree t1, tree
 	return true;
       return odr_types_equivalent_p (t1, t2, false, NULL, visited);
     }
+
   return types_same_for_odr (t1, t2);
 }
 
@@ -1148,8 +1218,14 @@ add_type_duplicate (odr_type val, tree t
 	 to external declarations of methods that may be defined in the
 	 merged LTO unit.  For this reason we absolutely need to remove
 	 them and replace by internal variants. Not doing so will lead
-         to incomplete answers from possible_polymorphic_call_targets.  */
+         to incomplete answers from possible_polymorphic_call_targets.
+
+	 FIXME: disable for now; because ODR types are now built during
+	 streaming in, the variants do not need to be linked to the type,
+	 yet.  We need to do the merging in cleanup pass to be implemented
+	 soon.  */
       if (!flag_ltrans && merge
+	  && 0
 	  && TREE_CODE (val->type) == RECORD_TYPE
 	  && TREE_CODE (type) == RECORD_TYPE
 	  && TYPE_BINFO (val->type) && TYPE_BINFO (type)
@@ -1281,6 +1356,20 @@ get_odr_type (tree type, bool insert)
   return val;
 }
 
+/* Add TYPE to ODR type hash.  */
+
+void
+register_odr_type (tree type)
+{
+  if (!odr_hash)
+    odr_hash = new odr_hash_type (23);
+  /* Arrange things to be nicer and insert main variants first.  */
+  if (odr_type_p (TYPE_MAIN_VARIANT (type)))
+    get_odr_type (TYPE_MAIN_VARIANT (type), true);
+  if (TYPE_MAIN_VARIANT (type) != type)
+    get_odr_type (type, true);
+}
+
 /* Dump ODR type T and all its derrived type.  INDENT specify indentation for
    recusive printing.  */
 
Index: ipa-utils.h
===================================================================
--- ipa-utils.h	(revision 215103)
+++ ipa-utils.h	(working copy)
@@ -152,6 +152,7 @@ tree vtable_pointer_value_to_binfo (cons
 bool vtable_pointer_value_to_vtable (const_tree, tree *, unsigned HOST_WIDE_INT *);
 void compare_virtual_tables (varpool_node *, varpool_node *);
 bool contains_polymorphic_type_p (const_tree);
+void register_odr_type (tree);
 
 /* Return vector containing possible targets of polymorphic call E.
    If FINALP is non-NULL, store true if the list is complette. 
@@ -239,6 +240,23 @@ possible_polymorphic_call_target_p (tree
 					     context,
 					     n);
 }
+
+/* Return true if T is a type with One Definition Rule info attached.
+   It means that either it is a type in an anonymous namespace or it has
+   an assembler name set.  */
+
+static inline bool
+odr_type_p (const_tree t)
+{
+  if (type_in_anonymous_namespace_p (t))
+    return true;
+  /* We do not have this information when not in LTO, but we do not need
+     to care, since it is used only for type merging.  */
+  gcc_assert (in_lto_p || flag_lto);
+
+  return (TYPE_NAME (t)
+          && (DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (t))));
+}
 #endif  /* GCC_IPA_UTILS_H  */
 
 
Index: lto/lto.c
===================================================================
--- lto/lto.c	(revision 215103)
+++ lto/lto.c	(working copy)
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.
 #include "pass_manager.h"
 #include "ipa-inline.h"
 #include "params.h"
+#include "ipa-utils.h"
 
 
 /* Number of parallel tasks to run, -1 if we want to use GNU Make jobserver.  */
@@ -1911,7 +1911,11 @@ lto_read_decls (struct lto_file_decl_dat
 	      /* Compute the canonical type of all types.
 		 ???  Should be able to assert that !TYPE_CANONICAL.  */
 	      if (TYPE_P (t) && !TYPE_CANONICAL (t))
-		gimple_register_canonical_type (t);
+		{
+		  gimple_register_canonical_type (t);
+		  if (odr_type_p (t))
+		    register_odr_type (t);
+		}
 	      /* Link shared INTEGER_CSTs into TYPE_CACHED_VALUEs of its
 		 type which is also member of this SCC.  */
 	      if (TREE_CODE (t) == INTEGER_CST
Index: langhooks.c
===================================================================
--- langhooks.c	(revision 215103)
+++ langhooks.c	(working copy)
@@ -147,6 +147,11 @@ lhd_set_decl_assembler_name (tree decl)
 {
   tree id;
 
+  /* set_decl_assembler_name may be called on TYPE_DECL to record ODR
+     name for C++ types.  By default types have no ODR names.  */
+  if (TREE_CODE (decl) == TYPE_DECL)
+    return;
+
   /* The language-independent code should never use the
      DECL_ASSEMBLER_NAME for lots of DECLs.  Only FUNCTION_DECLs and
      VAR_DECLs for variables with static storage duration need a real
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 215103)
+++ doc/invoke.texi	(working copy)
@@ -8997,6 +8997,12 @@ The value @code{one} specifies that exac
 used while the value @code{none} bypasses partitioning and executes
 the link-time optimization step directly from the WPA phase.
 
+@item -flto-odr-type-merging
+@opindex flto-odr-type-merging
+Enable streaming of mangled type names of C++ types and their unification
+at link time.  This increases the size of LTO object files, but enables
+diagnostics about One Definition Rule violations.
+
 @item -flto-compression-level=@var{n}
 This option specifies the level of compression used for intermediate
 language written to LTO object files, and is only meaningful in
Index: tree.c
===================================================================
--- tree.c	(revision 215103)
+++ tree.c	(working copy)
@@ -4980,6 +4981,15 @@ free_lang_data_in_type (tree type)
 static inline bool
 need_assembler_name_p (tree decl)
 {
+  /* We use DECL_ASSEMBLER_NAME to hold mangled type names for One Definition Rule
+     merging.  */
+  if (flag_lto_odr_type_mering
+      && TREE_CODE (decl) == TYPE_DECL
+      && DECL_NAME (decl)
+      && decl == TYPE_NAME (TREE_TYPE (decl))
+      && !is_lang_specific (TREE_TYPE (decl))
+      && !type_in_anonymous_namespace_p (TREE_TYPE (decl)))
+    return !DECL_ASSEMBLER_NAME_SET_P (decl);
   /* Only FUNCTION_DECLs and VAR_DECLs are considered.  */
   if (TREE_CODE (decl) != FUNCTION_DECL
       && TREE_CODE (decl) != VAR_DECL)
Index: tree.h
===================================================================
--- tree.h	(revision 215103)
+++ tree.h	(working copy)
@@ -2344,7 +2344,11 @@ extern void decl_value_expr_insert (tree
 
 /* The name of the object as the assembler will see it (but before any
    translations made by ASM_OUTPUT_LABELREF).  Often this is the same
-   as DECL_NAME.  It is an IDENTIFIER_NODE.  */
+   as DECL_NAME.  It is an IDENTIFIER_NODE.
+
+   ASSEMBLER_NAME of TYPE_DECLS may store global name of type used for
+   One Definition Rule based type merging at LTO.  It is computed only for
+   LTO compilation and C++.  */
 #define DECL_ASSEMBLER_NAME(NODE) decl_assembler_name (NODE)
 
 /* Return true if NODE is a NODE that can contain a DECL_ASSEMBLER_NAME.

