This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFC: variant and ODR based type merging during LTO streaming


On Mon, 1 Oct 2018, Jan Hubicka wrote:

> > 
> > The ODR savings really look good but as you say the implementation is
> > somewhat "tricky".
> > 
> > I'd like to see the type-variant done separately (of course) and
> > also differently.  This is because when enabling 
> > free-lang-data by default we should be able to get benefits for
> > non-LTO compilations as well.  Specifically I'd like us to
> > (in free-lang-data) "canonicalize" existing variants applying the
> > rules you derived for the lazier compare.  At the point you then
> > drop non-fld-walked variants we could weed out the resulting
> > duplicates, keeping only a prevailing one in the list and (ab-)using
> > GC to fixup references.  Like with ggc_register_fixup (void *from, void 
> > *to) which would when following edges at mark time, adjust them
> > according to this map (I'm not sure if we'd want to do it manually
> > and record a vector of possibly affected TREE_TYPE uses during the fld
> > walk).  We could also (easily?) optimize the streaming itself by
> > streaming only differences to the main variant of a type (but then
> > we could change our tree data structure to make that trivial).
> 
> I had patch for streaming the differences only some time ago. My recollection
> is that we got it into tree and then reverted as there was some issues with
> cycles.  I can look into this again.
> 
> I would really like to avoid using ggc for IL rewriting. One reason is
> that eventually we would like to see GGC to go and thus we should not
> wire it more into the essential parts of the compiler and other is that
> GGC is often not run. 

Heh.

> Saving type variants seems to have relatively minor effect on the IL size, so
> we need to have solution that is not too expensive to be justified.  I suppose
> free lang data is sort of visiting all the relevant datastructures for
> middle-end so we could do rewriting there.  It would also make sense to
> canonicalize types more based on knowledge whether they are used for memory
> access (i.e. for non-accesses go to main variant and perhaps invent something
> like main variant WRT useless conversions).

I guess for that it would be better to put things like memory access
alignment into the memory-references rather than using TYPE_ALIGN.
Likewise for other things affecting semantics (TYPE_REF_CAN_ALIAS_ALL).

Being able to simply drop to TYPE_MAIN_VARIANT would be very appealing...
(and simplify things).  The hard thing is to figure out where we look
into those variant differences during late compilation...

Type variants was always my first "easy" middle-end type related cleanup.
And for non-LTO ODR merging probably gets us nothing (the FE should do
the merging).

> We could ggc_free the old variant and then ggc will ICE on dangling pointers.
> I can give it a try with my patch to prune the variant tree.

Yes, definitely do that.  I also _really_ like us to do FLD 
unconditionally - maybe we can start by guarding individual pieces with
a for_lto flag we pass down.  But esp. disabling langhooks and stuff like
the variant type purging would be good to get enabled for non-LTO
to not have that modes diverge more and more...

> > 
> > I wonder how much "ODR" merging we'd get for free when we canonicalize
> > types some more - that is, how do ODR equal but tree unequal types
> > usually differ?  I guess it might be most of the time fields with
> > pointer to incomplete vs. complete type?
> 
> Most of the time it is complete vs incomplete pointer type.
> I had statistics of type duplicates before, but they did not look as bad
> as reality. The reason is that smaller types tends to be merged while
> bigger types tends to be duplicated many times.  In GCC this is typically
> RTL, gimple and derived types which are really many.

I see.  So one possible canonicalization is to make _all_
pointer-typed FIELD_DECLs point to incomplete variants since the memory
accesses should already have the "proper" access types.  Can you
get statistics on that?  Not sure how to get an "incomplete" type
though (iff we can simply copy the type and NULL TYPE_FIELDs and
TYPE_SIZE and friends) - again I'd do that at FLD time.

Richard.

> Honza
> > 
> > Thanks,
> > Richard.
> > 
> > 
> > > Honza
> > > 
> > > Index: ipa-devirt.c
> > > ===================================================================
> > > --- ipa-devirt.c	(revision 264689)
> > > +++ ipa-devirt.c	(working copy)
> > > @@ -2111,6 +2111,29 @@ get_odr_type (tree type, bool insert)
> > >    return val;
> > >  }
> > >  
> > > +/* Return the main variant of the odr type.  This is used for straming out
> > > +   to reduce number of type duplicates hitting the WPA->LTRANS streams.
> > > +   Do not do so when ODR violation was detected since the type may be
> > > +   structurally different then.  */
> > > +
> > > +tree
> > > +prevailing_odr_type (tree t)
> > > +{
> > > +  t = TYPE_MAIN_VARIANT (t);
> > > +  /* In need_assembler_name_p we also mangle assembler names of INTEGER_TYPE.
> > > +     We can not merge these because this does not honnor precision and
> > > +     signedness.  */
> > > +  if (!type_with_linkage_p (t)
> > > +      || type_in_anonymous_namespace_p (t)
> > > +      || TREE_CODE (t) == INTEGER_TYPE
> > > +      || !COMPLETE_TYPE_P (t))
> > > +    return t;
> > > +  odr_type ot = get_odr_type (t, true);
> > > +  if (!ot || !ot->odr_violated)
> > > +    return ot->type;
> > > +  return t;
> > > +}
> > > +
> > >  /* Add TYPE od ODR type hash.  */
> > >  
> > >  void
> > > Index: ipa-utils.h
> > > ===================================================================
> > > --- ipa-utils.h	(revision 264689)
> > > +++ ipa-utils.h	(working copy)
> > > @@ -90,6 +90,7 @@ void warn_types_mismatch (tree t1, tree
> > >  			  location_t loc2 = UNKNOWN_LOCATION);
> > >  bool odr_or_derived_type_p (const_tree t);
> > >  bool odr_types_equivalent_p (tree type1, tree type2);
> > > +tree prevailing_odr_type (tree t);
> > >  
> > >  /* Return vector containing possible targets of polymorphic call E.
> > >     If COMPLETEP is non-NULL, store true if the list is complete. 
> > > Index: lto/lto.c
> > > ===================================================================
> > > --- lto/lto.c	(revision 264689)
> > > +++ lto/lto.c	(working copy)
> > > @@ -485,6 +485,8 @@ gimple_register_canonical_type_1 (tree t
> > >  static void
> > >  gimple_register_canonical_type (tree t)
> > >  {
> > > +  if (flag_checking)
> > > +    verify_type (t);
> > >    if (TYPE_CANONICAL (t) || !type_with_alias_set_p (t)
> > >        || !canonical_type_used_p (t))
> > >      return;
> > > Index: lto-streamer-out.c
> > > ===================================================================
> > > --- lto-streamer-out.c	(revision 264689)
> > > +++ lto-streamer-out.c	(working copy)
> > > @@ -43,6 +43,8 @@ along with GCC; see the file COPYING3.
> > >  #include "debug.h"
> > >  #include "omp-offload.h"
> > >  #include "print-tree.h"
> > > +#include "attribs.h"
> > > +#include "ipa-utils.h"
> > >  
> > >  
> > >  static void lto_write_tree (struct output_block*, tree, bool);
> > > @@ -109,14 +111,188 @@ destroy_output_block (struct output_bloc
> > >    free (ob);
> > >  }
> > >  
> > > +/* Do same comparsion as check_qualified_type skipping lang part of type
> > > +   and be more permissive about type names: we only care that names are
> > > +   same (for diagnostics) and that ODR names are the same.  */
> > >  
> > > -/* Look up NODE in the type table and write the index for it to OB.  */
> > > +static bool
> > > +types_equal_p (tree t, tree v)
> > > +{
> > > +  if (TYPE_QUALS (t) != TYPE_QUALS (v))
> > > +    return false;
> > > +
> > > +  if (TYPE_NAME (t) != TYPE_NAME (v)
> > > +      && (!TYPE_NAME (t) || !TYPE_NAME (v)
> > > +	  || TREE_CODE (TYPE_NAME (t)) != TYPE_DECL
> > > +	  || TREE_CODE (TYPE_NAME (v)) != TYPE_DECL
> > > +	  || DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (t))
> > > +	     != DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (v))
> > > +	  || DECL_NAME (TYPE_NAME (t)) != DECL_NAME (TYPE_NAME (v))))
> > > +     return false;
> > > +
> > > +  if (TYPE_ALIGN (t) != TYPE_ALIGN (v))
> > > +    return false;
> > > +
> > > +  if (!attribute_list_equal (TYPE_ATTRIBUTES (t),
> > > +			     TYPE_ATTRIBUTES (v)))
> > > +     return false;
> > > +
> > > +  /* Do not replace complete type by incomplete.  */
> > > +  if (COMPLETE_TYPE_P (t) != COMPLETE_TYPE_P (v))
> > > +    return false;
> > > +
> > > +  if (TYPE_ADDR_SPACE (t) != TYPE_ADDR_SPACE (v))
> > > +    return false;
> > > +
> > > +  gcc_assert (TREE_CODE (t) == TREE_CODE (v));
> > > +
> > > +  /* For pointer types and array types we also care about the type they
> > > +     reffer to.  */
> > > +  if (TREE_TYPE (t))
> > > +    return types_equal_p (TREE_TYPE (t), TREE_TYPE (v));
> > > +
> > > +  return true;
> > > +}
> > > +
> > > +/* Search list of type variants and look up first one that looks same.
> > > +   At compile time, this removes duplicates of types created by front-ends.
> > > +   At WPA time we also merge duplicated ODR types.  */
> > > +
> > > +static tree
> > > +first_compatible_type_variant (tree t)
> > > +{
> > > +  tree first = NULL;
> > > +  if (POINTER_TYPE_P (t))
> > > +    {
> > > +      tree t2 = first_compatible_type_variant (TREE_TYPE (t));
> > > +      if (t2 != TREE_TYPE (t))
> > > +	{
> > > +	  if (TREE_CODE (t) == POINTER_TYPE)
> > > +	    first = build_pointer_type_for_mode (t2, TYPE_MODE (t),
> > > +						TYPE_REF_CAN_ALIAS_ALL (t));
> > > +	  else
> > > +	    first = build_reference_type_for_mode (t2, TYPE_MODE (t),
> > > +						TYPE_REF_CAN_ALIAS_ALL (t));
> > > +	}
> > > +      else
> > > +	first = TYPE_MAIN_VARIANT (t);
> > > +      gcc_assert (TYPE_MAIN_VARIANT (first) == first);
> > > +    }
> > > +  else
> > > +    first = prevailing_odr_type (t);
> > > +
> > > +  gcc_checking_assert (gimple_canonical_types_compatible_p (t, first, false));
> > > +
> > > +  for (tree v = first; v; v = TYPE_NEXT_VARIANT (v))
> > > +    if (types_equal_p (t, v))
> > > +      {
> > > +	gcc_checking_assert (gimple_canonical_types_compatible_p (t, v, false));
> > > +	gcc_checking_assert (!in_lto_p || TYPE_CANONICAL (t) == TYPE_CANONICAL (v));
> > > +	gcc_checking_assert (get_alias_set (t) == get_alias_set (v));
> > > +        return v;
> > > +      }
> > > +
> > > +  /* Most of the time we will find v itself in the list.  With ODR type merging
> > > +     we however merge across TYPE_MAIN_VARIANTs and in that case we may not
> > > +     have corresponding type in the list.  */
> > > +  tree v = build_variant_type_copy (first);
> > > +  TYPE_READONLY (v) = TYPE_READONLY (t);
> > > +  TYPE_VOLATILE (v) = TYPE_VOLATILE (t);
> > > +  TYPE_ATOMIC (v) = TYPE_ATOMIC (t);
> > > +  TYPE_RESTRICT (v) = TYPE_RESTRICT (t);
> > > +  TYPE_ADDR_SPACE (v) = TYPE_ADDR_SPACE (t);
> > > +  SET_TYPE_ALIGN (v, TYPE_ALIGN (t));
> > > +  TYPE_NAME (v) = TYPE_NAME (t);
> > > +  TYPE_ATTRIBUTES (v) = TYPE_ATTRIBUTES (t);
> > > +  TREE_TYPE (v) = TREE_TYPE (t);
> > > +  gcc_checking_assert (types_equal_p (t, v));
> > > +  gcc_checking_assert (gimple_canonical_types_compatible_p (t, v, false));
> > > +  gcc_checking_assert (!in_lto_p || TYPE_CANONICAL (t) == TYPE_CANONICAL (v));
> > > +  gcc_checking_assert (get_alias_set (t) == get_alias_set (v));
> > > +  return v;
> > > +}
> > > +
> > > +
> > > +/* Look up NODE in the type table and write the index for it to OB.
> > > +   Eliminate unnecesary type variants.  */
> > >  
> > >  static void
> > >  output_type_ref (struct output_block *ob, tree node)
> > >  {
> > > +  bool existed_p;
> > > +  unsigned int &index
> > > +    = ob->decl_state->streams[LTO_DECL_STREAM_TYPE].tree_hash_table->
> > > +       get_or_insert (node, &existed_p);
> > >    streamer_write_record_start (ob, LTO_type_ref);
> > > -  lto_output_type_ref_index (ob->decl_state, ob->main_stream, node);
> > > +
> > > +  if (existed_p)
> > > +    {
> > > +      streamer_write_uhwi_stream (ob->main_stream, index);
> > > +      return;
> > > +    }
> > > +
> > > +  tree node2 = first_compatible_type_variant (node);
> > > +
> > > +  /* If NODE is first variant, just add it into encoder.  */
> > > +  if (node2 == node)
> > > +    {
> > > +      index = ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.length ();
> > > +      if (streamer_dump_file)
> > > +	{
> > > +	  print_node_brief (streamer_dump_file,
> > > +			    "    Encoding indexable ", node, 4);
> > > +	  fprintf (streamer_dump_file, "  as %i\n", index);
> > > +	}
> > > +      ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.safe_push (node);
> > > +      streamer_write_uhwi_stream (ob->main_stream, index);
> > > +      return;
> > > +    }
> > > +
> > > +  index = 0xdead;
> > > +
> > > +  if (streamer_dump_file)
> > > +    {
> > > +      print_node_brief (streamer_dump_file, "    Type ", node, 4);
> > > +      fprintf (streamer_dump_file, "  prevailed by ");
> > > +      print_node_brief (streamer_dump_file, "", node2, 4);
> > > +      fprintf (streamer_dump_file, "\n");
> > > +    }
> > > +
> > > +  /* See if we already assgined one to the first variant.  If that is the case
> > > +     only duplicate the entry in the hashtable so we do not need to call
> > > +     first_compatible_type_variant redundantly.  */
> > > +  unsigned int &index2
> > > +    = ob->decl_state->streams[LTO_DECL_STREAM_TYPE].tree_hash_table->
> > > +	get_or_insert (node2, &existed_p);
> > > +  /* The reference may be changed by hash table expansion.  */
> > > +  unsigned int i = index2;
> > > +
> > > +  if (existed_p)
> > > +    {
> > > +      ob->decl_state->streams[LTO_DECL_STREAM_TYPE].
> > > +	tree_hash_table->get_or_insert (node, &existed_p) = i;
> > > +      streamer_write_uhwi_stream (ob->main_stream, i);
> > > +      return;
> > > +    }
> > > +  else
> > > +    {
> > > +      index2 = ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.length ();
> > > +      i = index2;
> > > +      if (streamer_dump_file)
> > > +	{
> > > +	  print_node_brief (streamer_dump_file,
> > > +			    "    Encoding indexable ", node2, 4);
> > > +	  fprintf (streamer_dump_file, "  as %i \n", i);
> > > +	}
> > > +      ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.safe_push (node2);
> > > +      /* We need to search again to watch hash table resize.  */
> > > +      ob->decl_state->streams[LTO_DECL_STREAM_TYPE].
> > > +	tree_hash_table->get_or_insert (node, &existed_p) = i;
> > > +      gcc_assert (existed_p);
> > > +      streamer_write_uhwi_stream (ob->main_stream, i);
> > > +    }
> > > +  return;
> > >  }
> > >  
> > >  
> > > @@ -978,12 +1171,15 @@ hash_tree (struct streamer_tree_cache_d
> > >  #define visit(SIBLING) \
> > >    do { \
> > >      unsigned ix; \
> > > -    if (!SIBLING) \
> > > +    tree t2 = SIBLING; \
> > > +    if (t2 && TYPE_P (t2)) \
> > > +      t2 = first_compatible_type_variant (t2); \
> > > +    if (!t2) \
> > >        hstate.add_int (0); \
> > > -    else if (streamer_tree_cache_lookup (cache, SIBLING, &ix)) \
> > > +    else if (streamer_tree_cache_lookup (cache, t2, &ix)) \
> > >        hstate.add_int (streamer_tree_cache_get_hash (cache, ix)); \
> > >      else if (map) \
> > > -      hstate.add_int (*map->get (SIBLING)); \
> > > +      hstate.add_int (*map->get (t2)); \
> > >      else \
> > >        hstate.add_int (1); \
> > >    } while (0)
> > > @@ -1554,6 +1750,10 @@ DFS::DFS_write_tree (struct output_block
> > >    if (this_ref_p && tree_is_indexable (expr))
> > >      return;
> > >  
> > > +  /* Replace type by its prevailing variant.  */
> > > +  if (TYPE_P (expr))
> > > +    expr = first_compatible_type_variant (expr);
> > > +
> > >    /* Check if we already streamed EXPR.  */
> > >    if (streamer_tree_cache_lookup (ob->writer_cache, expr, NULL))
> > >      return;
> > > @@ -1591,6 +1791,11 @@ lto_output_tree (struct output_block *ob
> > >        return;
> > >      }
> > >  
> > > +  if (TYPE_P (expr))
> > > +    expr = first_compatible_type_variant (expr);
> > > +
> > > +  gcc_checking_assert (!this_ref_p || !tree_is_indexable (expr));
> > > +
> > >    existed_p = streamer_tree_cache_lookup (ob->writer_cache, expr, &ix);
> > >    if (existed_p)
> > >      {
> > > @@ -2334,7 +2539,18 @@ copy_function_or_variable (struct symtab
> > >        gcc_assert (lto_tree_ref_encoder_size (encoder) == 0);
> > >        encoder->trees.reserve_exact (n);
> > >        for (j = 0; j < n; j++)
> > > -	encoder->trees.safe_push ((*trees)[j]);
> > > +	{
> > > +	  tree t = (*trees)[j];
> > > +	  if (TYPE_P (t))
> > > +	    t = first_compatible_type_variant (t);
> > > +	  if (streamer_dump_file)
> > > +	    {
> > > +	      print_node_brief (streamer_dump_file,
> > > +				"  Adding reference to ", t, 4);
> > > +	      fprintf (streamer_dump_file, "  as %i \n", (int)j);
> > > +	    }
> > > +	  encoder->trees.safe_push (t);
> > > +	}
> > >      }
> > >  
> > >    lto_free_raw_section_data (file_data, LTO_section_function_body, name,
> > > @@ -2507,6 +2723,8 @@ write_global_stream (struct output_block
> > >  	  print_node_brief (streamer_dump_file, "", t, 4);
> > >            fprintf (streamer_dump_file, "\n");
> > >  	}
> > > +      gcc_checking_assert (!TYPE_P (t)
> > > +			   || t == first_compatible_type_variant (t));
> > >        if (!streamer_tree_cache_lookup (ob->writer_cache, t, NULL))
> > >  	stream_write_tree (ob, t, false);
> > >      }
> > > @@ -2537,6 +2755,8 @@ write_global_references (struct output_b
> > >        t = lto_tree_ref_encoder_get_tree (encoder, index);
> > >        streamer_tree_cache_lookup (ob->writer_cache, t, &slot_num);
> > >        gcc_assert (slot_num != (unsigned)-1);
> > > +      gcc_checking_assert (!TYPE_P (t)
> > > +			   || t == first_compatible_type_variant (t));
> > >        data[index + 1] = slot_num;
> > >      }
> > >  
> > > Index: tree-ssa-alias.c
> > > ===================================================================
> > > --- tree-ssa-alias.c	(revision 264689)
> > > +++ tree-ssa-alias.c	(working copy)
> > > @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.
> > >  #include "tree-dfa.h"
> > >  #include "ipa-reference.h"
> > >  #include "varasm.h"
> > > +#include "ipa-utils.h"
> > >  
> > >  /* Broad overview of how alias analysis on gimple works:
> > >  
> > > @@ -955,6 +956,12 @@ nonoverlapping_component_refs_of_decl_p
> > >  	     ??? Bitfields can overlap at RTL level so punt on them.  */
> > >  	  if (DECL_BIT_FIELD (field1) && DECL_BIT_FIELD (field2))
> > >  	    return false;
> > > +	  /* In LTO we may end up merging different variants of ODR types
> > > +	     but keep corresponding field decls unmerged.  */
> > > +	  if (in_lto_p
> > > +	      && type_with_linkage_p (type1)
> > > +	      && DECL_NAME (field1) == DECL_NAME (field2))
> > > +	    return false;
> > >  	  return true;
> > >  	}
> > >      }
> > > @@ -1049,7 +1056,9 @@ nonoverlapping_component_refs_p (const_t
> > >  	{
> > >  	  /* We're left with accessing different fields of a structure,
> > >  	     no possible overlap.  */
> > > -	  if (fieldx != fieldy)
> > > +	  if (fieldx != fieldy
> > > +	      && (in_lto_p || !type_with_linkage_p (typex)
> > > +		  || DECL_NAME (fieldx) == DECL_NAME (fieldy)))
> > >  	    {
> > >  	      /* A field and its representative need to be considered the
> > >  		 same.  */
> > > Index: tree.c
> > > ===================================================================
> > > --- tree.c	(revision 264689)
> > > +++ tree.c	(working copy)
> > > @@ -5845,7 +5861,12 @@ free_lang_data_in_cgraph (void)
> > >  
> > >    /* Traverse every type found freeing its language data.  */
> > >    FOR_EACH_VEC_ELT (fld.types, i, t)
> > > -    free_lang_data_in_type (t);
> > > +    {
> > > +      free_lang_data_in_type (t);
> > > +      while (TYPE_NEXT_VARIANT (t)
> > > +	     && !fld.pset.contains (TYPE_NEXT_VARIANT (t)))
> > > +	TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (TYPE_NEXT_VARIANT (t));
> > > +    }
> > >    if (flag_checking)
> > >      {
> > >        FOR_EACH_VEC_ELT (fld.types, i, t)
> > > 
> > > 
> > 
> > -- 
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]