RFC: variant and ODR based type merging during LTO streaming
Richard Biener
rguenther@suse.de
Mon Oct 1 12:13:00 GMT 2018
On Mon, 1 Oct 2018, Richard Biener wrote:
> On Mon, 1 Oct 2018, Jan Hubicka wrote:
>
> > >
> > > The ODR savings really look good but as you say the implementation is
> > > somewhat "tricky".
> > >
> > > I'd like to see the type-variant done separately (of course) and
> > > also differently. This is because when enabling
> > > free-lang-data by default we should be able to get benefits for
> > > non-LTO compilations as well. Specifically I'd like us to
> > > (in free-lang-data) "canonicalize" existing variants applying the
> > > rules you derived for the lazier compare. At the point you then
> > > drop non-fld-walked variants we could weed out the resulting
> > > duplicates, keeping only a prevailing one in the list and (ab-)using
> > > GC to fixup references. Like with ggc_register_fixup (void *from, void
> > > *to) which would when following edges at mark time, adjust them
> > > according to this map (I'm not sure if we'd want to do it manually
> > > and record a vector of possibly affected TREE_TYPE uses during the fld
> > > walk). We could also (easily?) optimize the streaming itself by
> > > streaming only differences to the main variant of a type (but then
> > > we could change our tree data structure to make that trivial).
> >
> > I had patch for streaming the differences only some time ago. My recollection
> > is that we got it into tree and then reverted as there was some issues with
> > cycles. I can look into this again.
> >
> > I would really like to avoid using ggc for IL rewriting. One reason is
> > that eventually we would like to see GGC to go and thus we should not
> > wire it more into the essential parts of the compiler and other is that
> > GGC is often not run.
>
> Heh.
>
> > Saving type variants seems to have relatively minor effect on the IL size, so
> > we need to have solution that is not too expensive to be justified. I suppose
> > free lang data is sort of visiting all the relevant datastructures for
> > middle-end so we could do rewriting there. It would also make sense to
> > canonicalize types more based on knowledge whether they are used for memory
> > access (i.e. for non-accesses go to main variant and perhaps invent something
> > like main variant WRT useless conversions).
>
> I guess for that it would be better to put things like memory access
> alignment into the memory-references rather than using TYPE_ALIGN.
> Likewise for other things affecting semantics (TYPE_REF_CAN_ALIAS_ALL).
>
> Being able to simply drop to TYPE_MAIN_VARIANT would be very appealing...
> (and simplify things). The hard thing is to figure out where we look
> into those variant differences during late compilation...
>
> Type variants was always my first "easy" middle-end type related cleanup.
> And for non-LTO ODR merging probably gets us nothing (the FE should do
> the merging).
>
> > We could ggc_free the old variant and then ggc will ICE on dangling pointers.
> > I can give it a try with my patch to prune the variant tree.
>
> Yes, definitely do that. I also _really_ like us to do FLD
> unconditionally - maybe we can start by guarding individual pieces with
> a for_lto flag we pass down. But esp. disabling langhooks and stuff like
> the variant type purging would be good to get enabled for non-LTO
> to not have that modes diverge more and more...
>
> > >
> > > I wonder how much "ODR" merging we'd get for free when we canonicalize
> > > types some more - that is, how do ODR equal but tree unequal types
> > > usually differ? I guess it might be most of the time fields with
> > > pointer to incomplete vs. complete type?
> >
> > Most of the time it is complete vs incomplete pointer type.
> > I had statistics of type duplicates before, but they did not look as bad
> > as reality. The reason is that smaller types tends to be merged while
> > bigger types tends to be duplicated many times. In GCC this is typically
> > RTL, gimple and derived types which are really many.
>
> I see. So one possible canonicalization is to make _all_
> pointer-typed FIELD_DECLs point to incomplete variants since the memory
> accesses should already have the "proper" access types. Can you
> get statistics on that? Not sure how to get an "incomplete" type
> though (iff we can simply copy the type and NULL TYPE_FIELDs and
> TYPE_SIZE and friends) - again I'd do that at FLD time.
So sth like
tp = build_distinct_type_copy (t);
TYPE_FIELDS (tp) = NULL_TREE;
TYPE_SIZE (tp) = NULL_TREE;
TYPE_SIZE_UNIT (tp) = NULL_TREE;
tp = type_hash_canon (tp);
of course we "leak" the original type in used COMPONENT_REFs
(may also cause some verifier ICEs here if the types mismatch that
of the FIELD_DECLs) and in aggregate copies, etc. But I wonder
how much "unused" unnecessary types we have. That is, I'd paper
over the ICEs this causes and not fixup the IL stream at first for
example.
Richard.
> Richard.
>
> > Honza
> > >
> > > Thanks,
> > > Richard.
> > >
> > >
> > > > Honza
> > > >
> > > > Index: ipa-devirt.c
> > > > ===================================================================
> > > > --- ipa-devirt.c (revision 264689)
> > > > +++ ipa-devirt.c (working copy)
> > > > @@ -2111,6 +2111,29 @@ get_odr_type (tree type, bool insert)
> > > > return val;
> > > > }
> > > >
> > > > +/* Return the main variant of the odr type. This is used for straming out
> > > > + to reduce number of type duplicates hitting the WPA->LTRANS streams.
> > > > + Do not do so when ODR violation was detected since the type may be
> > > > + structurally different then. */
> > > > +
> > > > +tree
> > > > +prevailing_odr_type (tree t)
> > > > +{
> > > > + t = TYPE_MAIN_VARIANT (t);
> > > > + /* In need_assembler_name_p we also mangle assembler names of INTEGER_TYPE.
> > > > + We can not merge these because this does not honnor precision and
> > > > + signedness. */
> > > > + if (!type_with_linkage_p (t)
> > > > + || type_in_anonymous_namespace_p (t)
> > > > + || TREE_CODE (t) == INTEGER_TYPE
> > > > + || !COMPLETE_TYPE_P (t))
> > > > + return t;
> > > > + odr_type ot = get_odr_type (t, true);
> > > > + if (!ot || !ot->odr_violated)
> > > > + return ot->type;
> > > > + return t;
> > > > +}
> > > > +
> > > > /* Add TYPE od ODR type hash. */
> > > >
> > > > void
> > > > Index: ipa-utils.h
> > > > ===================================================================
> > > > --- ipa-utils.h (revision 264689)
> > > > +++ ipa-utils.h (working copy)
> > > > @@ -90,6 +90,7 @@ void warn_types_mismatch (tree t1, tree
> > > > location_t loc2 = UNKNOWN_LOCATION);
> > > > bool odr_or_derived_type_p (const_tree t);
> > > > bool odr_types_equivalent_p (tree type1, tree type2);
> > > > +tree prevailing_odr_type (tree t);
> > > >
> > > > /* Return vector containing possible targets of polymorphic call E.
> > > > If COMPLETEP is non-NULL, store true if the list is complete.
> > > > Index: lto/lto.c
> > > > ===================================================================
> > > > --- lto/lto.c (revision 264689)
> > > > +++ lto/lto.c (working copy)
> > > > @@ -485,6 +485,8 @@ gimple_register_canonical_type_1 (tree t
> > > > static void
> > > > gimple_register_canonical_type (tree t)
> > > > {
> > > > + if (flag_checking)
> > > > + verify_type (t);
> > > > if (TYPE_CANONICAL (t) || !type_with_alias_set_p (t)
> > > > || !canonical_type_used_p (t))
> > > > return;
> > > > Index: lto-streamer-out.c
> > > > ===================================================================
> > > > --- lto-streamer-out.c (revision 264689)
> > > > +++ lto-streamer-out.c (working copy)
> > > > @@ -43,6 +43,8 @@ along with GCC; see the file COPYING3.
> > > > #include "debug.h"
> > > > #include "omp-offload.h"
> > > > #include "print-tree.h"
> > > > +#include "attribs.h"
> > > > +#include "ipa-utils.h"
> > > >
> > > >
> > > > static void lto_write_tree (struct output_block*, tree, bool);
> > > > @@ -109,14 +111,188 @@ destroy_output_block (struct output_bloc
> > > > free (ob);
> > > > }
> > > >
> > > > +/* Do same comparsion as check_qualified_type skipping lang part of type
> > > > + and be more permissive about type names: we only care that names are
> > > > + same (for diagnostics) and that ODR names are the same. */
> > > >
> > > > -/* Look up NODE in the type table and write the index for it to OB. */
> > > > +static bool
> > > > +types_equal_p (tree t, tree v)
> > > > +{
> > > > + if (TYPE_QUALS (t) != TYPE_QUALS (v))
> > > > + return false;
> > > > +
> > > > + if (TYPE_NAME (t) != TYPE_NAME (v)
> > > > + && (!TYPE_NAME (t) || !TYPE_NAME (v)
> > > > + || TREE_CODE (TYPE_NAME (t)) != TYPE_DECL
> > > > + || TREE_CODE (TYPE_NAME (v)) != TYPE_DECL
> > > > + || DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (t))
> > > > + != DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (v))
> > > > + || DECL_NAME (TYPE_NAME (t)) != DECL_NAME (TYPE_NAME (v))))
> > > > + return false;
> > > > +
> > > > + if (TYPE_ALIGN (t) != TYPE_ALIGN (v))
> > > > + return false;
> > > > +
> > > > + if (!attribute_list_equal (TYPE_ATTRIBUTES (t),
> > > > + TYPE_ATTRIBUTES (v)))
> > > > + return false;
> > > > +
> > > > + /* Do not replace complete type by incomplete. */
> > > > + if (COMPLETE_TYPE_P (t) != COMPLETE_TYPE_P (v))
> > > > + return false;
> > > > +
> > > > + if (TYPE_ADDR_SPACE (t) != TYPE_ADDR_SPACE (v))
> > > > + return false;
> > > > +
> > > > + gcc_assert (TREE_CODE (t) == TREE_CODE (v));
> > > > +
> > > > + /* For pointer types and array types we also care about the type they
> > > > + reffer to. */
> > > > + if (TREE_TYPE (t))
> > > > + return types_equal_p (TREE_TYPE (t), TREE_TYPE (v));
> > > > +
> > > > + return true;
> > > > +}
> > > > +
> > > > +/* Search list of type variants and look up first one that looks same.
> > > > + At compile time, this removes duplicates of types created by front-ends.
> > > > + At WPA time we also merge duplicated ODR types. */
> > > > +
> > > > +static tree
> > > > +first_compatible_type_variant (tree t)
> > > > +{
> > > > + tree first = NULL;
> > > > + if (POINTER_TYPE_P (t))
> > > > + {
> > > > + tree t2 = first_compatible_type_variant (TREE_TYPE (t));
> > > > + if (t2 != TREE_TYPE (t))
> > > > + {
> > > > + if (TREE_CODE (t) == POINTER_TYPE)
> > > > + first = build_pointer_type_for_mode (t2, TYPE_MODE (t),
> > > > + TYPE_REF_CAN_ALIAS_ALL (t));
> > > > + else
> > > > + first = build_reference_type_for_mode (t2, TYPE_MODE (t),
> > > > + TYPE_REF_CAN_ALIAS_ALL (t));
> > > > + }
> > > > + else
> > > > + first = TYPE_MAIN_VARIANT (t);
> > > > + gcc_assert (TYPE_MAIN_VARIANT (first) == first);
> > > > + }
> > > > + else
> > > > + first = prevailing_odr_type (t);
> > > > +
> > > > + gcc_checking_assert (gimple_canonical_types_compatible_p (t, first, false));
> > > > +
> > > > + for (tree v = first; v; v = TYPE_NEXT_VARIANT (v))
> > > > + if (types_equal_p (t, v))
> > > > + {
> > > > + gcc_checking_assert (gimple_canonical_types_compatible_p (t, v, false));
> > > > + gcc_checking_assert (!in_lto_p || TYPE_CANONICAL (t) == TYPE_CANONICAL (v));
> > > > + gcc_checking_assert (get_alias_set (t) == get_alias_set (v));
> > > > + return v;
> > > > + }
> > > > +
> > > > + /* Most of the time we will find v itself in the list. With ODR type merging
> > > > + we however merge across TYPE_MAIN_VARIANTs and in that case we may not
> > > > + have corresponding type in the list. */
> > > > + tree v = build_variant_type_copy (first);
> > > > + TYPE_READONLY (v) = TYPE_READONLY (t);
> > > > + TYPE_VOLATILE (v) = TYPE_VOLATILE (t);
> > > > + TYPE_ATOMIC (v) = TYPE_ATOMIC (t);
> > > > + TYPE_RESTRICT (v) = TYPE_RESTRICT (t);
> > > > + TYPE_ADDR_SPACE (v) = TYPE_ADDR_SPACE (t);
> > > > + SET_TYPE_ALIGN (v, TYPE_ALIGN (t));
> > > > + TYPE_NAME (v) = TYPE_NAME (t);
> > > > + TYPE_ATTRIBUTES (v) = TYPE_ATTRIBUTES (t);
> > > > + TREE_TYPE (v) = TREE_TYPE (t);
> > > > + gcc_checking_assert (types_equal_p (t, v));
> > > > + gcc_checking_assert (gimple_canonical_types_compatible_p (t, v, false));
> > > > + gcc_checking_assert (!in_lto_p || TYPE_CANONICAL (t) == TYPE_CANONICAL (v));
> > > > + gcc_checking_assert (get_alias_set (t) == get_alias_set (v));
> > > > + return v;
> > > > +}
> > > > +
> > > > +
> > > > +/* Look up NODE in the type table and write the index for it to OB.
> > > > + Eliminate unnecesary type variants. */
> > > >
> > > > static void
> > > > output_type_ref (struct output_block *ob, tree node)
> > > > {
> > > > + bool existed_p;
> > > > + unsigned int &index
> > > > + = ob->decl_state->streams[LTO_DECL_STREAM_TYPE].tree_hash_table->
> > > > + get_or_insert (node, &existed_p);
> > > > streamer_write_record_start (ob, LTO_type_ref);
> > > > - lto_output_type_ref_index (ob->decl_state, ob->main_stream, node);
> > > > +
> > > > + if (existed_p)
> > > > + {
> > > > + streamer_write_uhwi_stream (ob->main_stream, index);
> > > > + return;
> > > > + }
> > > > +
> > > > + tree node2 = first_compatible_type_variant (node);
> > > > +
> > > > + /* If NODE is first variant, just add it into encoder. */
> > > > + if (node2 == node)
> > > > + {
> > > > + index = ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.length ();
> > > > + if (streamer_dump_file)
> > > > + {
> > > > + print_node_brief (streamer_dump_file,
> > > > + " Encoding indexable ", node, 4);
> > > > + fprintf (streamer_dump_file, " as %i\n", index);
> > > > + }
> > > > + ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.safe_push (node);
> > > > + streamer_write_uhwi_stream (ob->main_stream, index);
> > > > + return;
> > > > + }
> > > > +
> > > > + index = 0xdead;
> > > > +
> > > > + if (streamer_dump_file)
> > > > + {
> > > > + print_node_brief (streamer_dump_file, " Type ", node, 4);
> > > > + fprintf (streamer_dump_file, " prevailed by ");
> > > > + print_node_brief (streamer_dump_file, "", node2, 4);
> > > > + fprintf (streamer_dump_file, "\n");
> > > > + }
> > > > +
> > > > + /* See if we already assgined one to the first variant. If that is the case
> > > > + only duplicate the entry in the hashtable so we do not need to call
> > > > + first_compatible_type_variant redundantly. */
> > > > + unsigned int &index2
> > > > + = ob->decl_state->streams[LTO_DECL_STREAM_TYPE].tree_hash_table->
> > > > + get_or_insert (node2, &existed_p);
> > > > + /* The reference may be changed by hash table expansion. */
> > > > + unsigned int i = index2;
> > > > +
> > > > + if (existed_p)
> > > > + {
> > > > + ob->decl_state->streams[LTO_DECL_STREAM_TYPE].
> > > > + tree_hash_table->get_or_insert (node, &existed_p) = i;
> > > > + streamer_write_uhwi_stream (ob->main_stream, i);
> > > > + return;
> > > > + }
> > > > + else
> > > > + {
> > > > + index2 = ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.length ();
> > > > + i = index2;
> > > > + if (streamer_dump_file)
> > > > + {
> > > > + print_node_brief (streamer_dump_file,
> > > > + " Encoding indexable ", node2, 4);
> > > > + fprintf (streamer_dump_file, " as %i \n", i);
> > > > + }
> > > > + ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.safe_push (node2);
> > > > + /* We need to search again to watch hash table resize. */
> > > > + ob->decl_state->streams[LTO_DECL_STREAM_TYPE].
> > > > + tree_hash_table->get_or_insert (node, &existed_p) = i;
> > > > + gcc_assert (existed_p);
> > > > + streamer_write_uhwi_stream (ob->main_stream, i);
> > > > + }
> > > > + return;
> > > > }
> > > >
> > > >
> > > > @@ -978,12 +1171,15 @@ hash_tree (struct streamer_tree_cache_d
> > > > #define visit(SIBLING) \
> > > > do { \
> > > > unsigned ix; \
> > > > - if (!SIBLING) \
> > > > + tree t2 = SIBLING; \
> > > > + if (t2 && TYPE_P (t2)) \
> > > > + t2 = first_compatible_type_variant (t2); \
> > > > + if (!t2) \
> > > > hstate.add_int (0); \
> > > > - else if (streamer_tree_cache_lookup (cache, SIBLING, &ix)) \
> > > > + else if (streamer_tree_cache_lookup (cache, t2, &ix)) \
> > > > hstate.add_int (streamer_tree_cache_get_hash (cache, ix)); \
> > > > else if (map) \
> > > > - hstate.add_int (*map->get (SIBLING)); \
> > > > + hstate.add_int (*map->get (t2)); \
> > > > else \
> > > > hstate.add_int (1); \
> > > > } while (0)
> > > > @@ -1554,6 +1750,10 @@ DFS::DFS_write_tree (struct output_block
> > > > if (this_ref_p && tree_is_indexable (expr))
> > > > return;
> > > >
> > > > + /* Replace type by its prevailing variant. */
> > > > + if (TYPE_P (expr))
> > > > + expr = first_compatible_type_variant (expr);
> > > > +
> > > > /* Check if we already streamed EXPR. */
> > > > if (streamer_tree_cache_lookup (ob->writer_cache, expr, NULL))
> > > > return;
> > > > @@ -1591,6 +1791,11 @@ lto_output_tree (struct output_block *ob
> > > > return;
> > > > }
> > > >
> > > > + if (TYPE_P (expr))
> > > > + expr = first_compatible_type_variant (expr);
> > > > +
> > > > + gcc_checking_assert (!this_ref_p || !tree_is_indexable (expr));
> > > > +
> > > > existed_p = streamer_tree_cache_lookup (ob->writer_cache, expr, &ix);
> > > > if (existed_p)
> > > > {
> > > > @@ -2334,7 +2539,18 @@ copy_function_or_variable (struct symtab
> > > > gcc_assert (lto_tree_ref_encoder_size (encoder) == 0);
> > > > encoder->trees.reserve_exact (n);
> > > > for (j = 0; j < n; j++)
> > > > - encoder->trees.safe_push ((*trees)[j]);
> > > > + {
> > > > + tree t = (*trees)[j];
> > > > + if (TYPE_P (t))
> > > > + t = first_compatible_type_variant (t);
> > > > + if (streamer_dump_file)
> > > > + {
> > > > + print_node_brief (streamer_dump_file,
> > > > + " Adding reference to ", t, 4);
> > > > + fprintf (streamer_dump_file, " as %i \n", (int)j);
> > > > + }
> > > > + encoder->trees.safe_push (t);
> > > > + }
> > > > }
> > > >
> > > > lto_free_raw_section_data (file_data, LTO_section_function_body, name,
> > > > @@ -2507,6 +2723,8 @@ write_global_stream (struct output_block
> > > > print_node_brief (streamer_dump_file, "", t, 4);
> > > > fprintf (streamer_dump_file, "\n");
> > > > }
> > > > + gcc_checking_assert (!TYPE_P (t)
> > > > + || t == first_compatible_type_variant (t));
> > > > if (!streamer_tree_cache_lookup (ob->writer_cache, t, NULL))
> > > > stream_write_tree (ob, t, false);
> > > > }
> > > > @@ -2537,6 +2755,8 @@ write_global_references (struct output_b
> > > > t = lto_tree_ref_encoder_get_tree (encoder, index);
> > > > streamer_tree_cache_lookup (ob->writer_cache, t, &slot_num);
> > > > gcc_assert (slot_num != (unsigned)-1);
> > > > + gcc_checking_assert (!TYPE_P (t)
> > > > + || t == first_compatible_type_variant (t));
> > > > data[index + 1] = slot_num;
> > > > }
> > > >
> > > > Index: tree-ssa-alias.c
> > > > ===================================================================
> > > > --- tree-ssa-alias.c (revision 264689)
> > > > +++ tree-ssa-alias.c (working copy)
> > > > @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.
> > > > #include "tree-dfa.h"
> > > > #include "ipa-reference.h"
> > > > #include "varasm.h"
> > > > +#include "ipa-utils.h"
> > > >
> > > > /* Broad overview of how alias analysis on gimple works:
> > > >
> > > > @@ -955,6 +956,12 @@ nonoverlapping_component_refs_of_decl_p
> > > > ??? Bitfields can overlap at RTL level so punt on them. */
> > > > if (DECL_BIT_FIELD (field1) && DECL_BIT_FIELD (field2))
> > > > return false;
> > > > + /* In LTO we may end up merging different variants of ODR types
> > > > + but keep corresponding field decls unmerged. */
> > > > + if (in_lto_p
> > > > + && type_with_linkage_p (type1)
> > > > + && DECL_NAME (field1) == DECL_NAME (field2))
> > > > + return false;
> > > > return true;
> > > > }
> > > > }
> > > > @@ -1049,7 +1056,9 @@ nonoverlapping_component_refs_p (const_t
> > > > {
> > > > /* We're left with accessing different fields of a structure,
> > > > no possible overlap. */
> > > > - if (fieldx != fieldy)
> > > > + if (fieldx != fieldy
> > > > + && (in_lto_p || !type_with_linkage_p (typex)
> > > > + || DECL_NAME (fieldx) == DECL_NAME (fieldy)))
> > > > {
> > > > /* A field and its representative need to be considered the
> > > > same. */
> > > > Index: tree.c
> > > > ===================================================================
> > > > --- tree.c (revision 264689)
> > > > +++ tree.c (working copy)
> > > > @@ -5845,7 +5861,12 @@ free_lang_data_in_cgraph (void)
> > > >
> > > > /* Traverse every type found freeing its language data. */
> > > > FOR_EACH_VEC_ELT (fld.types, i, t)
> > > > - free_lang_data_in_type (t);
> > > > + {
> > > > + free_lang_data_in_type (t);
> > > > + while (TYPE_NEXT_VARIANT (t)
> > > > + && !fld.pset.contains (TYPE_NEXT_VARIANT (t)))
> > > > + TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (TYPE_NEXT_VARIANT (t));
> > > > + }
> > > > if (flag_checking)
> > > > {
> > > > FOR_EACH_VEC_ELT (fld.types, i, t)
> > > >
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> >
> >
>
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
More information about the Gcc-patches
mailing list