Optimize type streaming

Richard Biener rguenther@suse.de
Fri Jul 11 09:43:00 GMT 2014


On Fri, 11 Jul 2014, Jan Hubicka wrote:

> > 
> > Well - as we re-use the streamer cache to store the hash value, it isn't
> > really possible to do better ... (at least I don't have a clever idea)
> 
> OK, no cleverness on my side either.
> > 
> > Yeah (though you wouldn't need the extra hashing - we only need to know
> > the hash of the SCC).  The current scheme of iterating until no
> > duplicate hashes are found is bogus IMHO - better do the right thing
> > from the start or only iterate until at least _one_ hash is unique
> > (you get rid of the scc_entry_len > 1 case which can be quite expensive).
> 
> Agreed. I just got this implemented before I understood enough of the
> lto/lto.c side of the machinery - I assumed that identical nodes are
> compared deeply, but that would of course be expensive.
> 
> > Of course the question is whether we can always guarantee this property
> > ... if we can then the offline compare can be expanded to all trees
> > (it requires a stable sort of the SCCs independent of the SCC entry).
> 
> I do not think we need stable sorting here.  This is what I think the extra
> DFS walk should ensure - as soon as we have a unique entry (by identifying the
> first node w/o a conflict), the DFS walks in all units will give the same order.

Yes, that's what I call "stable sorting" of the SCC.
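
Sketched (toy code, not the GCC data structures - adjacency lists stand
in for tree operands, whose order is deterministic and the same in
every unit): once some member's hash is unique within the SCC, a DFS
from it reproduces the same member order everywhere.

#include <cstdint>
#include <map>
#include <vector>

/* Return the SCC members in a canonical order, or an empty vector
   if no member has a unique hash (the ambiguous case).  */
std::vector<int>
canonical_scc_order (const std::vector<std::vector<int>> &succs,
		     const std::vector<uint32_t> &hash)
{
  size_t n = succs.size ();
  std::map<uint32_t, int> uses;
  for (uint32_t h : hash)
    uses[h]++;

  /* Choose the entry by hash value, not by index - node numbering
     differs between units, hash values do not.  */
  int entry = -1;
  for (size_t i = 0; i < n; ++i)
    if (uses[hash[i]] == 1
	&& (entry < 0 || hash[i] < hash[entry]))
      entry = (int) i;
  if (entry < 0)
    return {};		/* no unique entry: fall back to old scheme */

  std::vector<int> order;
  std::vector<bool> seen (n, false);
  std::vector<int> stack = { entry };
  while (!stack.empty ())
    {
      int v = stack.back ();
      stack.pop_back ();
      if (seen[v])
	continue;
      seen[v] = true;
      order.push_back (v);
      /* Successor (operand) order is the same in all units, so pushing
	 in reverse keeps the visit order deterministic.  */
      for (auto it = succs[v].rbegin (); it != succs[v].rend (); ++it)
	if (!seen[*it])
	  stack.push_back (*it);
    }
  return order;
}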

> I do not think we can 100% guarantee it, but we can get very close to 100%.
> If we don't find a unique entry, I suppose we can stick with the current scheme,
> or just drop the complicated path with multiple entries and not merge in this
> case (or merge only when the entry happens to match by accident).

Yeah, I suppose we want some statistics here (how many times we have
to iterate - I once placed an assert to see whether we have
SCCs with a non-unique entry and there were none during LTO bootstrap
at least, with no iteration).
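
For concreteness, a self-contained toy of that iteration (not GCC code;
the graph and mixing function are made up, only the termination logic
mirrors the hash_scc loop in the patch below): rehash duplicate-hash
nodes against the previous round's hashes until the number of classes
stops growing.

#include <cstdint>
#include <set>
#include <vector>

static uint32_t
mix (uint32_t a, uint32_t b)
{
  return a ^ (b + 0x9e3779b9u + (a << 6) + (a >> 2));
}

/* Refine per-node hashes; return the number of distinct classes.  */
size_t
refine (const std::vector<std::vector<int>> &succs,
	std::vector<uint32_t> &hash)
{
  size_t last_classes = 0, iterations = 0;
  for (;;)
    {
      std::set<uint32_t> classes (hash.begin (), hash.end ());
      /* Done when all hashes are unique, refinement stalls, or we
	 give up after a fixed number of rounds.  */
      if (classes.size () == hash.size ()
	  || classes.size () <= last_classes
	  || iterations++ > 16)
	return classes.size ();
      last_classes = classes.size ();

      std::vector<uint32_t> next = hash;
      for (size_t i = 0; i < hash.size (); ++i)
	{
	  /* Rehash only nodes whose hash is duplicated, as the
	     patch does.  */
	  bool dup = false;
	  for (size_t j = 0; j < hash.size (); ++j)
	    if (j != i && hash[j] == hash[i])
	      {
		dup = true;
		break;
	      }
	  if (dup)
	    for (int s : succs[i])
	      next[i] = mix (next[i], hash[s]);
	}
      hash = next;
    }
}

On a fully symmetric SCC (say a 2-cycle with equal initial hashes) the
classes never split - both nodes see the same neighbourhood - which is
exactly the non-unique-entry case that needs a fallback.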

> > 
> > > This is because those otherwise equivalent trees are not really
> > > equivalent from a merging perspective. You test strict pointer equivalence
> > > outside SCC regions, so it matters whether the same tree is referred to
> > > twice or two equivalent copies are.
> > 
> > Hmm?  So you say you have two identical trees in the _same_ SCC?
> > How can that be?  Answer: not possible - they at least differ
> > in the edges forming the SCC (unless we have a cyclic list but
> > I don't think we (should) have that).
> 
> We hash only on outgoing SCC edges.  You can easily have a
> main variant type T and variants T1, T2 that are the same, both used by type
> T again.

Ok, but then the two variants shouldn't be in the same SCC or at
least not in the same SCC as the main variant.

> T1 and T2 differ by the references in T, but they will always have the same
> hash, and in theory they could be merged if your code worked recursively
> on nested SCC regions (as given by Tarjan's algorithm).  Because you work on
> outermost SCC regions only, we never merge T1 and T2.

I wonder what ties them into a SCC though.
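
Presumably something like this (a hypothetical C-level input; the
SCC-forming edges would be T's field types going out to the variants
and the variants' TYPE_MAIN_VARIANT links pointing back at T, glossing
over the intervening pointer nodes):

struct T;
typedef struct T T1;	/* variant node 1, main variant T */
typedef struct T T2;	/* variant node 2, structurally identical */
struct T
{
  T1 *a;	/* T -> T1, closing a cycle through T1's main variant */
  T2 *b;	/* T -> T2, likewise */
};

Hashing only outgoing edges, the hashes of T1 and T2 both mix in just
the hash of T, so they collide; they differ only in which field of T
refers back at them, i.e. in their incoming edges, which the hash
never sees.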

> The congruence you compute is not the best possible, but the best possible
> assuming no duplicates within compilation units.  I think this is sane;
> we should look towards eliminating such duplicates rather than working
> harder in the streamer.

Yeah.

> > 
> > Btw, if you re-hash only nodes with duplicate hashes, might there be
> > a higher probability of ending up with a hash that collides with
> > one that was not a duplicate?  That is, why not simply re-hash
> > all nodes?  (premature optimization...?)
> > 
> > So - can you iterate on this, re-doing the DFS walk with the
> > found entry node?
> 
> Yep, that is the plan: iterate long enough until I get a unique hash.  Can I
> easily re-do the DFS walk?

Not that easily...  you have to refactor things a bit, I fear ;)
DFS_write_tree is the main worker and the "init" is embedded
in lto_output_tree.  The main issue is that you can end up
with multiple SCCs from the start of the DFS walk.  So you'd
need to first collect all SCCs (but not write anything yet)
and then factor out a worker that processes them - you can
then iterate the DFS walk on the SCCs.

Quite some work eventually.  Not sure ;)
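
Roughly (a sketch only - collect_sccs, hash_scc_iterate, write_scc and
output_tree_sccs are made-up names; DFS_write_tree and lto_output_tree
are the existing entry points; tree nodes are reduced to ints to keep
the toy self-contained):

#include <cstdint>
#include <vector>

struct scc
{
  unsigned first;	/* index of the first member on the stack */
  unsigned size;	/* number of members */
};

/* Hypothetical phase 1: what DFS_write_tree's walk would be factored
   into - walk from EXPR and record the SCCs found, in postorder,
   without writing anything yet.  */
std::vector<scc> collect_sccs (int expr);

/* Hypothetical phase 2 workers; hash_scc_iterate may re-run the
   member DFS from the chosen unique entry.  */
uint32_t hash_scc_iterate (const scc &s);
void write_scc (const scc &s, uint32_t h);

/* The shape lto_output_tree could take after the refactoring.  */
void
output_tree_sccs (int expr)
{
  std::vector<scc> sccs = collect_sccs (expr);
  for (const scc &s : sccs)
    write_scc (s, hash_scc_iterate (s));
}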

Richard.

> Honza
> > 
> > Thanks,
> > Richard.
> > 
> > > Honza
> > > 
> > > 	* lto-streamer-out.c (hash_tree): Add map argument;
> > > 	remove special purpose hack for pointer types.
> > > 	(visit): Update it.
> > > 	(hash_scc): Add iteration until equivalence classes stabilize.
> > > Index: lto-streamer-out.c
> > > ===================================================================
> > > --- lto-streamer-out.c	(revision 212426)
> > > +++ lto-streamer-out.c	(working copy)
> > > @@ -690,13 +712,19 @@ DFS_write_tree_body (struct output_block
> > >  /* Return a hash value for the tree T.  */
> > >  
> > >  static hashval_t
> > > -hash_tree (struct streamer_tree_cache_d *cache, tree t)
> > > +hash_tree (struct streamer_tree_cache_d *cache, hash_map<tree, hashval_t> *map, tree t)
> > >  {
> > >  #define visit(SIBLING) \
> > >    do { \
> > >      unsigned ix; \
> > > -    if (SIBLING && streamer_tree_cache_lookup (cache, SIBLING, &ix)) \
> > > +    if (!SIBLING) \
> > > +      v = iterative_hash_hashval_t (0, v); \
> > > +    else if (streamer_tree_cache_lookup (cache, SIBLING, &ix)) \
> > >        v = iterative_hash_hashval_t (streamer_tree_cache_get_hash (cache, ix), v); \
> > > +    else if (map) \
> > > +      v = iterative_hash_hashval_t (*map->get (SIBLING), v); \
> > > +    else \
> > > +      v = iterative_hash_hashval_t (1, v); \
> > >    } while (0)
> > >  
> > >    /* Hash TS_BASE.  */
> > > @@ -896,23 +924,7 @@ hash_tree (struct streamer_tree_cache_d
> > >  
> > >    if (CODE_CONTAINS_STRUCT (code, TS_TYPED))
> > >      {
> > > -      if (POINTER_TYPE_P (t))
> > > -	{
> > > -	  /* For pointers factor in the pointed-to type recursively as
> > > -	     we cannot recurse through only pointers.
> > > -	     ???  We can generalize this by keeping track of the
> > > -	     in-SCC edges for each tree (or arbitrarily the first
> > > -	     such edge) and hashing that in in a second stage
> > > -	     (instead of the quadratic mixing of the SCC we do now).  */
> > > -	  hashval_t x;
> > > -	  unsigned ix;
> > > -	  if (streamer_tree_cache_lookup (cache, TREE_TYPE (t), &ix))
> > > -	    x = streamer_tree_cache_get_hash (cache, ix);
> > > -	  else
> > > -	    x = hash_tree (cache, TREE_TYPE (t));
> > > -	  v = iterative_hash_hashval_t (x, v);
> > > -	}
> > > -      else if (code != IDENTIFIER_NODE)
> > > +      if (code != IDENTIFIER_NODE)
> > >  	visit (TREE_TYPE (t));
> > >      }
> > >  
> > > @@ -1122,28 +1156,78 @@ scc_entry_compare (const void *p1_, cons
> > >  static hashval_t
> > >  hash_scc (struct streamer_tree_cache_d *cache, unsigned first, unsigned size)
> > >  {
> > > +  unsigned int last_classes = 0, iterations = 0;
> > > +
> > >    /* Compute hash values for the SCC members.  */
> > >    for (unsigned i = 0; i < size; ++i)
> > > -    sccstack[first+i].hash = hash_tree (cache, sccstack[first+i].t);
> > > +    sccstack[first+i].hash = hash_tree (cache, NULL, sccstack[first+i].t);
> > >  
> > >    if (size == 1)
> > >      return sccstack[first].hash;
> > >  
> > > -  /* Sort the SCC of type, hash pairs so that when we mix in
> > > -     all members of the SCC the hash value becomes independent on
> > > -     the order we visited the SCC.  Produce hash of the whole SCC as
> > > -     combination of hashes of individual elements.  Then combine that hash into
> > > -     hash of each element, so othewise identically looking elements from two
> > > -     different SCCs are distinguished.  */
> > > -  qsort (&sccstack[first], size, sizeof (scc_entry), scc_entry_compare);
> > > -
> > > -  hashval_t scc_hash = sccstack[first].hash;
> > > -  for (unsigned i = 1; i < size; ++i)
> > > -    scc_hash = iterative_hash_hashval_t (scc_hash,
> > > -					 sccstack[first+i].hash);
> > > -  for (unsigned i = 0; i < size; ++i)
> > > -    sccstack[first+i].hash = iterative_hash_hashval_t (sccstack[first+i].hash, scc_hash);
> > > -  return scc_hash;
> > > +  /* We may want to iterate multiple times across the SCC, propagating hash
> > > +     values across the internal edges to get different hashes for individual trees.  */
> > > +  do
> > > +    {
> > > +      /* Sort the SCC of type, hash pairs so that when we mix in
> > > +	 all members of the SCC the hash value becomes independent of
> > > +	 the order we visited the SCC.  Produce the hash of the whole SCC as
> > > +	 a combination of hashes of individual elements.  Then combine that hash
> > > +	 into the hash of each element, so otherwise identically looking elements
> > > +	 from two different SCCs are distinguished.  */
> > > +      qsort (&sccstack[first], size, sizeof (scc_entry), scc_entry_compare);
> > > +
> > > +      unsigned int classes = 1;
> > > +      hashval_t scc_hash = sccstack[first].hash;
> > > +      for (unsigned i = 1; i < size; ++i)
> > > +        {
> > > +          if (sccstack[first+i-1].hash != sccstack[first+i].hash)
> > > +	    classes++;
> > > +          scc_hash = iterative_hash_hashval_t (scc_hash,
> > > +					       sccstack[first+i].hash);
> > > +        }
> > > +
> > > +      /* If we have no duplicated entries, or we no longer get refinements,
> > > +	 we are done.  */
> > > +      if (classes <= last_classes || classes == size || iterations > 16)
> > > +	{
> > > +	  for (unsigned i = 0; i < size - 1;)
> > > +	    {
> > > +	      hashval_t hash = sccstack[first+i].hash;
> > > +
> > > +	      if (hash == sccstack[first+i+1].hash)
> > > +		for (;i < size && sccstack[first+i].hash == hash; i++)
> > > +		  debug_tree (sccstack[first+i].t);
> > > +	      else
> > > +		i++;
> > > +	    }
> > > +	  for (unsigned i = 0; i < size; ++i)
> > > +	    sccstack[first+i].hash = iterative_hash_hashval_t (sccstack[first+i].hash, scc_hash);
> > > +	  return scc_hash;
> > > +	}
> > > +
> > > +      /* Otherwise try to propagate hash values across the edges.  */
> > > +      last_classes = classes;
> > > +      iterations++;
> > > +      {
> > > +	hash_map <tree, hashval_t> map(size*2);
> > > +	for (unsigned i = 0; i < size; ++i)
> > > +	  map.put (sccstack[first+i].t, sccstack[first+i].hash);
> > > +
> > > +	for (unsigned i = 0; i < size - 1;)
> > > +	  {
> > > +	    hashval_t hash = sccstack[first+i].hash;
> > > +
> > > +	    /* Rehash only the entries that have duplicate hash values.  */
> > > +	    if (hash == sccstack[first+i+1].hash)
> > > +	      for (;i < size && sccstack[first+i].hash == hash; i++)
> > > +	        sccstack[first+i].hash = hash_tree (cache, &map, sccstack[first+i].t);
> > > +	    else
> > > +	      i++;
> > > +	  }
> > > +      }
> > > +    }
> > > +  while (true);
> > >  }
> > >  
> > >  /* DFS walk EXPR and stream SCCs of tree bodies if they are not
> > > 
> > > 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer


