Enable TBAA on anonymous types with LTO

Mon Sep 29 15:36:00 GMT 2014

> 
> Why not just make all anonymous types their own canonical type?
> (of course considering type variants)

If C++ FE sets canonical type always to main variant, it should work.
Is it always the case? I noticed you do this for variadic types.
I tought there is reason why canonical types differ from main variants
and the way canonical types are built in FE seems more complex too, than
just setting CANONICAL=MAIN_VARIANT
> 
> >   2) update canonical type hash so it can deal with types that already have
> >      canonical type set.
> >      I insert even anonymous types there because I am not able to get rid
> >      of cases where non-anonmous type explicitly mentions anonymous. Consider:
> >       namespace {
> >       struct B {};
> >       }
> >       struct A
> >       {
> > 	void t(B);
> > 	void t2();
> >       };
> >       void
> >       A::t(B)
> >       {
> >       }
> >       void
> >       A::t2()
> >       {
> >       }
> >      Here we end up having type of method T non-anonymous but it builds from B that
> >      is anonymous.
> 
> But that makes B non-anonymous as well?  How is A::t mangled?  Consider 
> also the simpler case
> 
> struct A { struct B b; };

Yep, A seems to be not anonymous and mangled as A.  I think it is ODR violation
to declare such type in more than one compilation unit (and we will warn on
it). We can make it anonymous, but I think it is C++ FE to do so.
> 
> >      Being bale to handle non-upwards closed cases will be needed soon for full ODR
> >      type handling
> >   3) Disable tree merging of anonymous namespace nodes and anonymous types.  The second
> >      is needed, because I can have two identically looking anonymous types from
> >      same unit with different canonical types.
> 
> But the container should be distinct?  Isn't this again the issue that
> we merge anonymous namespace decls?  Please try to fix that one and 
> forall.

I already made anonymous namespaces unmerged in this patch. But still I saw
cases where two anonymous namespace types differed only by TYPE_CANONICAL and
had DFS components of different size.  Because we do not compare TYPE_CANONICAl
we declared them equivalent and DFS merging ICEd.
> 
> >      This may go away once we get some ability to decide on unmergability at
> >      stream out time.
> > 
> > I do not attept to merge anonymous types with structurally equivalent
> > non-anonymous types from other compilation units.  I think it is nature of C++
> > language that types in anonymous namespaces can not be accessed by other units
> > and I hope to use this for other optimizations, too.
> 
> What about cross-language LTO?  With your scheme you say that you can't
> ever interoperate using "anonymous" entities (even if used from a
> non-anonymous one like in the examples above)?
> 
> I think that's a dangerous route to go.  Maybe detect the case where
> we compile from multiple source languages and behave differently?

I really think that anonymous types are meant to not be accessible from other
compilation unit and I do not see why other languages need different rule.

Of course for proper ODR types we need to solve how to merge non-C++ types with
them.  So I can do that for anonymous types, too.  I had to revisit my original
plan for canonical types for ODR.  I originally just insterted ODR types into
ODR hash as well as canonical type hash (without killing their canonical type).
While doing so I recordedif given canonical type has non-ODR type in it.
Finally I went through ODR types and updated their canonical types by canonical
hash if the conflict happen.

This does not work for types build from ODR types that arenot ODR themselves.
My plan is:
  1) fork current canonical_type hash into two - structural_type_hash and
     canonical_type_hash
  2) During the streaming in, I populate structural_type_hash and the existing
     odr_hash.  I record what structural type hash buckets contains non-ODR
     type
  3) During the existing loop that recomputes canonical type I start populating
     canonical type hash.  It works same way as structural except for ODR
     types that do not conflict. THose are considered by ODR equality.
> 
> > We can add documentation about this to -fstrict-aliasing section of 
> > manual I guess.
> > 
> > What I am concerned about is the needed change in c-decl.c.  C frontend currently
> > outputs declarations that are confused by type_in_anonymous_namespace_p as anonymous
> > in some cases.  This is because it does not set PUBLIC flag on TYPE decl.  This is
> > bug:
> > /* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL,
> >    nonzero means name is to be accessible from outside this translation unit.
> >    In an IDENTIFIER_NODE, nonzero means an external declaration
> >    accessible from outside this translation unit was previously seen
> >    for this name in an inner scope.  */
> > #define TREE_PUBLIC(NODE) ((NODE)->base.public_flag)
> > 
> > This fortunately manifests itself as false warnings about type incompatiblity
> > from lto-symtab.  I did not see these with other languages, but I suppose we will
> > want to check that other FEs are behaving correctly here.
> > I do not know how Ada and Fortran should behave here.
> > 
> > Bootstrapped/regtested x86_64-linux, lto-bootstrapped, tested with Firefox and
> > libreoffice. I also checked that tree merging is working still well. OK?
> 
> Not in this form, we have to discuss this further.  It's way too agressive
> in my view.

Yep, I expected that, but we need to start somewhere ;))
> 
> I don't like changing the hashing/registering machinery this way.
> 
> Can you please introduce a function that enters both hash and type
> into the tables (asserting they are not already there) and call that
> explicitely for anonymous types from lto_read_decls ...

OK, so having something like register_odr_type doing both?
Of course multiple ODR types may have same canonical type hash, so I can not
just assert they are not already there.

I believe the TYPE_CANONICAl check originally prevented ICE in the case
where strongly connected component is ordered in a way so registering earlier
type into canonical type hash makes it recursively regiser later type.
So it may happen that the type is already registered at a time lto_read_decls
walks acorss it.
> >  	compare_tree_edges (TYPE_CONTEXT (t1), TYPE_CONTEXT (t2));
> > -      /* TYPE_CANONICAL is re-computed during type merging, so do not
> > -	 compare it here.  */
> > +      /* TYPE_CANONICAL of non-anonymous types is re-computed during type
> > + 	 merging, so do not compare it here.  */
> >        compare_tree_edges (TYPE_STUB_DECL (t1), TYPE_STUB_DECL (t2));
> > +
> > +      /* Two anonymous namespace types from the same unit still should
> > +	 not be unified.  */
> > +      if (type_in_anonymous_namespace_p (t1))
> > +	return false;
> 
> Well, then exempt them from the merging process?  And compute the
> property properly by also treating compositing with anon types
> as anonymous.  (the patch somewhat feels like being the 2nd in a
> series of two ...)  Didn't you have patches to do that during
> SCC discovery at compile-time?  What happened to them?

Yes, I had patches for that. You told you will think about best implementatoin
later.  Later did not seem to happen yet :)
https://gcc.gnu.org/ml/gcc-patches/2014-07/msg00614.html
seems to be end of the thread.

Yes, having that patch makes it possible to dropthese checks.
> 
> >      }
> >  
> >    if (CODE_CONTAINS_STRUCT (code, TS_TYPE_NON_COMMON))
> > @@ -1909,9 +1947,7 @@ lto_read_decls (struct lto_file_decl_dat
> >  		  num_prevailing_types++;
> >  		  lto_fixup_prevailing_type (t);
> >  		}
> > -	      /* Compute the canonical type of all types.
> > -		 ???  Should be able to assert that !TYPE_CANONICAL.  */
> > -	      if (TYPE_P (t) && !TYPE_CANONICAL (t))
> > +	      if (TYPE_P (t))
> 
> ... here as else { ...?

I am not sure what you mean here?
> 
> >  		{
> >  		  gimple_register_canonical_type (t);
> >  		  if (odr_type_p (t))
> > Index: tree-streamer-out.c
> > ===================================================================
> > --- tree-streamer-out.c	(revision 215645)
> > +++ tree-streamer-out.c	(working copy)
> > @@ -708,8 +708,13 @@ write_ts_type_common_tree_pointers (stru
> >       during fixup.  */
> >    stream_write_tree (ob, TYPE_MAIN_VARIANT (expr), ref_p);
> >    stream_write_tree (ob, TYPE_CONTEXT (expr), ref_p);
> > -  /* TYPE_CANONICAL is re-computed during type merging, so no need
> > -     to stream it here.  */
> > +  /* For non-anonymous types, TYPE_CANONICAL is re-computed during type
> > +     merging, so no need to follow it here.  */
> > +  if (TYPE_CANONICAL (expr)
> > +      && type_in_anonymous_namespace_p (TYPE_CANONICAL (expr)))
> > +    stream_write_tree (ob, TYPE_CANONICAL (expr), ref_p);
> > +  else
> > +    stream_write_tree (ob, NULL_TREE, ref_p);
> 
> We can compute type_in_anonymous_namespace_p in lto_read_decls
> where we can simply set TYPE_CANONICAL (t) = TYPE_MAIN_VARIANT (t).
> That's more light-weight on streaming.

Yes, if that it is always correct canonical type, we could do it.  Just go with a flag.
> 
> >    stream_write_tree (ob, TYPE_STUB_DECL (expr), ref_p);
> >  }
> >  
> > Index: lto-streamer-in.c
> > ===================================================================
> > --- lto-streamer-in.c	(revision 215645)
> > +++ lto-streamer-in.c	(working copy)
> > @@ -1104,8 +1104,8 @@ lto_read_body_or_constructor (struct lto
> >  
> >  	      if (TYPE_P (t))
> >  		{
> > -		  gcc_assert (TYPE_CANONICAL (t) == NULL_TREE);
> > -		  TYPE_CANONICAL (t) = TYPE_MAIN_VARIANT (t);
> > +		  if (TYPE_CANONICAL (t))
> > +		    TYPE_CANONICAL (t) = TYPE_CANONICAL (TYPE_CANONICAL (t));
> 
> Eh??!!  One more reason to not trust broken canonical types from the
> C++ FE (the FE uses this for its very own thing, not really TBAA).

Sorry, this is a forgotten hunk.  I was originally streaming TYPE_CANONICAL
of all anonymous namespace type and was wondering about scenario where
TYPE_CANONICAL of anonymous namespace is not anonymous and thus structurally
merged. For such types we would compute new TYPE_CANONICAL and introduce this bug.
I later asserted that this never happens, but forgot it here.

Honza
> 
> >  		  if (TYPE_MAIN_VARIANT (t) != t)
> >  		    {
> >  		      gcc_assert (TYPE_NEXT_VARIANT (t) == NULL_TREE);
> > Index: tree-streamer-in.c
> > ===================================================================
> > --- tree-streamer-in.c	(revision 215645)
> > +++ tree-streamer-in.c	(working copy)
> > @@ -804,8 +804,7 @@ lto_input_ts_type_common_tree_pointers (
> >       during fixup.  */
> >    TYPE_MAIN_VARIANT (expr) = stream_read_tree (ib, data_in);
> >    TYPE_CONTEXT (expr) = stream_read_tree (ib, data_in);
> > -  /* TYPE_CANONICAL gets re-computed during type merging.  */
> > -  TYPE_CANONICAL (expr) = NULL_TREE;
> > +  TYPE_CANONICAL (expr) = stream_read_tree (ib, data_in);
> >    TYPE_STUB_DECL (expr) = stream_read_tree (ib, data_in);
> >  }
> >  
> > 
> > 
> 
> -- 
> Richard Biener <rguenther@suse.de>
> SUSE / SUSE Labs
> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer