Check canonical types in verify_type

Jan Hubicka hubicka@ucw.cz
Thu May 21 17:18:00 GMT 2015


> > Hmm, I see, interesting hack.  For the first part of comment, I see that
> > qualifiers needs to be ignored, but I do not see why we put
> > short * and int * pointers to same class.
> 
> For the reason that people are very lazy.  For example GCC has code
> punning void ** and void * and void * to Type * (and vice versa).  People
> just don't expect pointers to be in different alias sets (and there is
> little gained with doing that).

I think this observation may be out of date.  Removing that code (punting all
pointer to ptr_type_node) increase number of TBAA disambiguations for firefox
by 20%. It also drops number of queries so in reality the number is higher.

So it seems that it may be worth to track this in more sensible way.  This was
tested on LTO where most pointer alias sets are te same anyway, so the increase
should be bigger in non-LTO build. I will gather some stats.

The issue is that modern C++ code packs everything into instances, puts pointers
to instances everywhere and as wel inline everything into one glob, we could
track down a lot of data if we did not get lost in pointers.

We may want to have flag disabling that and we may want to throw away some
precision without throwing away all.

I tried to just trhow away the qualifier from pointer (i.e. make pointer to
qualified type to be globbed to pointer to unqualified type). This seems to
work in a way bootstrapping GCC with one bug (in ipa-icf) and passing
testsuite.
> 
> > The complete/incomplete type fun with LTO can be solved for C++ but indeed I
> > see why pointer to incomplete type needs to be considered compatible with every
> > other pointer to structure.  Can't this be dealt with by adding correct edges
> > into the aliasing dag?
> 
> I don't think so, without the dag degenerating.  We've had this mess
> before (complete vs. incomplete structs and so).  It didn't work very 
> well.

Can you point me to the code/issues, please?
> 
> I don't remember if I added a testcase, so no.  It's just very clear to me
> that the look of pointer members may be very different (consider 
> reference vs. pointer or pointer to array vs pointer to element).

Yep, depending on how much of cross-lanugage datastructures we want to permit.
We could definitely unpeel those differences just like I did with qualifiers.
However this trick is kind of weird because it does not propagate up to aggregates
buildt from these.

In a way some of this sould probably better be handled at canonical type
calculation time if we really want to permit mixing up aggregates with those
differences.

For example I think it would make sense to look throught ARRAY_TYPEs when
handling determining pointer canonical type (so pointer to array and pointer to
element are the same).  We also may ignore TREE_CODE match for POINTER_TYPE_P
if we really do have languages that mixes that.

reference types are used by Ada and Fortran (not Java), so it depends on how
these interface to C. I suppose you are right that these ought to be compatible
with C pointers.  I will dig into respective language standards.
> 
> > > > Other issue I run into is that for Ada bootstrap we have variadic type whose
> > > > canonical types are having different temporary set as size.  I think this is
> > > > valid and perhaps gimple_canonical_types_compatible_p should consider
> > > > variadic arrays to be compatible with any array of compatible type?
> > > 
> > > Those are all local types and thus the strict equality compare should be
> > > ok?  Not sure if we can do in C sth like
> > > 
> > > void foo (int n)
> > > {
> > >   struct { int i; int a[n]; } a;
> > >   struct { int i; int a[n]; } b;
> > > }
> > > 
> > > and if we'd have to assign the same alias-sets to 'a' and 'b'.
> > 
> > No idea here either. I wonder if the types are intedned to be TBAA compatible
> > across two calls of the function.  In that case we may introduce multiple copies
> > of body by early inlining already that may be a problem.
> 
> No idea.  Well, the canonical type machinery was supposed to be 
> conservative but here we're obviously not 100% conservative.

I will look into producing a testcases and lets see.
> 
> > > > I am not quite convinced we get variadic types right at LTO time, because
> > > > they bypass canonical type calculation anyway and their canonical type
> > > > is set by TYPE_MAIN_VARIANT in lto_read_body_or_constructor which I think
> > > > is not safe.  I will look for a testcase.
> > > 
> > > That is because if they are streamed locally they do not enter type
> > > merging, but they still go via gimple_register_canonical_type, so I'm not
> > > sure where you see they always get their main variant as canonical type.
> > 
> > I tought these are handled by:
> >       /* And fixup types we streamed locally.  */
> >         {
> >           struct streamer_tree_cache_d *cache = data_in->reader_cache;
> >           unsigned len = cache->nodes.length ();
> >           unsigned i;
> >           for (i = len; i-- > from;)
> >             {
> >               tree t = streamer_tree_cache_get_tree (cache, i);
> >               if (t == NULL_TREE)
> >                 continue;
> > 
> >               if (TYPE_P (t))
> >                 {
> >                   gcc_assert (TYPE_CANONICAL (t) == NULL_TREE);
> >                   TYPE_CANONICAL (t) = TYPE_MAIN_VARIANT (t);
> >                   if (TYPE_MAIN_VARIANT (t) != t)
> >                     {
> >                       gcc_assert (TYPE_NEXT_VARIANT (t) == NULL_TREE);
> >                       TYPE_NEXT_VARIANT (t)
> >                         = TYPE_NEXT_VARIANT (TYPE_MAIN_VARIANT (t));
> >                       TYPE_NEXT_VARIANT (TYPE_MAIN_VARIANT (t)) = t;
> >                     }
> >                 }
> >             }
> >         }
> > 
> > which does not do registering.
> 
> That's your code, not mine - but yes, it looks like local types are
> streamed when we stream the trees refering to them.  And the canonical

No, it is not mine ;) I only added the code copying fields to variants
and then reverted it per your request, so i suppose SVN blame me :).
 Probably the code comes from original LTO branch.

> type hash is probably teared down at this point.  The code should

Yep, but we can also keep it and just add the types - it is incremental
in its nature.

> probably call lto_fixup_prevailing_type to also fix 
> pointer-to/reference-to chains (and it will deal with main variants
> as well).  IMHO the canonical type setting here is ok (during inlining
> when we copy variable-length types we should end up sharing TYPE_CANONICAL
> IMHO).  AFAICS we do, via copy_node which doesn't reset TYPE_CANONICAL.

Yep, I think canonical types of arrays are merged by C FE if their TREE_TYPE
and TYPE_DOMAIN match and then they are just copied by inliner.
So variable length arrays from different copies of an inline functions are
considered compatible without LTO, while with LTO it depends on early
inliing... 
I am also bit worried about canonical types for C++ where these are
not always defined by their TYPE_MAIN_VARIANTs. I will look into this
in detail.

Honza



More information about the Gcc-patches mailing list