Do less generous pointer globbing in alias.c

Richard Biener rguenther@suse.de
Wed May 27 10:18:00 GMT 2015


On Wed, 27 May 2015, Jan Hubicka wrote:

> Hi, this patch makes it possible for non-LTO alias oracle to TBAA 
> disambiguate pointer types. It makes void * conflicting with all of them 
> and does not put it to alias set 0. It also preserves the property that 
> qualifiers of pointer-to type should not matter to determine the alias 
> set and that pointer to array is same as pointer to array element.  
> Finally it makes pointer void * to be equivalent to void ** (and more *) 
> and to types with structural equality only.

void * should be equivalent to incomplete-type * as well.

> I think those are all globbing rules we discussed for the non-LTO patch.
> 
> It does two things.  First is kind of "canonicalization" where for a given pointer
> it looks for non-pointer pointed-to type and then rebuilds is without qualifiers.
> This is fast, because build_pointer_type will reuse existing types.
> 
> It makes void * to conflict with everyting by making its alias set to be subset
> of alias set of any other pointer.  This means that writes to void * conflict
> with writes to any other pointer without really need to glob all the pointers
> to one equivalence class.

I think you need to make each pointer alias-set a subset of the one of 
void * as well because both of the following is valid:

  *(void *)p = ...
  ... = *(int *)p;

and

  *(int *)p = ...
  ... = *(void *)p;

not sure if it's possible to create a testcase that fails if you do
subsetting only one-way (because alias_sets_conflict queries both
ways and I think alias_set_subset_of is not used very much, only
by tree-ssa-alias.c:aliasing_component_refs_p which won't ever
use it on two pointer alias sets).  In theory true vs. anti-dependence
should use alias_set_subset_of and trigger the above cases.  But
as those queries are done wrong a lot (in the past?) we use
alias_sets_conflict there.

For efficiency you could use a new flag similar to has_zero_child
in alias_set_entry_d ... 

More comments inline below

> This patch makes quite some difference on C++.  For example on deal II the TBAA
> stats reports 4344358 disambiguations and 7008576 queries, while with the patch
> we get 5368737 and 5687399 queries (I did not chose deal II for reason, it is
> just random C++ file)
> 
> The patch bootstrap and regtests ppc64le-linux with the following testsuite
> differences:
> @@ -30,7 +30,9 @@
>  FAIL: c-c++-common/asan/null-deref-1.c   -O3 -g  output pattern test, is ASAN:SIGSEGV
>  FAIL: c-c++-common/asan/null-deref-1.c   -Os  output pattern test, is ASAN:SIGSEGV
>  FAIL: gcc.dg/cpp/_Pragma3.c (test for excess errors)
> +XPASS: gcc.dg/alias-8.c  (test for warnings, line 11)
>  FAIL: gcc.dg/loop-8.c scan-rtl-dump-times loop2_invariant "Decided" 1
> +FAIL: gcc.dg/pr62167.c scan-tree-dump-not pre "Removing basic block"
>  FAIL: gcc.dg/sms-4.c scan-rtl-dump-times sms "SMS succeeded" 1
>  XPASS: gcc.dg/guality/example.c   -O0  execution test
>  XPASS: gcc.dg/guality/example.c   -O1  execution test
> @@ -304,6 +306,9 @@
>  FAIL: c-c++-common/asan/null-deref-1.c   -O3 -g  output pattern test, is ASAN:SIGSEGV
>  FAIL: g++.dg/cpp1y/vla-initlist1.C  -std=gnu++11 execution test
>  FAIL: g++.dg/cpp1y/vla-initlist1.C  -std=gnu++14 execution test
> +FAIL: g++.dg/ipa/ipa-icf-4.C  -std=gnu++11  scan-ipa-dump icf "Equal symbols: [67]"
> +FAIL: g++.dg/ipa/ipa-icf-4.C  -std=gnu++14  scan-ipa-dump icf "Equal symbols: [67]"
> +FAIL: g++.dg/ipa/ipa-icf-4.C  -std=gnu++98  scan-ipa-dump icf "Equal symbols: [67]"
> 
> ipa-icf-4 is about alias info now being more perceptive to block the merging.
> pr62167 seems just confused.  The template checks that memory stores are not
> unified.  It looks for BB removal message, but with the patch we get:
>   <bb 2>:
>   node.next = 0B;
>   head.0_4 = head;
>   node.prev = head.0_4;
>   head.0_4->first = &node;
>   k.1_7 = k;
>   h_8 = &heads[k.1_7];
>   heads[2].first = 0B;
>   if (head.0_4 == h_8)
>     goto <bb 3>;
>   else
>     goto <bb 5>;
> 
>   <bb 5>:
>   goto <bb 4>;
> 
>   <bb 3>:
>   p_10 = MEM[(struct head *)&heads][k.1_7].first;
> 
>   <bb 4>:
>   # p_1 = PHI <p_10(3), &node(5)>
>   _11 = p_1 != 0B;
>   _12 = (int) _11;
>   return _12;
> 
> before PR, the message is about the bb 5 sitting at critical edge removed.
> The TBAA incompatible load it looks for is optimized away by FRE:
>   head->first = &node;
> 
>   struct node *n = head->first;
> 
>   struct head *h = &heads[k];
> 
>   heads[2].first = n->next;
> 
>   if ((void*)n->prev == (void *)h)
>     p = h->first;
>   else
>     /* Dead tbaa-unsafe load from ((struct node *)&heads[2])->next.  */
>     p = n->prev->next;
> 
> here n is known to be head->first that is known to be &node.
> The testcase runtime checks that result is Ok and passes.
> 
> Bootstrapped/regtested ppc64le-linux.
> 
> 	* alias.c (get_alias_set): Do not glob all pointer types into one;
> 	just produce euqivalence classes based on canonical type of pointed
> 	type type; make void * equivalent to void **.
> 	(record_component_aliases): Make void * to conflict with all other
> 	pointer types.
> Index: alias.c
> ===================================================================
> --- alias.c	(revision 223633)
> +++ alias.c	(working copy)
> @@ -903,35 +906,79 @@ get_alias_set (tree t)
>       the pointed-to types.  This issue has been reported to the
>       C++ committee.
>  
> -     In addition to the above canonicalization issue, with LTO
> -     we should also canonicalize `T (*)[]' to `T *' avoiding
> -     alias issues with pointer-to element types and pointer-to
> -     array types.
> -
> -     Likewise we need to deal with the situation of incomplete
> -     pointed-to types and make `*(struct X **)&a' and
> -     `*(struct X {} **)&a' alias.  Otherwise we will have to
> -     guarantee that all pointer-to incomplete type variants
> -     will be replaced by pointer-to complete type variants if
> -     they are available.
> -
> -     With LTO the convenient situation of using `void *' to
> -     access and store any pointer type will also become
> -     more apparent (and `void *' is just another pointer-to
> -     incomplete type).  Assigning alias-set zero to `void *'
> -     and all pointer-to incomplete types is a not appealing
> -     solution.  Assigning an effective alias-set zero only
> -     affecting pointers might be - by recording proper subset
> -     relationships of all pointer alias-sets.
> -
> -     Pointer-to function types are another grey area which
> -     needs caution.  Globbing them all into one alias-set
> -     or the above effective zero set would work.
> -
> -     For now just assign the same alias-set to all pointers.
> -     That's simple and avoids all the above problems.  */
> -  else if (POINTER_TYPE_P (t)
> -	   && t != ptr_type_node)
> +     For this reason go to canonical type of the unqalified pointer type.
> +     Until GCC 6 this code set all pointers sets to have alias set of
> +     ptr_type_node but that is a bad idea, because it prevents disabiguations
> +     in between pointers.  For Firefox this accounts about 20% of all
> +     disambiguations in the program.  */
> +  else if (POINTER_TYPE_P (t) && t != ptr_type_node && !in_lto_p)
> +    {
> +      tree p;
> +      auto_vec <bool, 8> reference;
> +
> +      /* Unnest all pointers and references.
> +         We also want to make pointer to array equivalent to pointer to its
> +         element. So skip all array types, too.  */
> +      for (p = t; POINTER_TYPE_P (p)
> +	   || (TREE_CODE (p) == ARRAY_TYPE && !TYPE_NONALIASED_COMPONENT (p));

As we glob vector<int> to int we should look through VECTOR_TYPE as well
here.

> +	   p = TREE_TYPE (p))
> +	{
> +	  if (TREE_CODE (p) == REFERENCE_TYPE)
> +	    reference.safe_push (true);
> +	  if (TREE_CODE (p) == POINTER_TYPE)
> +	    reference.safe_push (false);
> +	}
> +      p = TYPE_MAIN_VARIANT (p);
> +
> +      /* Make void * compatible with char * and also void **.
> +	 Programs are commonly violating TBAA by this.
> +
> +	 We also make void * to conflict with every pointer
> +	 (see record_component_aliases) and thus it is safe it to use it for
> +	 pointers to types with TYPE_STRUCTURAL_EQUALITY_P.  */
> +      if (TREE_CODE (p) == VOID_TYPE || TYPE_STRUCTURAL_EQUALITY_P (p))

This should also conver incomplete types (and luckily void is incomplete).
I'm quite sure TYPE_STRUCTURAL_EQUALITY_P doesn't cover all incomplete
types.

> +	set = get_alias_set (ptr_type_node);
> +      else
> +	{
> +	  /* Rebuild pointer type from starting from canonical types using
> +	     unqualified pointers and references only.  This way all such
> +	     pointers will have the same alias set and will conflict with
> +	     each other.
> +
> +	     Most of time we already have pointers or references of a given type.
> +	     If not build new one just to be sure that if someone later (probably
> +	     only middle-end can, as we should assign all alias classes only after
> +	     finishing translation unit) builds the pointer type, the canonical type
> +	     will match.  */
> +	  p = TYPE_CANONICAL (p);
> +	  while (!reference.is_empty ())
> +	    {
> +	      if (reference.pop ())
> +		p = build_reference_type (p);
> +	      else
> +		p = build_pointer_type (p);
> +	      p = TYPE_CANONICAL (TYPE_MAIN_VARIANT (p));

err - the pointer you build should be a main variant already, and
canonical as well.  So the assert belongs here ;)

Btw, you should definitely glob POINTER_TYPE and REFERENCE_TYPE.

class T;
void foo (T **);

void bar(T *p)
{
  T*&r = p;
  foo (&r)
}

compiles fine (so foo can store T* to r).

Please verify that your code globs at least those pointer types
the former c-family get_alias_set langhook globbed.

> +	    }
> +          gcc_checking_assert (TYPE_CANONICAL (p) == p);
> +
> +	  /* Assign the alias set to both p and t.
> +	     We can not call get_alias_set (p) here as that would triger
> +	     infinite recursion when p == t.
> +	     In other cases it would just trigger unnecesary legwork of
> +	     rebuilding the pointer again.  */
> +	  if (TYPE_ALIAS_SET_KNOWN_P (p))
> +	    /* Return early; we do not need to record component aliases.  */
> +	    return TYPE_ALIAS_SET (t) = TYPE_ALIAS_SET (p);

Well, see below.  I don't like that coding-style btw, so just do

         set = TYPE_ALIAS_SET (p);

> +	  else
> +	    {
> +	      set = new_alias_set ();
> +	      TYPE_ALIAS_SET (p) = set;
> +	    }
> +	}
> +    }
> +  /* In LTO the rules above needs to be part of canonical type machinery.
> +     For now just punt.  */

I see no reason for punting for LTO here.

Btw, please check if SPEC perl still works without -fno-strict-aliasing
(it finally did after the change to do pointer globbing).

> +  else if (POINTER_TYPE_P (t) && t != ptr_type_node && in_lto_p)
>      set = get_alias_set (ptr_type_node);
>  
>    /* Otherwise make a new alias set for this type.  */
> @@ -950,7 +997,8 @@ get_alias_set (tree t)
>  
>    /* If this is an aggregate type or a complex type, we must record any
>       component aliasing information.  */
> -  if (AGGREGATE_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE)
> +  if (AGGREGATE_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE
> +      || TREE_CODE (t) == POINTER_TYPE)
>      record_component_aliases (t);

Err...
>  
>    return set;
> @@ -1050,6 +1098,26 @@ record_component_aliases (tree type)
>  
>    switch (TREE_CODE (type))
>      {
> +    /* We want void * to be compatible with any other pointer without really
> +       dropping it to alias set 0. Doing so would make it compatible with
> +       all non-pointer types too.
> +
> +       Make thus ptr_type_node to be a component of every pointer type.
> +       Tus memory operations of type "void *" conflict with operations of type
> +       "T *" without impossing a conflict with memory operations of type "Q *"
> +       in case T and Q do not conflict.
> + 
> +       This is not strictly necessary by the language standards, but avoids
> +       common type punning mistakes.  In addition to that, we need the existence
> +       of such universal pointer to implement Fortran's C_PTR type (which is
> +       defined as type compatible with all C pointers) and we use it in
> +       get_alias_set to give alias set to pointers to types without
> +       canonical types (those are typically nameless incomplete types)
> +       that needs to be also compatible with each other and with pointers to
> +       complete types.  */
> +    case POINTER_TYPE:
> +      record_alias_subset (superset, get_alias_set (ptr_type_node));
> +      break;

... I'd rather add this to the pointer handling in get_alias_set directly.
It's not really components.  Also see my comment about symmetry.

>      case RECORD_TYPE:
>      case UNION_TYPE:
>      case QUAL_UNION_TYPE:
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)



More information about the Gcc-patches mailing list