[PATCH][RFC] Restrict, take 42

Richard Biener rguenther@suse.de
Wed Sep 3 13:00:00 GMT 2014


Ok, so with recent activity in that mgrid bug (PR55334) I tried
to remember what solution we thought of after determining that
ADD_RESTRICT is a no-go.

The following very prototypish patch implements the idea of
computing known non-dependences and maintaining them over
the compilation (mainly inlining / cloning for PR55334).

So the patch piggy-backs on PTA (bad - PTA doesn't compute
must-aliases, so it will work only for a very limited set
of testcases now - but at least it might work for
non-register restrict stuff).

The representation of "non-dependences" is the most interesting
bit of course.  We partition memory references into different
cliques in which all references are independent.  And we
split that clique again into sets of references based on the
same pointer.

That allows us to disambiguate equal clique but distinct pointer
memory references.

For restrict you'd put all(!) memory references in the BLOCK
where the restrict pointer is live into the same clique
and assign a distinct ptr based on which restrict pointer this
is based on.  You make all references not based on any
restrict pointer have ptr == 0 (not implemented in the prototype
patch - they don't get a clique assigned).

The patch simplifies this by taking function scope as the
only BLOCK to consider, thus inside a function the clique
will be unique (before inlining).

I can see issues arising with assigning numbers in the frontend
based on real BLOCKs and
  
  {
    int * restrict q;
    {
      int * restrict p;
      *p = *q;
    }
    {
      int * restrict r;
      *r = *q;
    }

that is, non-dependences based on pointers coming from different
nested scopes.  The FE in this case would need to "duplicate"
the ptr value for 'q' to not make *p and *r falsely a
non-dependence.  But I think we're not planning to assign
this in the C frontend family (maybe the Fortran FE though).

To preserve correctness cliques from inlined functions need to
be remapped to an unused clique number so struct function
gets a max_clique number.  (remapping not implemented in
the prototype)

Any correctness / missed-optimization holes to punch?  That is,
given a set of non-dependent reference pairs, can you assign
that clique, ptr values in a correct and optimal way?
(it somehow assumes transitivity, if the set is *a, *b; *b, *c;
then *a, *c are also non-dependent)

It all depends on a conservative but agressive must-be-based-on
analysis of course.  You'd meet that with the conservative
may-be-based-on analysis from PTA and compute the proper
non-dependences from that.

Comments welcome.

Patch tested on

static inline void foo (int * __restrict__ p, int * __restrict__ q)
{
  int i;
  for (i = 0; i < 1024; ++i)
    p[i] = q[i];
}

void bar (int *p, int *q)
{
  foo (p, q);
}

where it still vectorizes the loop in bar without alias check.

Richard.

Index: trunk/gcc/function.h
===================================================================
*** trunk.orig/gcc/function.h	2014-08-29 14:18:27.221364973 +0200
--- trunk/gcc/function.h	2014-09-03 13:15:11.433883630 +0200
*************** struct GTY(()) function {
*** 592,597 ****
--- 592,600 ----
       a string describing the reason for failure.  */
    const char * GTY((skip)) cannot_be_copied_reason;
  
+   /* Last assigned restrict UID.  */
+   unsigned short last_restrict_uid;
+ 
    /* Collected bit flags.  */
  
    /* Number of units of general registers that need saving in stdarg
Index: trunk/gcc/tree-inline.c
===================================================================
*** trunk.orig/gcc/tree-inline.c	2014-08-26 13:40:18.811368135 +0200
--- trunk/gcc/tree-inline.c	2014-09-03 14:15:14.792635543 +0200
*************** remap_gimple_op_r (tree *tp, int *walk_s
*** 929,934 ****
--- 929,937 ----
  	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
  	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
  	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
+ 	  /* ???  Properly remap the clique to a new one.  */
+ 	  if (old->base.u.version != 0)
+ 	    (*tp)->base.u.version = old->base.u.version;
  	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
  	     remapped a parameter as the property might be valid only
  	     for the parameter itself.  */
*************** copy_tree_body_r (tree *tp, int *walk_su
*** 1180,1185 ****
--- 1183,1191 ----
  	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
  	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
  	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
+ 	  /* ???  Properly remap the clique to a new one.  */
+ 	  if (old->base.u.version != 0)
+ 	    (*tp)->base.u.version = old->base.u.version;
  	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
  	     remapped a parameter as the property might be valid only
  	     for the parameter itself.  */
Index: trunk/gcc/tree-pretty-print.c
===================================================================
*** trunk.orig/gcc/tree-pretty-print.c	2014-08-04 11:50:13.421690694 +0200
--- trunk/gcc/tree-pretty-print.c	2014-09-03 14:04:41.816679122 +0200
*************** dump_generic_node (pretty_printer *buffe
*** 1071,1077 ****
  	    /* Same value types ignoring qualifiers.  */
  	    && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
  		== TYPE_MAIN_VARIANT
! 		    (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1))))))
  	  {
  	    if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
  	      {
--- 1071,1078 ----
  	    /* Same value types ignoring qualifiers.  */
  	    && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
  		== TYPE_MAIN_VARIANT
! 		    (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1)))))
! 	    && node->base.u.version == 0)
  	  {
  	    if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
  	      {
*************** dump_generic_node (pretty_printer *buffe
*** 1102,1107 ****
--- 1103,1115 ----
  		dump_generic_node (buffer, TREE_OPERAND (node, 1),
  				   spc, flags, false);
  	      }
+ 	    if (node->base.u.version != 0)
+ 	      {
+ 		pp_string (buffer, " clq ");
+ 		pp_unsigned_wide_integer (buffer, node->base.u.version >> 16);
+ 		pp_string (buffer, " ptr ");
+ 		pp_unsigned_wide_integer (buffer, node->base.u.version & 0xffff);
+ 	      }
  	    pp_right_bracket (buffer);
  	  }
  	break;
*************** dump_generic_node (pretty_printer *buffe
*** 1440,1446 ****
  		  /* Same value types ignoring qualifiers.  */
  		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
  		      == TYPE_MAIN_VARIANT
! 		          (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1))))))))
  	{
  	  op0 = TREE_OPERAND (op0, 0);
  	  str = "->";
--- 1448,1455 ----
  		  /* Same value types ignoring qualifiers.  */
  		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
  		      == TYPE_MAIN_VARIANT
! 		          (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1)))))
! 		  && op0->base.u.version == 0)))
  	{
  	  op0 = TREE_OPERAND (op0, 0);
  	  str = "->";
Index: trunk/gcc/tree-ssa-alias.c
===================================================================
*** trunk.orig/gcc/tree-ssa-alias.c	2014-08-26 13:40:16.710368279 +0200
--- trunk/gcc/tree-ssa-alias.c	2014-09-03 14:25:17.324594059 +0200
*************** refs_may_alias_p_1 (ao_ref *ref1, ao_ref
*** 1371,1376 ****
--- 1371,1406 ----
      return decl_refs_may_alias_p (ref1->ref, base1, offset1, max_size1,
  				  ref2->ref, base2, offset2, max_size2);
  
+   /* Handle restrict based accesses.
+      ???  ao_ref_base strips inner MEM_REF [&decl], recover from that
+      here.  */
+   tree rbase1 = base1;
+   tree rbase2 = base2;
+   if (var1_p)
+     {
+       rbase1 = ref1->ref;
+       if (rbase1)
+ 	while (handled_component_p (rbase1))
+ 	  rbase1 = TREE_OPERAND (rbase1, 0);
+     }
+   if (var2_p)
+     {
+       rbase2 = ref2->ref;
+       if (rbase2)
+ 	while (handled_component_p (rbase2))
+ 	  rbase2 = TREE_OPERAND (rbase2, 0);
+     }
+   if (rbase1 && rbase2
+       && TREE_CODE (base1) == MEM_REF
+       && TREE_CODE (base2) == MEM_REF
+       && base1->base.u.version != 0
+       && base2->base.u.version != 0
+       /* If the accesses are in the same restrict clique... */
+       && (base1->base.u.version >> 16) == (base2->base.u.version >> 16)
+       /* But based on different pointers they do not alias.  */
+       && (base1->base.u.version & 0xffff) != (base2->base.u.version & 0xffff))
+     return false;
+ 
    ind1_p = (TREE_CODE (base1) == MEM_REF
  	    || TREE_CODE (base1) == TARGET_MEM_REF);
    ind2_p = (TREE_CODE (base2) == MEM_REF
Index: trunk/gcc/tree-ssa-structalias.c
===================================================================
*** trunk.orig/gcc/tree-ssa-structalias.c	2014-09-02 10:13:58.150580778 +0200
--- trunk/gcc/tree-ssa-structalias.c	2014-09-03 14:23:10.540602788 +0200
***************
*** 52,57 ****
--- 52,59 ----
  #include "splay-tree.h"
  #include "params.h"
  #include "alias.h"
+ #include "tree-phinodes.h"
+ #include "ssa-iterators.h"
  
  /* The idea behind this analyzer is to generate set constraints from the
     program, then solve the resulting constraints in order to generate the
*************** struct variable_info
*** 272,283 ****
--- 274,292 ----
    /* True if this field has only restrict qualified pointers.  */
    unsigned int only_restrict_pointers : 1;
  
+   /* True if this represents a heap var created for a restrict qualified
+      pointer.  */
+   unsigned int is_restrict_var : 1;
+ 
    /* True if this represents a global variable.  */
    unsigned int is_global_var : 1;
  
    /* True if this represents a IPA function info.  */
    unsigned int is_fn_info : 1;
  
+   /* ???  Store somewhere better.  */
+   unsigned short ruid;
+ 
    /* The ID of the variable for the next field in this structure
       or zero for the last field in this structure.  */
    unsigned next;
*************** new_var_info (tree t, const char *name)
*** 369,374 ****
--- 378,384 ----
    ret->is_heap_var = false;
    ret->may_have_pointers = true;
    ret->only_restrict_pointers = false;
+   ret->is_restrict_var = false;
    ret->is_global_var = (t == NULL_TREE);
    ret->is_fn_info = false;
    if (t && DECL_P (t))
*************** static varinfo_t
*** 3773,3778 ****
--- 3783,3789 ----
  make_constraint_from_restrict (varinfo_t lhs, const char *name)
  {
    varinfo_t vi = make_heapvar (name);
+   vi->is_restrict_var = 1;
    vi->is_global_var = 1;
    vi->may_have_pointers = 1;
    make_constraint_from (lhs, vi->id);
*************** compute_may_aliases (void)
*** 6948,6953 ****
--- 6959,7046 ----
    if (dump_file)
      dump_alias_info (dump_file);
  
+     {
+       unsigned short clique = 0;
+       unsigned short last_ruid = 0;
+       for (unsigned i = 0; i < num_ssa_names; ++i)
+ 	{
+ 	  tree ptr = ssa_name (i);
+ 	  if (!ptr || !POINTER_TYPE_P (TREE_TYPE (ptr)))
+ 	    continue;
+ 
+ 	  /* Avoid all this when ptr is not dereferenced?  */
+ 	  tree p = ptr;
+ 	  if (SSA_NAME_IS_DEFAULT_DEF (ptr)
+ 	      && (TREE_CODE (SSA_NAME_VAR (ptr)) == PARM_DECL
+ 		  || TREE_CODE (SSA_NAME_VAR (ptr)) == RESULT_DECL))
+ 	    p = SSA_NAME_VAR (ptr);
+ 
+ 	  varinfo_t vi = get_varinfo (find (lookup_vi_for_tree (p)->id));
+ 	  bitmap_iterator bi;
+ 	  unsigned j;
+ 	  varinfo_t restrict_var = NULL;
+ 	  EXECUTE_IF_SET_IN_BITMAP (vi->solution, 0, j, bi)
+ 	    {
+ 	      varinfo_t oi = get_varinfo (j);
+ 	      if (oi->is_restrict_var)
+ 		{
+ 		  if (restrict_var)
+ 		    {
+ 		      restrict_var = NULL;
+ 		      break;
+ 		    }
+ 		  restrict_var = oi;
+ 		}
+ 	      /* NULL is the only other valid points-to entry.  */
+ 	      else if (oi->id != nothing_id)
+ 		{
+ 		  restrict_var = NULL;
+ 		  break;
+ 		}
+ 	    }
+ 	  /* Ok, found that ptr must(!) point to a single(!) restrict
+ 	     variable.  */
+ 	  /* ???  PTA isn't really a proper propagation engine to compute
+ 	     this property.
+ 	     ???  We can handle merging of two restricts
+ 	     by unifying them.  */
+ 	  if (restrict_var)
+ 	    {
+ 	      if (clique == 0)
+ 		clique = ++cfun->last_restrict_uid;
+ 	      if (restrict_var->ruid == 0)
+ 		restrict_var->ruid = ++last_ruid;
+ 
+ 	      /* Now look at possible dereferences of ptr.  */
+ 	      imm_use_iterator ui;
+ 	      gimple use_stmt;
+ 	      FOR_EACH_IMM_USE_STMT (use_stmt, ui, ptr)
+ 		{
+ 		  /* ???  Calls and asms.  */
+ 		  if (!gimple_assign_single_p (use_stmt))
+ 		    continue;
+ 		  tree *ref = gimple_assign_lhs_ptr (use_stmt);
+ 		  while (handled_component_p (*ref))
+ 		    ref = &TREE_OPERAND (*ref, 0);
+ 		  if (TREE_CODE (*ref) == MEM_REF
+ 		      && TREE_OPERAND (*ref, 0) == ptr)
+ 		    (*ref)->base.u.version = clique << 16 | restrict_var->ruid;
+ 		  ref = gimple_assign_rhs1_ptr (use_stmt);
+ 		  while (handled_component_p (*ref))
+ 		    ref = &TREE_OPERAND (*ref, 0);
+ 		  if (TREE_CODE (*ref) == MEM_REF
+ 		      && TREE_OPERAND (*ref, 0) == ptr)
+ 		    (*ref)->base.u.version = clique << 16 | restrict_var->ruid;
+ 		}
+ 
+ 	      /* ???  Accesses not based on a restrict pointer should
+ 	         all get into the clique but share a single ptr ID.
+ 		 That way they get disabiguated against restrict
+ 		 accesses but not against each other.  */
+ 	    }
+ 	}
+     }
+ 
    /* Deallocate memory used by aliasing data structures and the internal
       points-to solution.  */
    delete_points_to_sets ();
Index: trunk/gcc/tree-data-ref.c
===================================================================
*** trunk.orig/gcc/tree-data-ref.c	2014-08-14 14:38:53.735508565 +0200
--- trunk/gcc/tree-data-ref.c	2014-09-03 14:55:51.995467744 +0200
*************** dr_analyze_indices (struct data_referenc
*** 979,987 ****
--- 979,989 ----
  	     guaranteed.
  	     As a band-aid, mark the access so we can special-case
  	     it in dr_may_alias_p.  */
+ 	  tree old = ref;
  	  ref = fold_build2_loc (EXPR_LOCATION (ref),
  				 MEM_REF, TREE_TYPE (ref),
  				 base, memoff);
+ 	  ref->base.u.version = old->base.u.version;
  	  access_fns.safe_push (access_fn);
  	}
      }
*************** dr_may_alias_p (const struct data_refere
*** 1388,1393 ****
--- 1390,1403 ----
  	return false;
      }
  
+   if (TREE_CODE (addr_a) == MEM_REF
+       && TREE_CODE (addr_b) == MEM_REF
+       && addr_a->base.u.version != 0
+       && addr_b->base.u.version != 0
+       && (addr_a->base.u.version >> 16) == (addr_b->base.u.version >> 16)
+       && (addr_a->base.u.version & 0xffff) != (addr_b->base.u.version & 0xffff))
+     return false;
+ 
    /* If we had an evolution in a pointer-based MEM_REF BASE_OBJECT we
       do not know the size of the base-object.  So we cannot do any
       offset/overlap based analysis but have to rely on points-to



More information about the Gcc-patches mailing list