This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Structure aliasing analyzer


On Mon, 2005-05-09 at 16:50 -0400, Diego Novillo wrote:
> On Fri, Apr 29, 2005 at 02:12:08PM -0400, Daniel Berlin wrote:
> 
> > I've got a bunch of testcases to verify that the *results* of the
> > points-to analysis are correct, and i can add tests to verify that the
> > first alias analysis pass produces the results we want instead of it not
> > happening until the 4th or 5th, as it does now.
> > 
> These will be good to have.
> 
> Cool stuff.  We badly need this, but we still have a lot of
> things to do.  Finer alias info will make some of our
> representational problems even worse.  Without going into a pure
> oracle-based system like the RTL optimizers, we will need to move
> into a sort of hybrid that gives us the best of both worlds.
> Clearly, we can't keep exposing everything in the IL like we do
> today.
> 
> Some questions and obvious feedback below.  I'm still not totally
> done wrapping my brain around it all, so some of the questions
> will be about basic algorithmic stuff.
> 
> 
> - Timings for pass_build_pta on things like DLV, MICO, SPEC,
>   TRAMP and cc1-i-files?

SPEC had no time change in building anything (IE it was within the 1 or
2 second fluctuations you normally get in build time of SPEC).

I'll get numbers on the rest.


> 
> - It's a bit disappointing that we don't get better results.  If
>   we are getting good alias information right off the bat, the
>   scalar cleanups, redundancy elimination and such should be
>   getting better.  Any idea what's getting in the way?
Yes.

Without the infrastructure to say what structure fields point to, when
we propagate them through, we actually lose points-to info temporarily.

IE if you have

struct 
{
    int *a;
} c;

p = &c
a = p->a

and we knew what A pointed to exactly
DOM transforms it to

a = p.c

and we don't have points-to info for p.c, even though it's a structure
field pointer (well, the *analyzer* knows, but it hasn't told anything
else yet).

Also, intraprocedural aliasing is more or less a wash in most cases,
because almost everything depends on function arguments.

Disambiguating the p->a vs p->b is possible in some cases without
interprocedural info, and what the infrastructure i'm working on will
let us do.


and

> 
> - Regardless on how good we do on reducing memory footprint and
>   whatnot, I think we will always want to have different levels
>   of alias information.  Perhaps at -O1 we won't want to have
>   points-to information for structure fields.  
>From the analyzer perspective, it doesn't care, they are all just a
bunch of constraints :)  

>   
>   pass_build_pta needs to take that into account.  It's not so
>   much the time we spend in pass_build_pta, but the effects of
>   more detailed alias info on the rest of the optimizers.
> 
> - Can the implementation in its present form totally replace
>   compute_points_to_and_addr_escape? 

Not addr_escape, because i don't bother to compute what escapes calls,
but we could if you wanted to, without it being a burden.

>  If so, let's get rid of it.
>   No point keeping that hack around.  It's fine if you want to do
>   this in more than one stage, but I want to get rid of it for
>   4.1.
> 
> - More importantly, can we also subsume
>   compute_flow_insensitive_aliasing?

No, not yet.
Again, this is something i'm working on, and we will be hopefully be
able to.
It's a bit trickier :)

>   That would seriously rock.
>   This should also be done for 4.1.  I may be able to lend a hand
>   if you're pressed for time.
> 
> 
> 
> > +/* The idea behind this analyzer is to generate set constraints from the
> > +   program, then solve the resulting constraints in order to generate the
> > +   points-to sets. 
> > + [ ... ]
> >
> Add pointers to literature.

Will do.
> 
> > +  /* True if the address of this variable is taken.  Needed for
> > +     rountev-chandra.  */
> >       ^^^^^^^^^^^^^^^
> They get their due credit when we reference their paper.  Let's
> use the generic algorithm name, otherwise.

"Subsumed node compaction"?
:)

> 
> > +/* Variable that represents readonly memory.  */
> >
> s/readonly/read-only/

Fine.

> 
> > +/* Variable that represents integers.  */
> > +static varinfo_t var_integer;
> > +static tree integer_tree;
> > +static unsigned int integer_id;
> > +
> Why are integers important as a class here?

There was some sane reason i used to have, but i can't remember it
now :)


> 
> > +struct constraint_expr 
> > +{
> > +  /* Constraint type.  */
> > +  constraint_expr_type type;
> > +
> > +  /* Variable we are referring to in the constraint.  */
> > +  unsigned int var;
> > +
> > +  /* Offset is in bits */
> > +  unsigned HOST_WIDE_INT offset;
> >
> Offset of what?  Need to be more explicit here.

Will fix.
> 
> > +/* Constraints are made up of two constraint expressions, one LHS, and one
> > +   RHS.  */
> >
> Need to document what the constraint describes.

Will do.

> 
> > +struct constraint
> > +{
> > +  struct constraint_expr lhs;
> > +  struct constraint_expr rhs;
> > +};
> > +
> > +/* List of constraints that we use to build the constraint graph from.  */
> > +
> > +static VEC(constraint_t,gc) *constraints;
> > +static alloc_pool constraint_pool;
> > +
> > +/* An edge in the constraint graph.  We technically have no use for
> > +   the src, since it will always be the same node that we are indexing
> > +   into the pred/succ arrays with, but it's nice for checking
> > +   purposes.  The edges are weighted, with a bit set in weights for
> > +   each edge from src to dest with that weight.  */
> > +
> Document briefly how the weights are used and what they mean.
Will do.

> 
> > +/* The constraint graph is simply a set of adjacency vectors, one per
> > +   variable. succs[x] is the vector of successors for variable x, and preds[x]
> > +   is the vector of predecessors for variable x. 
> > +   IOW, all edges are "forward" edges, which is not like our CFG.  
> > +   So remember that
> > +   preds[x]->src == x, and
> > +   succs[x]->src == x*/
> > +
> Here's where you also document what the whole constraint graph
> and edges mean as a whole.  A code snippet with a couple of
> pointers and pointed-to variables and its corresponding
> constraint graph to illustrate will help.
> 
Will do.

> Particularly useful when following the pseudo-code for the
> constraint solver down below.  Having a good intuitive idea of
> what the graph means is very helpful.

Will do.
(I'll copy from my paper :P)
> 
> > +/* Print out constraint C to FILE.  */
> > +
> > +void
> > +print_constraint (FILE *file, constraint_t c)
> >
> s/print/dump/, for consistency with most of the other IL
> debugging stuff.

okay.

> 
> > +/* Print out constraint C to stdout.  */
> > +
> > +void
> > +debug_constraint (constraint_t c)
> > +{
> > +  print_constraint (stdout, c);
> >
> Better use 'stderr'.
Okay.

> 
> > +void
> > +print_constraints (FILE *file)
> > +{
> > +  int i;
> > +  constraint_t c;
> > +  for (i = 0; VEC_iterate (constraint_t, constraints, i, c); i++)
> > +    print_constraint (file, c);
> > +}
> > +
> > +/* Print out all constraints to stdout.  */
> > +
> > +void
> > +debug_constraints (void)
> > +{
> > +  print_constraints (stdout);
> > +}
> > +
> Likewise.
Okay.

> 
> > +/* Insert constraint C into the list of complex constraints for VAR.  */
> > +
> > +static void
> > +insert_into_complicated (unsigned int var, constraint_t c)
> >
> s/complicated/complex/

Okay.

> 
> > +static void
> > +clear_edges_for_node (constraint_graph_t graph, unsigned int node)
> > +{
> > +  VEC(constraint_edge_t,gc) *succvec = graph->succs[node];
> > +  VEC(constraint_edge_t,gc) *predvec = graph->preds[node];
> > +  constraint_edge_t c;
> > +  int i;
> > +  
> > +  /* Walk the successors, erase the associated preds.  */
> > +  for (i = 0; VEC_iterate (constraint_edge_t, succvec, i, c); i++)
> > +    if (c->dest != node)
> > +      {
> > +	unsigned int place;
> > +	struct constraint_edge lookfor;
> > +	lookfor.src = c->dest;
> > +	lookfor.dest = node;
> > +	place = VEC_lower_bound (constraint_edge_t, graph->preds[c->dest], 
> > +				 &lookfor, constraint_edge_less);
> > +	VEC_ordered_remove (constraint_edge_t, graph->preds[c->dest], place);
> > +      }
> >
> Vertical spacing.

I'm not quite sure what you mean.
It actually is indented properly in the source.
> This fw declaration is not needed.

Removed.

> 
> > +static bool add_graph_edge (constraint_graph_t, struct constraint_edge);
> > +static bitmap get_graph_weights (constraint_graph_t, struct constraint_edge);
> > +
> Can these two be moved here instead of being forward declared?
Yes.

> > +static void
> > +do_da_constraint (constraint_graph_t graph ATTRIBUTE_UNUSED,
> > +		  constraint_t c, bitmap delta)
> > +{
> > +  unsigned int rhs = c->rhs.var;
> > +  unsigned HOST_WIDE_INT offset = c->lhs.offset;
> > +  unsigned int j;
> > +  bitmap_iterator bi;
> > +
> > +  /* For each member j of Delta (Sol(x)), add x to Sol(j)  */
> > +  EXECUTE_IF_SET_IN_BITMAP (delta, 0, j, bi)
> > +    {
> > +      if (type_safe (j, &offset))
> > +	{
> > +	/* *x != NULL && *x != ANYTHING*/
> >
> Spacing.

Again, this must be a tabs issue, unless you are referring to something
else.

I'll tabify the whole file.
> 
> > +/* Compute a topographic order for GRAPH, and store the result in the
> > +   topo_info structure TI.  */
> > +
> > +static void 
> > +compute_topo_order (constraint_graph_t graph,
> > +		    struct topo_info *ti)
> >
> No line break.
Fixed.

> 
> > +/* Perform offline variable substitution.
> > +   
> > +   This is a linear time way of identifying variables that must have
> > +   equivalent points-to sets, including those caused by static cycles,
> > +   and single entry subgraphs, in the constraint graph.
> > +
> > +   The technique is described in "Off-line variable substitution for
> > +   scaling points-to analysis" by Atanas Rountev and Satish Chandra,
> > +   in "ACM SIGPLAN Notices" volume 35, number 5, pages 47-56.  */
> > +
> > +static void
> > +perform_rountev_chandra (constraint_graph_t graph)
> >
> s/perform_rountev_chandra/perform_var_subst/ or something more
> meaningful.  Likewise for all the other instances of
> rountev_chandra.

Done.

> > +  /* Globals are considered to be the equivalent of anything. We can
> > +     do significantly better than this without too much trouble, by    
> > +     saying a global points to anything, so that taking the address of
> > +     a global is fine.  */
> > +#if 0
> > +  if (DECL_P (t) 
> > +      && is_global_var (t)
> > +      && !TREE_READONLY (t))
> > +    {
> > +      cexpr.type = ADDRESSOF;
> > +      cexpr.var = anything_id;
> > +    }
> > +  else
> > +#endif
> >
> Why this #if 0?

Oh, until the close to the end before submitting, i was treating taking
the address of a global as pointing to anything, but i finally finished
implementing a better solution (taking the address of a global is fine,
deref'ing one gives you ANYTHING)


> > +  /* If the variable doesn't have subvars, we may end up needing to
> > +     create fake variables for the fields, and do a whole bunch of work  */
> > +  
> You may want to expand on what "whole bunch of work" means when
> we can't do subvars.

Okay.



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]