[PATCH] Optional alternative base_expr in finding basis for CAND_REFs

Wed Nov 13 22:30:00 GMT 2013

Hi Yufeng,

On Wed, 2013-11-13 at 19:32 +0000, Yufeng Zhang wrote:
> Hi Bill,
> 
> On 11/13/13 18:04, Bill Schmidt wrote:
> > Hi Yufeng,
> >
> > On Tue, 2013-11-12 at 22:34 +0000, Yufeng Zhang wrote:
> >> Hi Bill,
> >>
> >> Many thanks for the review.
> >>
> >> I find your suggestion on using the next_interp field quite
> >> enlightening.  I prepared a patch which adds changes without modifying
> >> the framework.  With the patch, the slsr pass now tries to create a
> >> second candidate for each memory accessing gimple statement, and chain
> >> it to the first one via the next_interp field.
> >>
> >> There are two implications in this approach though:
> >>
> >> 1) For each memory accessing gimple statement, there can be two
> >> candidates, and these two candidates can be part of different dependency
> >> graphs respectively (based on different base expr).  Only one of the
> >> dependency graph should be traversed to do replace_refs.  Most of the
> >> changes in the patch is to handle this implication.
> >>
> >> I am aware that you suggest to follow the next-interp chain only when
> >> the searching fails for the first interpretation.  However, that doesn't
> >> work very well, as it can result in worse code-gen.  Taking a varied
> >> form of the added test slsr-41.c for example:
> >>
> >> i1:  a2 [i] [j] = 1;
> >> i2:  a2 [i] [j+1] = 2;
> >> i3:  a2 [i+20] [j] = i;
> >>
> >> With the 2nd interpretation created conditionally, the following two
> >> dependency chains will be established:
> >>
> >>     i1 -->  i2  (base expr is an SSA_NAME defined as (a2 + i * 200))
> >>     i1 -->  i3  (base expr is a tree expression of (a2 + i * 200))
> >
> > So it seems to me that really what needs to happen is to unify those two
> > base_exprs.  We don't currently have logic in this pass to look up an
> > SSA name based on {base, index, stride, cand_type}, but that could be
> > done with a hash table.  For now to save processing time it would make
> > sense to only do that for MEM candidates, though the cand_type should be
> > included in the hash to allow this to be used for other candidate types
> > if necessary.  Of course, the SSA name definition must dominate the
> > candidate to be eligible as a basis, and that should be checked, but
> > this should generally be the case.
> 
> I'm not quite sure if the SSA_NAME look-up works; maybe I haven't fully 
> understood what you suggest.
> 
> For i1 --> i3, the base_expr is the tree expression (a2 + i * 200), 
> which is the result of a sequence of operations (conversion to affine, 
> immediate offset removal and conversion to tree), with another SSA_NAME 
> as the input.  In other words, there are two SSA_NAMEs involved in the 
> example:
> 
>    _s1: (a2 + i * 200).
>    _s2: (a2 + (i * 200 + 4000))
> 
> their strides and indexes are different.
> 
> I guess what you suggest is that given the tree expression (a2 + i * 
> 200), look up an SSA_NAME and return _s1.  If that is the case, the 
> challenge will be how to analyze the tree expression and get the 
> information on its {base, index, stride, cand_type}.  While it would be 
> too specific and narrative to check for a POINTER_PLUS_EXPR expression, 
> the existing framework (e.g. create_add_ssa_cand) seems to assume that 
> the analyzed tree represent a genuine gimple statement.
> 
> Moreover, there may not be an SSA_NAME exists, for example in the 
> following case:
> 
>    i1:  a2 [i+1] [j] = 1;
>    i2:  a2 [i+1] [j+1] = 2;
>    i3:  a2 [i+20] [j] = i;
> 
> you wouldn't be able to find an SSA_NAME for (a2 + i * 200).

Ok.  It is probably too much to hope for to get a sufficiently general
approach to handle all of these cases cleanly.

Bleah.  The whole preferred_ref_cand business seems very ad hoc to me,
and to some extent is closing the barn door after the cows have escaped.
Perhaps we can't use the next-interpretation infrastructure to solve
this problem ideally, in which case I apologize for leading you down
this path.  The alternate patch at least keeps the candidate tree in a
straightforward state, and the new version is less intrusive than the
original.

Let me look that version over more carefully and I'll get back to you.
Thanks for your patience.

Bill

> 
> [snip]
> > A couple of quick comments on the next_interp patch:
> >
> >   * You don't need num_of_dependents ().  You should be able to add a
> > forward declaration for count_candidates () and use it.
> 
> Missed count_candidates (); thanks!
> 
> >   * Your new test case is missing a final newline, so your patch doesn't
> > apply cleanly.
> 
> I'll fix it.
> 
> > Please look into unifying the base expressions, as I believe you should
> > not need the preferred_ref_cand logic if you do that.
> 
> I would also like to live without preferred_ref_cand if feasible . :)
> 
> > I still prefer the approach of using next_interp for its generality and
> > expandibility.
> 
> Sure; this approach indeed fit the framework better.
> 
> 
> Regards,
> Yufeng
>