On-Demand range technology [3/5] - The Prototype

Wed May 29 13:11:00 GMT 2019

On Tue, May 28, 2019 at 4:41 PM Jeff Law <law@redhat.com> wrote:
>
> On 5/23/19 7:10 AM, Richard Biener wrote:
> > On Thu, May 23, 2019 at 3:29 AM Andrew MacLeod <amacleod@redhat.com> wrote:
> >>
> >> There is a functioning prototype in branch “ssa-range” which is a proof
> >> of concept that the approach is functional as well as quick, and can be
> >> used to answer questions which come up regarding what it can and can’t
> >> do.  Our last merge was on April 13th, so it's fairly up to date.
> >>
> >> We have implemented a flexible range class (irange) which allows for
> >> multiple subranges to represent a range, and which can be extended in
> >> the future to types other than integral. We use this throughout, but it
> >> could be replaced in the ranger with any similar API. Conversion
> >> routines are also provided to convert from irange to value_range and
> >> value_range to irange.
> >>
> >> A full set of tree_code range-op routines are implemented.  We have
> >> commoned as much code as possible with the existing VRP range extraction
> >> code.  Also, we have added additional code for calculating the other
> >> operands from a known result in numerous cases.
> >>
> >> The code base in VRP has been modified (via a flag) to
> >>      - Work completely with the native value_range like it does today.
> >>      - Use irange and the range-ops component under the covers to
> >> extract ranges. Requests in VRP are then converted from value_ranges to
> >> irange, called into the range-op routines, and then converted back to
> >> value_range for VRP/EVRP’s use.
> >>      - Do operations both ways and compare the results to make sure both
> >> agree on the range, and trap if they do not.
> >>
> >> The branch defaults to the compare and trap mode to ensure everything is
> >> working correctly. This has been our correctness model for statement
> >> range extraction and was active during the Fedora package builds. The
> >> only time we disabled it was to do performance runs vs RVRP, and were
> >> looking at both branch and trunk times for EVRP and VRP.
> >>
> >> Of note, we drop all symbolics in ranges to VARYING on everything except
> >> PLUS and MINUS, which we leave as native calculations if there are
> >> symbolics present.  More on symbolics later.
> >>
> >> A VRP like pass called RVRP has been implemented.
> >>      - The vr-values statement simplification code has been factored out
> >> to be range agnostic, meaning that these routines can operate on either
> >> value_range or irange. Thus, we are using a common code base to perform
> >> statement simplification as well.
> >>      - For complete compatibility with EVRP, the RVRP pass builds
> >> dominators and instantiates the SCEV loop information so we have loop
> >> range info available. RVRP does not need this info to run, but would
> >> miss some of the test cases which depend on loop ranges.
> >>      - RVRP is set up to demonstrate it can process the IL in multiple
> >> directions and bootstraps/passes all tests in all directions.
> >>          * Dominator order
> >>          * Post-dominator order
> >>          * BB1 thru BBn
> >>          * BBn thru BB1
> >>          * branch-only mode where only branches at the end of each BB
> >> are examined for folding opportunities
> >>
> >> 4 additional passes have been converted to use the ranger model:
> >>      - sprintf - removed the dominator building/walking
> >>      - warn alloca - replaced calls to get global ranges with calls that
> >> now return context sensitive ranges.
> >>      - warn restrict - just replaced EVRP range calls with ranger calls.
> >>      - backwards threader - enhanced to use contextual range information
> >> to make additional threading decisions.
> >>
> >>
> >> Symbolic Ranges
> >>
> >> One big concern last year expressed was my decision to abolish symbolic
> >> ranges.
> >>
> >> I continue to maintain that there is no need to track the range of x_2
> >> as [y_3 + 5, MAX] for x_2 = y_3 + 5.   All you need to do is look at the
> >> definition of x_2, and the same information is exposed right there in
> >> the IL.  If one requires the symbolic information, the same on-demand
> >> process could lookup that information as well. This in turn, makes the
> >> code for ranges much simpler, easier to maintain, and less likely to
> >> introduce bugs.
> >>
> >> We have found through our prototype that symbolics in ranges are not
> >> nearly as prevalent as one would think.   During the work to share a
> >> common code base with VRP, we found that symbolic ranges are irrelevant
> >> for everything except PLUS_EXPR and MINUS_EXPR. The shared code in our
> >> branch drops symbolics to varying immediately for everything else, and
> >> it has no impact on EVRP, or VRP or any tests we can find anywhere.
> >> Furthermore, we never trapped while comparing ranges generated by VRP
> >> versus generating them with range-ops which drops symbolics to varying.
> >>
> >> We tried modifying VRP such that we don’t even create symbolic
> >> endpoints, but rather revert to VARYING always.  We can find no test
> >> case that fails because a range is not calculated properly due to
> >> resolving these endpoints.
> >>
> >> There are a few that fail due to the symbolic being used to help track
> >> relationals.. Ie
> >>
> >>       x_2 = y_3 + 5
> >>      If (x_2 > y_3)     // can be folded since we know x_2 must be < y_3
> >>
> >> VRP generates a range for x of  [ y_3+5, MAX ] and at various points
> >> uses that to infer a relational or equivalence.   Ie, it becomes easy to
> >> tell that the condition must always be true since the lower bound of the
> >> range is y_3 + 5.
> >>
> >> I argue this is not a range question, but rather a different problem
> >> which VRP has chosen to solve by piggybacking on the range
> >> representation.  This leads to complications/complexity when trying to
> >> evaluate ranges because they must constantly be on the lookout for
> >> symbolics.  This information is then carried around for the life of the
> >> pass, even if it is never used.  It also forces any
> >> relational/equivalency queries to be handled within the context of the
> >> VRP pass.
> >>
> >> This aspect of symbolics would be handled by a relational/equivalence
> >> processing engine that would be follow on work.  Using the same basic
> >> model as ranges, each tree code is taught to understand the relation
> >> between its operands, and then we can answer equivalency and relational
> >> accurately as well.  It would be available for any pass to use
> >> independent of ranges. I will expound upon that a bit in the future
> >> directions section.
> >
> > While I agree that symbolic ranges are a complication and that most
> > cases it currently handles are not "value-range" things I do not agree
> > with the idea that we can simply remove handling them and deal
> > with the fallout in some later point in the future.  Eric may also be
> > able to show you "real" symbolic value-range magic.
> >
> > Note that symbolic ranges are already restricted to PLUS_EXPR
> > and MINUS_EXPR (and NEGATE_EXPR I think).  There are
> > also "symbolic" (non-integer constant) ranges like [&a, &a].
> > I've never seen [&a, &MEM[&a + 4]] but we wouldn't reject those
> > I think.
> >
> > You may have noticed that EVRP does not use symbolic ranges.

Btw, I probably misremembered this part - it is equivalences that
EVRP doesn't use.  equivalences are the "other half" of relations
besides symbolic ranges.

> > As already said I'd like to see VRP go but obstackles are
> > symbolic ranges and jump-threading (with Jeff promising
> > to handle the jump-threading part in the past).
> Right.  FWIW, one of the follow-on items to this work is Aldy's
> improvements to backwards jump threading which utilizes the ranger
> framework -- the primary purpose of that work is to eliminate the need
> for jump threading in VRP.

But without relations it won't catch most of the cases...

Richard.

> jeff