[PATCH] Add Ranger temporal cache
Wed Nov 4 18:12:28 GMT 2020
PR 97515 highlighted a bit of silliness that results when we calculate a
bunch of ranges by traversing a back edge, and set some values. Then we
eventually visit that block during the DOM walk, and discover the value
can be improved, sometimes dramatically. It is already cached, so
unfortunately we don't visit it again...
The situation is described in comment 4 :
I have created a temporal cache for the ranger that basically adds a
timestamp to the global cache.
The timestamp maintains the time a global range was calculated
(monotonically increasing based on "set") and a list of up to 2
directly dependent ssa-names that whenever we access the global value,
their timestamp is checked to see if they are newer. Any time the
global value for a dependent is newer, then the current global value is
considered stale and the ranger will recalculate using the newer values
of the dependent.
whats that mean?
Using the PR testcase, a back edge calculation for a PHI requires a
range for ui_8:
ui_8 = ~xe_3;
and at this time, we only know that xe_3 is <=0 based on the branch
feeding the statement.. This ui_8 is calculated as [-1, +INF] globally
As the caluclting comtniues, we actually discover that xe_3 has to be -1.
When the EVRp dom walk eventually gets to this statement, we know xe_3
evaluates to [-1, -1] and we fodl this statement to
ui_8 = -1
Unfortunately, the global cache still have [-1, +INF] for ui_8 and has
no way to know to reevalaute.
With the temporal cache operating, when we figure out that xe_3
evaluates to [-1,-1], xe_3 gets a timestamp that is newer than that of ui_8.
when range_of_stmt is called on ui_8 now, it fails the "current" check,
and the ranger proceeds to recalculate ui_8 using the new value of x_3,
and we get the proper result of [-1, -1] store for the global value.
With this patch, this testcase now comes out of EVRP looking like:
e7 (int gg)
<bb 2> :
<bb 3> :
_1 = 0;
_2 = (int) _1;
goto <bb 3>; [INV]
Time wise its pretty good. It basically consumes the time I saved with
the previous cache tweaks, and the overall time now is almost exactly
the same as it was before. So we're still pretty zippy.
This integrates with the previous cache changes so that when this
global value for ui_8 is updated, any changes are automatically
propagated into the global cache as well.
I have also updated the testcase to ensure that it now produces the
above code with a single goto.
This bootstraps on x86_64-pc-linux-gnu, no regressions, and pushed.
PS. Before the next stage 1, I intend to use the preexisting dependency
chains in the GORI engine instead of this one-or-two name timestamp entry.
Currentlt the drawback is that only dependent values are checked, so
intervening calculations will not trigger a recalculation. If we use
the GORI dependency chains, then everything in the dependency chain will
be recognized as stale, and we'll get even more cases. Combined with
improvements planned for how dependency chain ranges are calculated by
GORI, we could get even more interesting results.
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 11247 bytes
Desc: not available
More information about the Gcc-patches