On-Demand range technology [1/5] - Executive Summary

Thu May 23 01:28:00 GMT 2019

Now that stage 1 has reopened, Iâ€™d like to reopen a discussion about the 
technology and experiences we have from the Ranger project I brought up 
last year. https://gcc.gnu.org/ml/gcc/2018-05/msg00288.html .Â  (The 
original wiki pages are now out of date, and I will work on updating 
them soon.)

The Ranger is designed to evaluate ranges on-demand rather than through 
a top-down approach. This means you can ask for a range from anywhere, 
and it walks back thru the IL satisfying any preconditions and doing the 
required calculations.Â  It utilizes a cache to avoid re-doing work. If 
ranges are processed in a forward dominator order, itâ€™s not much 
different than what we do today. Due to its nature, the order you 
process things in has minimal impact on the overall timeâ€¦ You can do it 
in reverse dominator order and get similar times.

It requires no outside preconditions (such as dominators) to work, and 
has a very simple APIâ€¦ Simply query the range of an ssa_name at any 
point in the IL and all the details are taken care of.

We have spent much of the past 6 months refining the prototype (branch 
â€œssa-rangeâ€) and adjusting it to share as much code with VRP as 
possible. They are currently using a common code base for extracting 
ranges from statements, as well as simplifying statements.

The Ranger deals with justÂ  ranges. The other aspects ofÂ  VRP are 
intended to be follow on work that integrates tightly with it, but are 
also independent and would be available for other passes to use.Â  These 
include:
- Equivalency tracking
- Relational processing
- Bitmask tracking

We have implemented a VRP pass that duplicates the functionality of EVRP 
(other than the bits mentioned above), as well as converted a few other 
passes to use the interface.. I do not anticipate those missing bits 
having a significant impact on the results.

The prototype branch it quite stable and can successfully build and test 
an entire Fedora distribution (9174 packages). There is an issue with 
switches I will discuss later whereby the constant range of a switch 
edge is not readily available and is exponentially expensive to 
calculate. We have a design to address that problem, and in the common 
case we are about 20% faster than EVRP is.

When utilized in passes which only require ranges for a small number of 
ssa-names we see significant improvements.Â  The sprintf warning pass for 
instance allows us to remove the calculations of dominators and the 
resulting forced walk order. We see a 95% speedup (yes, 1/20th of the 
overall time!).Â  This is primarily due to no additional overhead and 
only calculating the few things that are actually needed.Â  The walloca 
and wrestrict passes are a similar model, but as they have not been 
converted to use EVRP ranges yet, we donâ€™t see similar speedups there.

That is the executive summary.Â  I will go into more details of each 
major thing mentioned in follow on notes so that comments and 
discussions can focus on one thing at a time.

We think this approach is very solid and has many significant benefits 
to GCC. Weâ€™d like to address any concerns you may have, and work towards 
finding a way to integrate this model with the code base during this 
stage 1.

Comments and feedback always welcome!
Thanks
Andrew