[RFC] Replace VRP with EVRP passes
Andrew MacLeod
amacleod@redhat.com
Thu Oct 14 12:46:35 GMT 2021
On 10/14/21 4:27 AM, Richard Biener wrote:
> On Wed, Oct 13, 2021 at 10:58 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>> As work has progressed, we're pretty close to being able to functionally
>> replace VRP with another EVRP pass. At least it seems close enough that
>> we should discuss if thats something we might want to consider for this
>> release. Replacing just one of the 2 VRP passes is another option.
>>
>> First, lets examine simplifications/folds.
>>
>> Running over my set of 380 GCC source files, we see the following
>> results for number of cases we currently get:
>>
>> Number of EVRP cases : 5789
>> Number of VRP1 cases : 4913
>> Number of VRP2 cases : 279
>> combined VRP1/2: 5192
>>
>> The 2 passes of VRP get a total of 5192 cases.
>>
>> If we run EVRP instead of each VRP pass, we get the following results:
>>
>> Number of EVRP1 cases : 5789
>> Number of EVRP2 cases : 7521
>> Number of EVRP3 cases : 2240
>> combined EVRP2/3: 9761
>>
>> so the EVRP passes find an additional 4569 opportunities, or 88% more.
>> This initially surprised me until it occurred to me that this is
>> probably due to the pruning require for ASSERT_EXPRs, which means it
>> never had a chance to see some of those cases. Notice how the second
>> pass appears far more effective now.
>>
>> Regarding what we would miss if we took VRP out, if we run EVRP passes
>> first then a VRP pass immediately after, we see what VRP finds that EVRP
>> cannot:
>>
>> Number of EVRP2 cases : 7521
>> Number of VRP1 cases : 11
>> Number of EVRP3 cases : 2269
>> Number of VRP2 cases : 54
>>
>> I have looked at some of these, and so far they all appear to be cases
>> which are solved via the iteration model VRP uses.
> Most should be now handled by using SCEV analysis on PHIs rather
> than VRP iteration so can you share an example where the iteration helps?
I didn't delve too deeply since I was skimming for whats missing, and
they all seemed to be in very large programs with cyclic phis feeding
each other.
I'll pick one shortly and do a deeper dive.
>
>> regardless, missing
>> 65 cases and getting 4569 new ones would seem to be a win. I will
>> continue to investigate them.
>>
>> == Performance ==
>>
>> The threading work has been pulled out of VRP, so we get a better idea
>> of what VRPs time really is. we're showing about a 58% slowdown in VRP
>> over the 2 passes. I've begun investigating because it shouldn't be off
>> by that much, Im seeing a lot of excess time being wasted with callback
>> queries from the substitute_and_fold engine when processing PHIs. It
>> should be possible to bring this all back in line as that isnt the model
>> ranger should be using anyway.
>>
>> I figured while I'm looking into the performance side of it, maybe we
>> should start talking about whether we want to replace one or both of the
>> VRP passes with an EVRP instance.
>>
>>
>> I see 3 primary options:
>> 1 - replace both VRP passes with EVRP instances.
>> 2 - replace VRP2 with EVRP2
>> 3 - Replace neither, leave it as is.
>>
>> I figure since the second pass of VRP doesn't get a lot to start with,
>> it probably doesn't make sense to replace VRP1 and not VRP2.
>>
>> Option 1 is what I would expect the strategic move next release to be,
>> it seems ready now, its just a matter of whether we want to give it more
>> time. It would also be trivial to turn VRP back on for one for both
>> later in the cycle if we determines there was something important missing.
>>
>> option 2 is something we ought to really consider if we don't want to do
>> option 1. There are a few PRs that are starting to open that have VRP
>> not getting something due to the whims of more precise mutli-ranges
>> being converted back to a value_range, and replacing VRP2 would allow us
>> to catch those.. plus, we pick up a lot more than VRP2 does.
>>
>> I would propose we add a param, similar to what EVRP has which will
>> allow us to choose which pass is called for VRP1 and VRP2 and set our
>> defaults appropriately. I wouldn't work with a hybrid like we did with
>> EVRP... just choose which pass runs. And we'll have to adjust some
>> testcases based one whatever our default is.
>>
>> Thoughts?
>>
>> Personally I think we give option 1 a go and if something shows up over
>> the next couple of months, or we cant get performance in line with
>> where we want it, then we can switch back to VRP for one or both
>> passes. I wouldn't expect either, but one never knows :-)
>>
>> If that isn't palatable for everyone, then I'd suggest option 2
> How far are you with handling the symbolic range cases VRP gets
> with the relation work? The symbolic range handling is important
> for Ada IIRC. I would of course have hoped the testsuite would
> catch important regressions there (did you test Ada?)
I always test ada, objc.. anything that can be built as part of my
normal build :-) so yes.
As far as I am aware, we are not missing any symbolic/relational cases,
that has all been functioning for a couple of months.
> I think option 2 would be safest at least so any important iteration/symbolic
> work is done early.
>
Certainly works as a good first step.
Andrew
More information about the Gcc-patches
mailing list