This is the mail archive of the
mailing list for the GCC project.
RE: Live Range Splitting in Integrated Register Allocator
- From: Ajit Kumar Agarwal <ajit dot kumar dot agarwal at xilinx dot com>
- To: Vladimir Makarov <vmakarov at redhat dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Cc: Michael Eager <eager at eagercon dot com>, Vinod Kathail <vinodk at xilinx dot com>, Vidhumouli Hunsigida <vidhum at xilinx dot com>, Nagaraju Mekala <nmekala at xilinx dot com>
- Date: Thu, 15 May 2014 15:52:18 +0000
- Subject: RE: Live Range Splitting in Integrated Register Allocator
- Authentication-results: sourceware.org; auth=none
- Authentication-results: spf=pass (sender IP is 126.96.36.199) smtp dot mailfrom=ajit dot kumar dot agarwal at xilinx dot com;
- References: <0c7a1ace-d040-4c6f-a66a-298b44ea89aa at BN1BFFO11FD057 dot protection dot gbl> <53743A50 dot 1060207 at redhat dot com> <97998782-f72a-4af3-bb4d-840dc32bda03 at BN1BFFO11FD054 dot protection dot gbl> <5374D8A0 dot 2050001 at redhat dot com>
Thanks Vladimir for the clarification.
Thanks & Regards
From: Vladimir Makarov [mailto:email@example.com]
Sent: Thursday, May 15, 2014 8:39 PM
To: Ajit Kumar Agarwal; firstname.lastname@example.org
Cc: Michael Eager; Vinod Kathail; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Live Range Splitting in Integrated Register Allocator
On 05/15/2014 03:28 AM, Ajit Kumar Agarwal wrote:
> On 2014-05-14, 1:33 PM, Ajit Kumar Agarwal wrote:
>> Hello All:
>> I am planning to implement the Live range splitting based on the following cases in the Integrated Register Allocator.
>> For a given Live range that spans from from outer region to inner region of the loop. Such Live ranges which are LiveIn at the entry of the header of the Loop and Live Out at the exit of the loop but there are no references inside the Loop. Such Live ranges lead to unoptimal spill and fetch inside the Loop conflicting with the shorter live ranges that spans inside the Loop.
>> Lets say such Live range as L1. L1 can be splitted at the Loop Boundary splitting the Live range by making a store at the header of the Loop and the Load at the exit of the Loop. This makes the Live range less conflicting with the Live ranges that are local to the Loop regions reducing the spill and Fetch inside the Loops.
>> From the code and documentation of Integrated Register Allocator following is the understanding.
>> As Live range L1 is live in the outer region but as there are no reference inside the Loop region. Since the allocno for L1 for a given variable v is assigned two allocno v1 and v2 . V1 being assigned allocno for the outer region and v2 as allocno for the inner Loop region. This allows to accumulate the information from the inner loop region to outer region.
>> Will the current Integrated Register Allocator will consider the Live range L1 as Live inside the Loop and outer region? If Yes then there will be conflicting with the Live ranges that are local to the Loop region leading to spill and fetch inside the Loop. If the v1 and v2 allocno are created v1 for the outer region and v2 for the inner region then there will v2 will be conflicting the local live ranges inside the Loop region and v1 will be conflicting with the Live ranges of the outer regions. This is how its been considered as Live range splitting at the Loop Boundary for the Live range that spans inside the Loop but not not being referenced?
>> If Such cases are not being considered in the Integrated Register Allocator, then it will be useful to implement such cases in IRA which will be benefitted the microblaze target.
>> Please let me know what do you think.
>>> Allocno v2 corresponding to live range inside the loop has a very small cost for spilling therefore it will be spilled if we still need registers to pseudos local to the loop. If allocno v1 >>corresponding live ranges outside the loop *and* inside the loop gets a hard register, we will have live range splitting as you propose. So I do not see a necessity for the optimization >>you propose.
> If the allocno v1 does n't get the hard register and become the candidate of spill which is Live that spans the Loop but not reference inside the Loop, it will lead to spills inside the loop which become bottleneck as performance is concerned. Does this case also handled in the IRA? With my proposal the load and store will be moved outside the loop instead of having it inside the Loop for such case.
If v1 does not get a hard register (and v2 most probably too), than there will be no any additional insns because v1 and v2 most probably gets the same memory. If v1 gets a hard register and v2 does not, then we have store on the loop enter and loads on the loop exits (the store/loads are not inside the loop).
>>> Moreover my experience shows that making a lot of explicit transformations (e.g. proposed splitting) even if we have transformations to undo them (e.g. coalescing) results in worse >>code.
>>> The explicit transformations should be as minimal as possible during RA in a good register allocator. So I guess the optimization you propose will actually results in a worse code. >>Although I might be wrong because it is always hard to predict the result of heuristic optimizations.
> There is a paper on Improved Passive Splitting which does take care of Splitting at Loop boundaries for the Live range that spans the Loop but not referenced.
> Improved Passive Splitting (2005) by Keith D. Cooper , Jason Eckhardt. This paper uses the heuristics and claim to reduce the spill and fetch for the case I propose. Also claims to generate the better code.
I read this article long ago (actually I worked with Jason for a few years until he left us to Rice University post-graduated school).
Usually heuristic optimizations (including RA) are very sensitive to compiler environment and overall design. So the effect of algorithm could be quite different in different compiler environment. Another analogous approach is Peter Begner's one. The both is based on classical Chaitin-Briggs RA good for regular file architectures. IRA is different. It is a regional RA (analogous to Callahan-Koblentz). IRA was also designed to deal with irregular register files and as any regional RA deals with splitting on region borders by their original design.
You can read an article
http://vmakarov.fedorapeople.org/vmakarov-submission-cgo2008.pdf which describes IRA design. Although this article is old and does not contain info how IRA handles with irregular files, it is still a good reading to understand IRA.