This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: Rematerialization and Live Range Splitting on Region Frequency

From: Vladimir Makarov <vmakarov at redhat dot com>
To: Ajit Kumar Agarwal <ajit dot kumar dot agarwal at xilinx dot com>, "law at redhat dot com" <law at redhat dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
Cc: Vinod Kathail <vinodk at xilinx dot com>, Shail Aditya Gupta <shailadi at xilinx dot com>, Vidhumouli Hunsigida <vidhum at xilinx dot com>, Nagaraju Mekala <nmekala at xilinx dot com>
Date: Mon, 26 Jan 2015 14:40:09 -0500
Subject: Re: Rematerialization and Live Range Splitting on Region Frequency
Authentication-results: sourceware.org; auth=none
References: <b7e8c3ea09064f3a85e6e1a492c79d9b at BY2FFO11FD057 dot protection dot gbl>


On 2015-01-25 4:55 AM, Ajit Kumar Agarwal wrote:

Hello All:

It looks like live range splitting and rematerialization are connected to each other. If the boundary of a
live range split lies inside a high-frequency region, then the moves connecting the split live ranges are also
inside the high-frequency region, which is a performance bottleneck for many benchmarks.

Live range splitting should take the frequency of the region into account. Splitting a live range inside a
high-frequency region is beneficial if the split live range is assigned a color (hard register), which is better
than spilling inside the high-frequency region, even though there will be move instructions or shuffle code.
If one of the split live ranges does not have any use points and all the partner live ranges get hard
registers, then the move instruction due to splitting will be costly for the high-frequency region. In such a
case the split point should be moved up to the boundary of the transition from the low-frequency region to the
high-frequency region, so that the split live ranges still get hard registers. This requires an incremental
check of colorability, which increases compile time but is beneficial with respect to run time. The above
heuristic should be incorporated on top of the live range splitting algorithm below. The live range splitting
algorithm should consider the blocks in decreasing order of frequency, with the first block taken from the
high-frequency region, and update incrementally until the live range becomes colorable. Thus split points should
be at the edges of the transition from a high-frequency to a low-frequency region or from a low-frequency to a
high-frequency region.

The above live range splitting should be incorporated for all flavors of graph coloring.
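
The split-point placement described above can be illustrated with a minimal sketch. It is not GCC code: the function names, the toy CFG, the `hot_threshold` cutoff, and the approximation that an edge executes about as often as its colder endpoint are all assumptions made for illustration. The sketch finds the edges inside a live range where the frequency changes between a hot and a cold region, and compares the estimated cost of placing split moves there with placing them inside the loop.

```python
def transition_edges(edges, freq, live_blocks, hot_threshold):
    """Return CFG edges inside a live range that cross a hot/cold boundary.

    All names here are hypothetical; a block counts as "hot" when its
    frequency is at least hot_threshold.
    """
    hot = {b for b in live_blocks if freq[b] >= hot_threshold}
    return [
        (src, dst)
        for src, dst in edges
        if src in live_blocks and dst in live_blocks
        and (src in hot) != (dst in hot)
    ]


def move_cost(split_edges, freq):
    """Estimate the cost of split moves placed on the given edges.

    Assumption: an edge executes about as often as its colder endpoint.
    """
    return sum(min(freq[src], freq[dst]) for src, dst in split_edges)


# Toy CFG: a loop (header, body) that is entered once and left once.
freq = {"entry": 1, "header": 100, "body": 100, "exit": 1}
edges = [("entry", "header"), ("header", "body"),
         ("body", "header"), ("header", "exit")]
live_blocks = {"entry", "header", "body", "exit"}

boundary = transition_edges(edges, freq, live_blocks, hot_threshold=50)
cost_at_boundary = move_cost(boundary, freq)              # moves outside the loop
cost_inside_loop = move_cost([("header", "body")], freq)  # move inside the loop
```

In this toy example the boundary edges are the loop entry and the loop exit, so the split moves run about twice in total instead of about a hundred times. The sketch leaves out the incremental colorability check, which a real allocator would need before accepting a split point.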

Regarding rematerialization, Chaitin's rematerialization tries to recalculate the expression at all the
use points of the live range, and the Simpson-based approach to rematerialization tries to move the arithmetic
instruction lower, to the use points or their basic blocks, provided the operands of the arithmetic instruction
are not touched along the blocks of the live range.

Considering all the effects of rematerialization, the remat point, i.e. the recalculation, should be placed at the
split points, instead of Chaitin's approach of remat at every use point or Simpson's approach, which requires
untouched operands and moves the arithmetic instruction down to the use points.
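
The difference between the two placements can be shown with a rough frequency-weighted cost estimate. This is only a sketch under stated assumptions: the function names and the toy frequencies are hypothetical, each recalculation is taken to cost one unit per execution, and an edge is assumed to execute about as often as its colder endpoint.

```python
def remat_cost_at_uses(use_blocks, freq):
    """Cost of recalculating the value before every use (one entry per use)."""
    return sum(freq[block] for block in use_blocks)


def remat_cost_at_split_points(split_edges, freq):
    """Cost of recalculating the value once on each split edge.

    Assumption: an edge executes about as often as its colder endpoint.
    """
    return sum(min(freq[src], freq[dst]) for src, dst in split_edges)


# Toy loop: the value is used once in the header and twice in the body.
freq = {"entry": 1, "header": 100, "body": 100, "exit": 1}
use_blocks = ["header", "body", "body"]
split_edges = [("entry", "header")]  # hypothetical split point at loop entry

per_use = remat_cost_at_uses(use_blocks, freq)
per_split = remat_cost_at_split_points(split_edges, freq)
```

The comparison only holds if the rematerialized value can keep a hard register across the whole hot region. Otherwise the cheaper placement trades recalculation for a spill, which the estimate above does not model.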

The above approaches look feasible to implement when the high-frequency regions are taken into consideration.

Thoughts, please?

Ajit, nobody can tell you for sure what the final results of your proposal can be. I usually try a lot of things and most of them are rejected because the results are not what I expected.

I personally implemented Simpson's register pressure decrease through rematerialization twice. The first time was long ago (as I remember, more than 10 years ago) and at that time it did not work badly (as I remember it gave 1% for x86; you can find the exact data in my GCC summit paper "Fighting register pressure in GCC"). It worked well because the old RA was quite bad (imho the problem of most research articles in the compiler optimization field was/is the use of some sub-par compiler, where a new good optimization shines in an environment of existing simple ones or because of the absence of many other optimizations).

The second time, I reimplemented CFG-sensitive rematerialization (as a separate pass) last year and the results were worse than without it. Therefore I had to implement a rematerialization subpass in LRA, which really improves the code. That is because the current RA is pretty good. Even if we have register pressure evaluation (which was absent in the old RA), I believe it is very inaccurate, as IRA uses sophisticated coloring criteria which are actually based on dynamic intersected register classes (more accurately approximated by a dynamic register class inclusion tree). Also, it is very hard to predict a lot of decisions in LRA.

All the above is also true for live range splitting implemented as a separate pass.

There is a good point in your email that rematerialization should work better when it is done not for each use but for some region of uses (and actually Simpson's approach implements it). I guess if you can implement this idea in the IRA framework and not as a separate pass, it might give some improvements. The same would probably be true for splitting in the IRA environment. Actually IRA was designed to work for a tree of any regions, including BBs. Currently the regions are only loops, and a lot was done to minimize the number of considered loops, as a lot of people were not happy with RA speed on some tests. The minimization is based on register pressure evaluation and, as I wrote, it is not accurate. Including all loops and BBs (or maybe SESE or other regions) could improve the code but make the compiler slower.
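
The pressure-based minimization of regions mentioned above can be pictured with a small sketch. It is not IRA's actual code or data structures: the `Region` type, the `regions_to_keep` function, and the single `available_regs` limit are hypothetical simplifications. The idea shown is only that a nested region is kept for separate allocation when its estimated register pressure exceeds the number of available hard registers.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical node of a region tree (function body, loop, ...)."""
    name: str
    pressure: int  # estimated peak number of simultaneously live pseudos
    children: list = field(default_factory=list)


def regions_to_keep(region, available_regs, is_root=True):
    """Return names of regions worth allocating separately.

    The root is always kept; a nested region is kept only when its
    estimated pressure exceeds the number of available hard registers.
    """
    kept = []
    if is_root or region.pressure > available_regs:
        kept.append(region.name)
    for child in region.children:
        kept.extend(regions_to_keep(child, available_regs, is_root=False))
    return kept


tree = Region("function", 12, [
    Region("loop1", 20, [Region("loop1.inner", 6)]),
    Region("loop2", 5),
])
kept = regions_to_keep(tree, available_regs=8)
```

Because the decision rests entirely on the pressure estimate, an inaccurate estimate drops regions that would have benefited from separate treatment, which is the trade-off between code quality and compile time described above.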

On one hand, an engineering approach is to implement all these things as separate passes. On the other hand, you achieve the best result when you take more RA tasks into consideration together. It is hard to find a balance between the two approaches.
