This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Register Pressure guided Unroll and Jam in GCC !!


On Mon, 2014-06-16 at 14:42 -0400, Vladimir Makarov wrote:
> On 2014-06-16, 2:25 PM, Aaron Sawdey wrote:
> > On Mon, 2014-06-16 at 14:14 +0000, Ajit Kumar Agarwal wrote:
> >> Hello All:
> >>
> >> I have worked on the Open64 compiler, where register-pressure-guided unroll and jam gave a good performance improvement on the C and C++ SPEC benchmarks and also on the Fortran benchmarks.
> >>
> >> Unroll and jam increases the register pressure in the unrolled loop, which leads to more spill and fetch code and degrades the performance of the unrolled loop. The cache-locality benefit of unroll and jam is eroded by the spill instructions that the higher register pressure causes. It is better to choose the unroll factor of the loop based on a performance model of register pressure.
> >>
> >> Most loop optimizations, such as unroll and jam, are implemented in the high-level IR. Register-pressure-based unroll and jam requires calculating register pressure in the high-level IR, similar to the register pressure calculated during register allocation. This makes the implementation complex.
> >>
> >> To overcome this, the Open64 compiler makes unrolling decisions both in the high-level IR and at the code generation level. Some of the decisions are made at the very end of code generation. The advantage of this approach is that it can use the register pressure information calculated by the register allocator, which makes the implementation much simpler.
> >>
> >> Can GCC take this approach, making unroll and jam decisions in the high-level IR and deferring some of them to the code generation level, as Open64 does?
> >>
> >>   Please let me know what you think.
> >
> > I have been working on calculating something analogous to register
> > pressure using a count of the number of live SSA values during the
> > ipa-inline pass. I've been working on steering inlining (especially in
> > LTO) away from decisions that explode the register pressure downstream,
> > with a similar goal of avoiding situations that cause a lot of spill
> > code.
> >
> > I have been working in a branch if you want to take a look:
> > gcc/branches/lto-pressure
> >
> 
> Any pressure evaluation is better than none.  But at this level 
> it is hard to evaluate it accurately.
> 
> E.g. pressure in a loop can be high for general regs, for fp regs, or 
> both.  Using live SSA values is still very inaccurate as a basis for 
> making the right decision for the transformations.
> 

Yes, the jump I have not made yet is to classify the pressure by what
register class it might end up in. The other big piece that's
potentially missing at that point is pressure caused by temps and by
scheduling. But I think you can still get order-of-magnitude type
estimates.
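
To make the idea concrete, here is a minimal sketch of a live-value
count split by register class. It is an illustration only, not code
from the lto-pressure branch. The function name, the tuple layout and
the class labels "gpr" and "fpr" are all made up for this example, and
it ignores temps and scheduling effects, as noted above.

```python
from collections import defaultdict


def max_pressure_by_class(block, live_out):
    """Peak number of simultaneously live values per register class.

    Hypothetical model of one straight-line block:
      block    -- list of (dest, uses) in program order, where dest is a
                  value name or None and uses is a list of (name, cls)
      live_out -- dict of name -> cls for values live after the block
    """
    live = dict(live_out)          # name -> register class
    peak = defaultdict(int)

    def record():
        # Count the live values in each class at this program point.
        counts = defaultdict(int)
        for cls in live.values():
            counts[cls] += 1
        for cls, n in counts.items():
            peak[cls] = max(peak[cls], n)

    record()
    # Standard backward liveness scan: a definition ends a live range,
    # each use starts (or extends) one.
    for dest, uses in reversed(block):
        if dest is not None:
            live.pop(dest, None)
        for name, cls in uses:
            live[name] = cls
        record()
    return dict(peak)


# Small example: two integer loads, an fp multiply, an add and an fp add.
example_block = [
    ("a", [("p", "gpr")]),
    ("b", [("p", "gpr")]),
    ("x", [("fa", "fpr"), ("fb", "fpr")]),
    ("c", [("a", "gpr"), ("b", "gpr")]),
    ("y", [("x", "fpr"), ("fa", "fpr")]),
]
pressure = max_pressure_by_class(example_block, {"c": "gpr", "y": "fpr"})
```

A transformation could compare each class's peak against the number of
registers available in that class, rather than using one combined
count.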

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

