This is the mail archive of the mailing list for the GCC project.

RE: Register Pressure guided Unroll and Jam in GCC !!

On June 16, 2014 6:39:58 PM CEST, Ajit Kumar Agarwal <> wrote:
>-----Original Message-----
>From: Richard Biener [] 
>Sent: Monday, June 16, 2014 7:55 PM
>To: Ajit Kumar Agarwal
>Cc:; Vladimir Makarov; Michael Eager; Vinod Kathail;
>Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
>Subject: Re: Register Pressure guided Unroll and Jam in GCC !!
>On Mon, Jun 16, 2014 at 4:14 PM, Ajit Kumar Agarwal
><> wrote:
>> Hello All:
>> I have worked on the Open64 compiler, where register-pressure-guided
>unroll and jam gave a good performance improvement on the C and C++
>SPEC benchmarks and also on the Fortran benchmarks.
>> Unroll and jam increases register pressure in the unrolled loop,
>which leads to more spill and fetch instructions and degrades the
>performance of the unrolled loop. The cache-locality benefit of unroll
>and jam is lost when the increased register pressure introduces
>spill instructions. It is better to choose the unroll factor of the
>loop using a performance model of register pressure.
>> Most loop optimizations, such as unroll and jam, are implemented on
>the high-level IR. Register-pressure-based unroll and jam requires
>computing register pressure on the high-level IR, similar to the
>register pressure computed during register allocation. This makes the
>implementation complex.
>> To overcome this, the Open64 compiler makes unrolling decisions both
>on the high-level IR and at the code generation level. Some of the
>decisions are made at the very end of code generation. The advantage
>of this approach is that it can use the register pressure information
>calculated by the register allocator. This makes the implementation
>much simpler.
>> Can we take this approach in GCC, making the unroll and jam
>decisions on the high-level IR and deferring some of them to the code
>generation level, as Open64 does?
>>  Please let me know what you think.
>>>Sure, you can for example compute validity of the transform during
>the GIMPLE loop opts, annotate the loop meta-information with the
>desired transform and apply it (or not) later during RTL unrolling.
>Thanks! Has RTL unrolling already been implemented?

Yes, but not for non-innermost loops, afaik.


>> Thanks & Regards
>> Ajit
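
The two-phase split discussed above (decide validity and a desired factor early, then apply it or not once register pressure is known) could be sketched roughly as follows. This is only a toy model, not GCC or Open64 code: struct loop_note, estimated_pressure, choose_unroll_factor and the linear pressure estimate are all hypothetical names and assumptions made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical per-loop annotation; NOT an actual GCC data structure.
   The first two fields stand for what an early (high-level IR) phase
   would record; the last two stand for what a late phase could learn
   from the register allocator.  */
struct loop_note {
    bool jam_is_valid;   /* early phase: dependences allow unroll and jam */
    int  desired_factor; /* early phase: requested factor (1 = no unroll) */
    int  regs_invariant; /* registers live across every copy of the body */
    int  regs_per_copy;  /* extra registers each jammed body copy needs */
};

/* Assumed linear model: pressure grows with each jammed copy.  */
static int estimated_pressure(const struct loop_note *n, int factor)
{
    return n->regs_invariant + factor * n->regs_per_copy;
}

/* Late phase: pick the largest factor, no bigger than the desired one,
   whose estimated pressure fits in the available registers.  Returns 1
   (leave the loop alone) if the transform was not marked valid.  */
static int choose_unroll_factor(const struct loop_note *n, int available_regs)
{
    if (!n->jam_is_valid || n->desired_factor < 1)
        return 1;

    int factor = n->desired_factor;
    while (factor > 1 && estimated_pressure(n, factor) > available_regs)
        factor--;
    return factor;
}
```

With a desired factor of 8, 4 invariant registers, 3 registers per copy and 16 available registers, this model settles on a factor of 4, since 4 + 4 * 3 = 16 still fits while a factor of 5 would need 19.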
