This is the mail archive of the
mailing list for the GCC project.
Re: More of a Loop fusion
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Ajit Kumar Agarwal <ajit dot kumar dot agarwal at xilinx dot com>,Jeff Law <law at redhat dot com>,"gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Cc: Vinod Kathail <vinodk at xilinx dot com>,Shail Aditya Gupta <shailadi at xilinx dot com>,Vidhumouli Hunsigida <vidhum at xilinx dot com>,Nagaraju Mekala <nmekala at xilinx dot com>
- Date: Sun, 16 Aug 2015 16:05:46 +0200
- Subject: Re: More of a Loop fusion
- Authentication-results: sourceware.org; auth=none
- References: <37378DC5BCD0EE48BA4B082E0B55DFAA4295B157 at XAP-PVEXMBX02 dot xlnx dot xilinx dot com>
On August 16, 2015 1:21:22 PM GMT+02:00, Ajit Kumar Agarwal <email@example.com> wrote:
>Loop fusion is an important optimizations that fuses the set of Loops
>if the following condition is valid.
GCC does not implement loop fusion at all.
>1) Loops are conformant ( i.e. they have same iteration count).
>2. Loops are control equivalent. The control equivalence of the loops
>can be identified with the dominator and post dominator properties
>Of the Loop. Two Loops Lj and LK are control equivalent if the Lj
>dominates Lk and LK post dominates the Lj.
>3. Loops are adjacent. The Loops are adjacent if there is no
>intervening code between them. The intervening code can be determined
>With the dominance properties. If there is an aggregate node ax where
>the Loop Lj dominates ax and the Loop Lk post dominates
>The ax then there is a intervening code.
>4. There are forward dependence between the Loops.
>If the properties 1. Doesn't hold i.e. the loops are not conformant and
>they don't have same iteration count. Then the following
>Transformation can be done as given in Fig(1) and Fig(2) to make the
>set of Loops as non-conformant.
>In Fig(1) the Loops i1 and i2 are non-conformant as they don't have
>same iteration count. The following transformation can be made given in
>The Fig(2) fuses the Loop for the Loops i1 and i2 that are
>non-conformant and doesn't have same iteration count.
>If the properties 3 is not valid and the set of Loops are not adjacent.
>The intervening code can be moved up the Loop or down the Loop
>So that the Loop become adjacent and there is no intervening code. Also
>partially the intervening code is moved up and partial intervening
>Code is moved down based on the dependence relation and the dominance
>properties. Such transformation can make loops adjacent and
>There is no intervening code.
>Also the Loop fusion, should be done after the Loop normalization pass
>and the Loop normalization pass should be present before the Loop
>Fusion pass. The 4 properties to be valid for loop fusion to occur is
>done with respect to traversing the Loops from outermost to inner most
>Loop. First the outer loop is traversed and the loops that are at same
>level of the outer loop are made into one partitions and that partition
>Can be fused if the above four properties is valid and otherwise the
>loops are taken out from the partitions. The same logic is done for the
>Loop traversing outer to inner loops and partitions are formed and
>later the partitions are fused based on the above properties.
>Each Loop is traverse in the forward order of dominance tree and also
>the reverse order of dominance tree ( i.e. post dominator) and the
>Forward dependence is checked. Based on the dependence the Loops are
>fused. And the Loops that are non-adjacent partial code is moved
>Up and partial code is moved down or the whole intervening code is
>moved up or down.
>I think the above transformations and properties and enhance the Loop
>fusion algorithm in GCC with the above Algorithm to have more Loop
>Fusion with respect to SPEC benchmarks.
>For ( I1 = 0;I1 < n;I1 ++)
> If ( I1 < n-1)
>For ( I2 = 0 ; I2 < n-2 ;I2 ++)
>For( I1 = 0; i1< n ; i1++)
> If(i1 < n-2)
> If(i1< n-1)
> If ( i1 < n-1)
>Thanks & Regards