This is the mail archive of the
mailing list for the GCC project.
More of a Loop fusion
- From: Ajit Kumar Agarwal <ajit dot kumar dot agarwal at xilinx dot com>
- To: Jeff Law <law at redhat dot com>, Richard Biener <richard dot guenther at gmail dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Cc: Vinod Kathail <vinodk at xilinx dot com>, Shail Aditya Gupta <shailadi at xilinx dot com>, Vidhumouli Hunsigida <vidhum at xilinx dot com>, "Nagaraju Mekala" <nmekala at xilinx dot com>
- Date: Sun, 16 Aug 2015 11:21:22 +0000
- Subject: More of a Loop fusion
- Authentication-results: sourceware.org; auth=none
- Authentication-results: spf=pass (sender IP is 220.127.116.11) smtp.mailfrom=xilinx.com; gcc.gnu.org; dkim=none (message not signed) header.d=none;
- Spamdiagnosticmetadata: NSPM
- Spamdiagnosticoutput: 1:23
Loop fusion is an important optimizations that fuses the set of Loops if the following condition is valid.
1) Loops are conformant ( i.e. they have same iteration count).
2. Loops are control equivalent. The control equivalence of the loops can be identified with the dominator and post dominator properties
Of the Loop. Two Loops Lj and LK are control equivalent if the Lj dominates Lk and LK post dominates the Lj.
3. Loops are adjacent. The Loops are adjacent if there is no intervening code between them. The intervening code can be determined
With the dominance properties. If there is an aggregate node ax where the Loop Lj dominates ax and the Loop Lk post dominates
The ax then there is a intervening code.
4. There are forward dependence between the Loops.
If the properties 1. Doesn't hold i.e. the loops are not conformant and they don't have same iteration count. Then the following
Transformation can be done as given in Fig(1) and Fig(2) to make the set of Loops as non-conformant.
In Fig(1) the Loops i1 and i2 are non-conformant as they don't have same iteration count. The following transformation can be made given in Fig(2).
The Fig(2) fuses the Loop for the Loops i1 and i2 that are non-conformant and doesn't have same iteration count.
If the properties 3 is not valid and the set of Loops are not adjacent. The intervening code can be moved up the Loop or down the Loop
So that the Loop become adjacent and there is no intervening code. Also partially the intervening code is moved up and partial intervening
Code is moved down based on the dependence relation and the dominance properties. Such transformation can make loops adjacent and
There is no intervening code.
Also the Loop fusion, should be done after the Loop normalization pass and the Loop normalization pass should be present before the Loop
Fusion pass. The 4 properties to be valid for loop fusion to occur is done with respect to traversing the Loops from outermost to inner most
Loop. First the outer loop is traversed and the loops that are at same level of the outer loop are made into one partitions and that partition
Can be fused if the above four properties is valid and otherwise the loops are taken out from the partitions. The same logic is done for the
Loop traversing outer to inner loops and partitions are formed and later the partitions are fused based on the above properties.
Each Loop is traverse in the forward order of dominance tree and also the reverse order of dominance tree ( i.e. post dominator) and the
Forward dependence is checked. Based on the dependence the Loops are fused. And the Loops that are non-adjacent partial code is moved
Up and partial code is moved down or the whole intervening code is moved up or down.
I think the above transformations and properties and enhance the Loop fusion algorithm in GCC with the above Algorithm to have more Loop
Fusion with respect to SPEC benchmarks.
For ( I1 = 0;I1 < n;I1 ++)
If ( I1 < n-1)
For ( I2 = 0 ; I2 < n-2 ;I2 ++)
For( I1 = 0; i1< n ; i1++)
If(i1 < n-2)
If ( i1 < n-1)
Thanks & Regards