This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[graphite] Loop tiling


Hi Sebastian, Hi Cedric,

since Thursday I think about how to implement loop tiling in graphite.

Lets start with a simple example:

for (i=0;i<=10;i++) {
  for (j=0;j<=20;j++) {
    S1 ;
  }
}

# eq/in i  j  n  1
    1   1  0  0  0   # i >= 0
    1  -1  0  0  10  # i <= 10
    1   0  1  0  0   # j >= 0
    1   0 -1  0  20  # j <= 20

What I would like to generate is:

for (p1=0;p1<=10;p1+=2) {
  for (p2=0;p2<=20;p2+=2) {
    for (i=p1;i<=min(10,p1+1);i++) {
      for (j=p2;j<=min(20,p2+1);j++) {
        S1 ;
      }
    }
  }
}


So there seem to be different ways to make cloog generate this code:

Solution 1: Change the cloog domain matrix.
===========================================	
	What to do:
	
	- Add two more dimensions.
	- Copy the restrictions of i and j to p1 and p2.
	- Add the new restrictions to connect i and j.

The domain for S1:
     # eq/in p1 p2 i  j  n  1
         1   1  0  0  0  0  0   # p1 >= 0
         1  -1  0  0  0  0  10  ï# p1 <= 10
         1   0  1  0  0  0  0   ï# p2 >= 0
         1   0 -1  0  0  0  20  ï# p2 <= 20
         1  -1  0  1  0  0  0   # i >= p1
         1   1  0 -1  0  0  1   # i <= p1 + 1
         1   0 -1  0  1  0  0   # j >= p2
         1   0  1  0 -1  0  1   # j <= p2 + 1
         1   0  0  1  0  0  0  ï # i >= 0  
         1   0  0  0  1  0  0   ï# i <= 10
         1   0  0 -1  0  0  10  ï# j >= 0
         1   0  0  0 -1  0  20  ï# j <= 20

What I get:


for (p1=0;p1<=10;p1++) {
  for (p2=0;p2<=20;p2++) {
    for (i=p1;i<=min(10,p1+1);i++) {
      for (j=p2;j<=min(20,p2+1);j++) {
        S1 ;
      }
    }
  }
}

The problem:

This code is not correct, as we iterate over all values of p1, but we
only should iterate over the even values.
I have no idea how to restrict p1 only to even values. But this seems to
be the only way to change "p1++" to "p1+=2".

Solution 2: Use scattering function:
====================================

To create this code I used the cloog scattering functions and the
original unmodified domain:

for (p1=0;p1<=5;p1++) {
  for (p2=0;p2<=10;p2++) {
    for (i=max(2*p1,0);i<=min(10,2*p1+1);i++) {
      for (j=max(2*p2,0);j<=min(20,2*p2+1);j++) {
        S1 ;
      }
    }
  }
}

Scattering:
     # eq/in p1 p2 p3 p4 p5 p6 p7 p8  i  j  n  1
         1    2  0  0  0  0  0  0  0 -1  0  0  1     # 2 * p1 >= i 
         1   -2  0  0  0  0  0  0  0  1  0  0  0     # 2 * p1 <= i + 1
         1    0  2  0  0  0  0  0  0  0 -1  0  1     # 2 * p2 >= j
         1    0 -2  0  0  0  0  0  0  0  1  0  0     # 2 * pr <= j + 1
         0    0  0  1  0  0  0  0  0  0  0  0  0
         0    0  0  0  1  0  0  0  0  0  0  0  0     
         0    0  0  0  0  1  0  0  0  0  0  0  0    
         0    0  0  0  0  0  1  0  0  0  0  0  0     
         0    0  0  0  0  0  0  1  0  0  0  0  0     
         0    0  0  0  0  0  0  0  1  0  0  0  0    

Domain:
# eq/in i  j  n  1
    1   1  0  0  0   # i >= 0
    1  -1  0  0  10  # i <= 10
    1   0  1  0  0   # j >= 0
    1   0 -1  0  20  # j <= 20

This solution has some advantages:

1. The code is correct
2. It is very easy. Just add two lines to the scattering function.

But some problems:

1. It is not very efficient. As we have many multiplications in the
code.
Will later gcc optimization passes convert these multiplications to
additions?

2. I could not find any documentation about scattering functions, that
describe this way of using the scattering functions.

3. At the moment I have to investigate, if we can mix this approach with
the normal scattering to regenerate the original control flow.


Solution 3. Other ideas??
===============================
Has someone another idea, how to implement loop tiling?

ï
Another graphite specific problem is, how we connect the real loop
variables to the cloog variables. Before loop tiling this was easy.
Loops of the same depth in the cloog output correspond to the loops of
the same depth in the original gimple loop nest. But know we have to add
some data structure, that connects the gimple loops, with the cloog
loops.

Thanks 
Tobi


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]