This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/42637] [4.5 Regression][graphite] wrong code for -floop-interchange -ftree-loop-distribution



------- Comment #4 from spop at gcc dot gnu dot org  2010-01-15 01:20 -------
The problem here is that loop invariant motion hoists rt(i,j) into a
temporary outside the innermost loop:

      real*8 rt(6,6),r(6,6),rtt(6,6)
      do i=1,6
        do j=1,6
          t = rt(i,j)
          do ia=1,6
            rtt(i,ia)=t*r(j,ia)+rtt(i,ia)
          end do
        end do
      end do

The cleanup pass that runs before graphite then turns this scalar
temporary into a one-element array:

      do i=1,6
        do j=1,6
          cross_bb[0] = rt(i,j)
          do ia=1,6
            rtt(i,ia)=cross_bb[0]*r(j,ia)+rtt(i,ia)
          end do
        end do
      end do

Loop interchange then asks for loop distribution when it considers
the loops 'j' and 'ia', which turns the original LST into the
transformed LST:

original_lst (
(root
  0 (loop
    0 (loop
      0 stmt_4
      1 (loop
        0 stmt_5)))))

transformed_lst (
(root
  0 (loop 1
    0 (loop 2
      0 stmt_4)
    1 (loop 3
      0 (loop 4
        0 stmt_5)))))

This transformation is then validated as "legal" by
graphite_legal_transform.

The problem seems to be in build_lexicographically_gt_constraint,
which does not add the information "first instance of stmt_5 is
executed after the last instance of stmt_4 in loop 2".
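
To make the missing ordering concrete, here is a small brute-force
sketch in Python. It is not graphite's implementation and it does not
model build_lexicographically_gt_constraint. It only enumerates every
access to cross_bb[0] and checks that each pair involving a write keeps
its relative order under the new schedule. The timestamp functions and
their names are assumptions read off the two LSTs above. In particular
the sketch assumes loop 3 iterates over 'ia' and loop 4 over 'j', and it
uses 0-based iteration numbers.

```python
from itertools import product

N = 6  # trip count of every loop in the test case

def original_time(stmt, i, j, ia=0):
    # Hypothetical timestamps for original_lst: inside loop j, stmt_4
    # sits at position 0 and the ia loop holding stmt_5 at position 1.
    return (i, j, 0) if stmt == "stmt_4" else (i, j, 1, ia)

def transformed_time(stmt, i, j, ia=0):
    # Hypothetical timestamps for transformed_lst: loop 2 (position 0)
    # holds stmt_4, loops 3 and 4 (position 1) hold stmt_5.
    return (i, 0, j) if stmt == "stmt_4" else (i, 1, ia, j)

def accesses():
    # Every dynamic access to the single cell cross_bb[0].
    for i, j in product(range(N), repeat=2):
        yield ("write", "stmt_4", (i, j))
    for i, j, ia in product(range(N), repeat=3):
        yield ("read", "stmt_5", (i, j, ia))

def violations(before, after):
    # A pair (a, b) is a dependence if at least one access is a write
    # and a runs before b in the original schedule. The transform is
    # legal only if a still runs before b afterwards.
    accs = list(accesses())
    bad = []
    for a, b in product(accs, repeat=2):
        if a[0] == "read" and b[0] == "read":
            continue
        if before(a[1], *a[2]) < before(b[1], *b[2]):
            if not after(a[1], *a[2]) < after(b[1], *b[2]):
                bad.append((a, b))
    return bad

bad = violations(original_time, transformed_time)
print(len(bad), "violated dependences")
print(sorted({(a[0], b[0]) for a, b in bad}))
```

Under these assumptions every reported pair is a read of cross_bb[0]
in stmt_5 followed by a later write in stmt_4, that is, an
anti-dependence carried by loop 'j'. The write-then-read pairs stay in
order, which matches the observation that only the "stmt_5 after the
last stmt_4" information is lost. The result does not depend on the
assumed order of loops 3 and 4.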

After the transformation, loop 2 first writes into cross_bb[0] in
every one of its iterations:

cross_bb[0] = rt(i,0)
cross_bb[0] = rt(i,1)
cross_bb[0] = rt(i,2)
cross_bb[0] = rt(i,3)
cross_bb[0] = rt(i,4)
cross_bb[0] = rt(i,5)

and only then does stmt_5 read the value of cross_bb[0]:

= cross_bb[0] * ...
= cross_bb[0] * ...
= cross_bb[0] * ...
= cross_bb[0] * ...
= cross_bb[0] * ...
= cross_bb[0] * ...

In the original program these writes and reads are interleaved like
this:

cross_bb[0] = rt(i,0)
= cross_bb[0] * ...
cross_bb[0] = rt(i,1)
= cross_bb[0] * ...
cross_bb[0] = rt(i,2)
= cross_bb[0] * ...
cross_bb[0] = rt(i,3)
= cross_bb[0] * ...
cross_bb[0] = rt(i,4)
= cross_bb[0] * ...
cross_bb[0] = rt(i,5)
= cross_bb[0] * ...
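
The effect of the two orderings can be reproduced with a small Python
simulation. This is only a sketch: the input values are made up, rtt is
assumed to start at zero (the test case does not show its
initialization), indices are 0-based, and the loop order inside the
distributed version (ia outside j) is an assumption about loops 3 and 4.

```python
N = 6

def make_inputs():
    # Arbitrary small integer-valued inputs so that results are exact.
    rt = [[float(i * N + j + 1) for j in range(N)] for i in range(N)]
    r = [[float((j + 1) * (ia + 2)) for ia in range(N)] for j in range(N)]
    return rt, r

def original(rt, r):
    # Writes and reads of cross_bb[0] are interleaved per j iteration.
    rtt = [[0.0] * N for _ in range(N)]
    cross_bb = [0.0]
    for i in range(N):
        for j in range(N):
            cross_bb[0] = rt[i][j]                      # stmt_4
            for ia in range(N):
                rtt[i][ia] = cross_bb[0] * r[j][ia] + rtt[i][ia]  # stmt_5
    return rtt

def distributed(rt, r):
    # All writes of loop 2 happen before any read in loops 3 and 4.
    rtt = [[0.0] * N for _ in range(N)]
    cross_bb = [0.0]
    for i in range(N):
        for j in range(N):                              # loop 2
            cross_bb[0] = rt[i][j]                      # stmt_4
        for ia in range(N):                             # loop 3 (assumed ia)
            for j in range(N):                          # loop 4 (assumed j)
                rtt[i][ia] = cross_bb[0] * r[j][ia] + rtt[i][ia]  # stmt_5
    return rtt

rt, r = make_inputs()
print(original(rt, r) == distributed(rt, r))
```

In the distributed version every read sees only the last value written
by loop 2, so each rtt(i,ia) is computed from rt(i,5) alone instead of
from all six rt(i,j).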

Konrad, could you have a look at build_lexicographically_gt_constraint?

Thanks,
Sebastian


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42637

