This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/42637] [4.5 Regression][graphite] wrong code for -floop-interchange -ftree-loop-distribution
- From: "spop at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 15 Jan 2010 01:20:09 -0000
- Subject: [Bug tree-optimization/42637] [4.5 Regression][graphite] wrong code for -floop-interchange -ftree-loop-distribution
- References: <bug-42637-4503@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #4 from spop at gcc dot gnu dot org 2010-01-15 01:20 -------
The problem here is that loop invariant motion moves rt(i,j) into
a temporary outside the innermost loop:
  real*8 rt(6,6),r(6,6),rtt(6,6)
  do i=1,6
    do j=1,6
      t = rt(i,j)
      do ia=1,6
        rtt(i,ia)=t*r(j,ia)+rtt(i,ia)
      end do
    end do
  end do
The cleanup that runs before graphite then turns this temporary into
an array:
  do i=1,6
    do j=1,6
      cross_bb[0] = rt(i,j)
      do ia=1,6
        rtt(i,ia)=cross_bb[0]*r(j,ia)+rtt(i,ia)
      end do
    end do
  end do
Loop interchange then asks for loop distribution when it considers
the loops 'j' and 'ia', and from the original LST we get:
original_lst (
(root
  0 (loop
    0 (loop
      0 stmt_4
      1 (loop
        0 stmt_5)))))
transformed_lst (
(root
  0 (loop 1
    0 (loop 2
      0 stmt_4)
    1 (loop 3
      0 (loop 4
        0 stmt_5)))))
This is then validated as "legal" by graphite_legal_transform.
The problem seems to be in build_lexicographically_gt_constraint,
which does not add the information "first instance of stmt_5 is
executed after the last instance of stmt_4 in loop 2".
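To make the missing ordering concrete, here is a small Python sketch,
not GCC code. It gives each statement instance a hypothetical
timestamp vector for the original and for the transformed LST and
compares them lexicographically. The timestamp layouts, the helper
names, and the choice to put 'ia' outside 'j' in loops 3 and 4 are
assumptions made for illustration; they do not reproduce what
build_lexicographically_gt_constraint computes. Only pairs within one
'i' iteration are checked.

```python
from itertools import product

N = 6  # trip count of every loop in the example

def orig_time(stmt, i, j, ia=0):
    # Original LST: stmt_4 is at position 0 inside loop j,
    # stmt_5 is inside loop ia at position 1 of loop j.
    return (i, j, 0) if stmt == 4 else (i, j, 1, ia)

def new_time(stmt, i, j, ia=0):
    # Transformed LST: loop 2 (stmt_4) is at position 0 of loop 1,
    # loops 3 and 4 (stmt_5) are at position 1 (assumed order: ia, j).
    return (i, 0, j) if stmt == 4 else (i, 1, ia, j)

def broken_anti_dependences():
    """Reads of cross_bb[0] that ran before a write originally
    but no longer do in the transformed order."""
    broken = []
    for i, j, ia, j2 in product(range(N), repeat=4):
        # Python compares tuples lexicographically.
        read_before_write = orig_time(5, i, j, ia) < orig_time(4, i, j2)
        still_before = new_time(5, i, j, ia) < new_time(4, i, j2)
        if read_before_write and not still_before:
            broken.append(((i, j, ia), (i, j2)))
    return broken

broken = broken_anti_dependences()
```

Under these assumptions every read in stmt_5 that originally preceded
a later write in stmt_4 ends up after that write, which is the
ordering a legality check would have to reject.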
With the transformed LST we would then have a write into cross_bb[0]
for every iteration of loop 2:
cross_bb[0] = rt(i,0)
cross_bb[0] = rt(i,1)
cross_bb[0] = rt(i,2)
cross_bb[0] = rt(i,3)
cross_bb[0] = rt(i,4)
cross_bb[0] = rt(i,5)
and only then would we read the value of cross_bb[0] in stmt_5:
= cross_bb[0] * ...
= cross_bb[0] * ...
= cross_bb[0] * ...
= cross_bb[0] * ...
= cross_bb[0] * ...
= cross_bb[0] * ...
In the original program we would have had these writes and reads
interleaved like this:
cross_bb[0] = rt(i,0)
= cross_bb[0] * ...
cross_bb[0] = rt(i,1)
= cross_bb[0] * ...
cross_bb[0] = rt(i,2)
= cross_bb[0] * ...
cross_bb[0] = rt(i,3)
= cross_bb[0] * ...
cross_bb[0] = rt(i,4)
= cross_bb[0] * ...
cross_bb[0] = rt(i,5)
= cross_bb[0] * ...
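The difference between the two orders can be reproduced with a short
Python sketch. This is only a simulation of the loop nests above, not
GCC output: the input values, the zero-initialized rtt, the function
names, and the 'ia'-outside-'j' order of loops 3 and 4 are assumptions
chosen for illustration.

```python
N = 6

def original(rt, r):
    # Writes and reads of cross_bb interleaved, as in the source program.
    rtt = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            cross_bb = rt[i][j]                                # stmt_4
            for ia in range(N):
                rtt[i][ia] = cross_bb * r[j][ia] + rtt[i][ia]  # stmt_5
    return rtt

def distributed(rt, r):
    # All writes of loop 2 happen first, then all reads of loops 3 and 4.
    rtt = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):                                     # loop 2
            cross_bb = rt[i][j]                                # stmt_4
        for ia in range(N):                                    # loop 3
            for j in range(N):                                 # loop 4
                rtt[i][ia] = cross_bb * r[j][ia] + rtt[i][ia]  # stmt_5
    return rtt

# Small integer-valued inputs so the float arithmetic is exact.
rt = [[float(i * N + j + 1) for j in range(N)] for i in range(N)]
r = [[float(j + ia + 1) for ia in range(N)] for j in range(N)]

good = original(rt, r)
bad = distributed(rt, r)
```

In the distributed version every read sees only the last value
written, rt(i,5) in the numbering used above, so the two results
differ.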
Konrad, could you have a look at build_lexicographically_gt_constraint?
Thanks,
Sebastian
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42637