This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
1.02 g77 link problem
- To: egcs-bugs at cygnus dot com
- Subject: 1.02 g77 link problem
- From: rowe at excc dot ex dot ac dot uk
- Date: Mon, 6 Apr 1998 20:28:50 +0100
I am working with egcs 1.00 and egcs 1.02 on some dual-processor Pentium
II machines running RedHat 4.2 with kernel (approx) 2.0.30 SMP. I'm
having problems with the *link* stage of 1.02 making a complex*16
program run significantly more slowly than 1.00 did. Furthermore, the
same executable runs differently on the machine with 1.02 installed
than on the machine with 1.00. I've tried compiling with one compiler
and linking with the other, and the .s files are identical apart from
the compiler version number comment.
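The comparison of the two .s files can be sketched roughly as below. This is only an illustration, not the exact procedure used: the helper names strip_version and same_asm, and the file names zmat-1.00.s and zmat-1.02.s, are hypothetical, and it assumes the version stamp sits on lines containing "version" or an .ident directive.

```shell
#!/bin/sh
# Hypothetical helper: print an assembler file without the lines that
# carry a compiler version stamp.
# Assumption: such lines contain "version"/"Version" or an .ident directive.
strip_version() {
    sed -e '/[Vv]ersion/d' -e '/\.ident/d' "$1"
}

# Hypothetical helper: succeed if two assembler files are identical once
# the version-stamp lines have been dropped.
same_asm() {
    a=$(strip_version "$1")
    b=$(strip_version "$2")
    [ "$a" = "$b" ]
}

# Example use, assuming each file was produced on one of the two machines
# with "g77 -S <flags> zmat.f" and then renamed by hand:
#   same_asm zmat-1.00.s zmat-1.02.s && echo "assembler output matches"
```

If same_asm succeeds, any remaining difference in run time has to come from something after code generation, such as the link step or the libraries pulled in by it.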
Here is the masterpiece of the programmer's art:
      implicit none
      integer n, nv1, nv2, i, j, k, l, l1, l2, l0, m0, m, m1, m2
      parameter(n = 512, nv1=32, nv2=64)
      double complex x(n,n), y(n,n), z(n,n), z1
      real t(2)
      write(*,*) dtime(t)
      do i = 1,n
         do j = 1,n
            x(i,j) = dcmplx(i+j, i-j)
            y(i,j) = dcmplx(i+n-j, i+j)
            z(i,j) = (0d0, 0d0)
         end do
      end do
      write(*,*) dtime(t)
      do l1 = 1, n, nv2
         l2 = min(n, l1+nv2-1)
         do m1 = 1, n, nv1
            m2 = min(n, m1+nv1-1)
            do k = 1, n
               do l = l1,l2
                  z1= (0d0, 0d0)
                  do m = m1,m2
                     z1=z1 + x(m,l)*y(m,k)
                  end do
                  z(l,k) = z(l,k) + z1
               end do
            end do
         end do
      end do
      write(*,*) dtime(t)
      write(*,*) z(17,3)
      write(*,*) dtime(t)
      stop
      end
I am compiling with:
g77 -O4 -malign-double -fomit-frame-pointer -fforce-addr -funroll-loops -fmove-all-movables -fstrength-reduce -fforce-mem -frerun-cse-after-loop -fexpensive-optimizations -fschedule-insns2 -dr -ds -dj -df zmat.f
Results:
  1.00 link/1.00 run            1.00 link/1.02 run
  0.00999999978                 0.00999999978
  0.439999998                   1.53999996
  36.1500015                    37.6200027
  (75373568.,152576512.)        (75373568.,152576512.)
  0.

  1.02 link/1.00 run            1.02 link/1.02 run
  0.0399999991                  0.
  0.389999986                   0.460000008
  50.920002                     44.6299973
  (75373568.,152576512.)        (75373568.,152576512.)
  0.                            0.
The value in parentheses is the sample result printed out; the others
are the timings. As you can see, we get the same result each time but
the version linked under 1.02 is slower. I get much the same result if
I halve n and nv2, so I can't think it's a problem with one version
just failing to fit into the second-level cache.
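To put a number on "slower", the third timing in each column (the blocked multiply loop) can be compared per machine. The snippet below is only a sketch of that arithmetic; the helper name ratio is hypothetical, and the figures are copied from the results above.

```shell
#!/bin/sh
# Hypothetical helper: print SLOW / FAST rounded to two decimals.
ratio() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

# 1.02-linked time over 1.00-linked time, both run on the 1.00 machine
ratio 50.920002 36.1500015
# 1.02-linked time over 1.00-linked time, both run on the 1.02 machine
ratio 44.6299973 37.6200027
```

On these figures the 1.02-linked executable takes roughly 1.41 times as long on the 1.00 machine and roughly 1.19 times as long on the 1.02 machine.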
I have to say that I find the idea of the same object file running
differently rather worrying.
John