This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



1.02 g77 link problem


I am working with egcs1.0 and egcs1.02 on some dual processor Pentium
II machines running RedHat 4.2 with kernel (approx) 2.0.30 SMP. I'm
having problems with the *link* stage of 1.02 making a complex*16
program run significantly more slowly than 1.00 did. Furthermore, the
same executable runs at a different speed on the machine with 1.02
installed than on the machine with 1.00. I've tried compiling with one
compiler and linking with the other, and the .s files are identical
apart from the compiler version number comment.

Here is the masterpiece of the programmer's art:

      implicit none
      integer n, nv1, nv2, i, j, k, l, l1, l2, l0, m0, m, m1, m2
      parameter(n = 512, nv1=32, nv2=64)
      double complex x(n,n), y(n,n), z(n,n), z1
      real t(2)
      
      write(*,*) dtime(t)
      do i = 1,n
        do j = 1,n
          x(i,j) = dcmplx(i+j, i-j)
          y(i,j) = dcmplx(i+n-j, i+j)
          z(i,j) = (0d0, 0d0)
        end do
      end do
      
      write(*,*) dtime(t)
      do l1 = 1, n, nv2
        l2 = min(n, l1+nv2-1)
        do m1 = 1, n, nv1
          m2 = min(n, m1+nv1-1)
          do k = 1, n
            do l = l1,l2
              z1= (0d0, 0d0)
              do m = m1,m2
                z1=z1 + x(m,l)*y(m,k)
              end do
              z(l,k) = z(l,k) + z1
            end do
          end do
        end do
      end do
      write(*,*) dtime(t)
      write(*,*) z(17,3)
      write(*,*) dtime(t)
      stop
      end
      

I am compiling with:
g77 -O4 -malign-double -fomit-frame-pointer -fforce-addr -funroll-loops -fmove-all-movables -fstrength-reduce -fforce-mem -frerun-cse-after-loop -fexpensive-optimizations -fschedule-insns2 -dr -ds -dj -df zmat.f

Results:

1.00 link/1.00 run        1.00 link/1.02 run

0.00999999978             0.00999999978
0.439999998               1.53999996
36.1500015                37.6200027
(75373568.,152576512.)    (75373568.,152576512.)
0.

1.02 link/1.00 run        1.02 link/1.02 run

0.0399999991              0.
0.389999986               0.460000008
50.920002                 44.6299973
(75373568.,152576512.)    (75373568.,152576512.)
0.                        0.


The value in parentheses is the sample result printed out; the others
are the timings. As you can see, we get the same result each time, but
the version linked under 1.02 is slower. I get much the same result if
I halve n and nv2, so I don't think the problem is one version
just failing to fit into the second-level cache.
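
For reference, the working-set arithmetic behind that cache argument
can be sketched as below. This is only an illustration: it assumes 16
bytes per double complex element, and the program name wset and the
constant elt are made up here and are not part of zmat.f.

```fortran
      program wset
      implicit none
      integer n, nv1, nv2, elt
      parameter(n = 512, nv1 = 32, nv2 = 64)
c     Assumption: a double complex (complex*16) element is 16 bytes.
      parameter(elt = 16)
c     Whole-array footprint: each of x, y and z holds n*n elements.
      write(*,*) 'one array (KiB):     ', n*n*elt/1024
      write(*,*) 'x, y and z (KiB):    ', 3*n*n*elt/1024
c     The blocked loops reuse x(m1:m2, l1:l2) against y(m1:m2, k).
      write(*,*) 'x block (KiB):       ', nv1*nv2*elt/1024
      write(*,*) 'y column strip (B):  ', nv1*elt
c     The same figures after halving n and nv2.
      write(*,*) 'halved array (KiB):  ', (n/2)*(n/2)*elt/1024
      write(*,*) 'halved x block (KiB):', nv1*(nv2/2)*elt/1024
      end
```

With the parameters above this gives 4096 KiB per array and a 32 KiB
x block, dropping to 1024 KiB and 16 KiB when n and nv2 are halved. In
both cases the block is small and the full arrays are large, so the
halving does not obviously move either working set across a cache
boundary.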

I have to say that I find the idea of the same object file running
differently rather worrying.

John

