This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



g77 behaviour


Hi, I use g77 under RH Linux 6.0, mostly for FEM codes. While profiling I came across
some strange behaviour: I repeatedly have to solve two triangular systems, one upper
triangular and the other lower triangular. The matrix is stored in CRS format (the nonzero
elements of each row are stored in sequence, with their column indexes in a second vector),
and the diagonal elements are stored in sequence. Here is the code:

CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

      subroutine dusolve(neq,aij,ipaij,laij,m,idiag,x)
CC
CC    Compute X:=((D+U)^-1)*X
CC
        dimension laij(*),ipaij(*),idiag(neq)
        double precision aij(*),x(neq),m(neq)
 

        x(neq)=x(neq)*m(neq)

        do 100 i=neq-1,1,-1
        do 20 j=idiag(i),ipaij(i+1)-1
        x(i)=x(i)-aij(j)*x(laij(j))
20      continue
        x(i)=x(i)*m(i)
100     continue

        return
        end
 

      subroutine dlsolve(neq,aij,ipaij,laij,m,idiag,x)
CC
CC      Compute X:=((D+L)^-1)*X
CC
        dimension laij(*),ipaij(*),idiag(neq)
        double precision aij(*),x(neq),m(neq)
 

        x(1)=x(1)*m(1)

        do 100 i=2,neq
        do 20 j=ipaij(i)+1,idiag(i)-1
        x(i)=x(i)-aij(j)*x(laij(j))
20      continue
        x(i)=x(i)*m(i)
100     continue

        return
        end

These two routines are almost identical, but dlsolve takes about three times as much CPU
time as dusolve (note: the matrix is symmetric), and it doesn't matter which one is called
first. After various experiments I changed dlsolve as follows:

CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

      subroutine dlsolve(neq,aij,ipaij,laij,m,idiag,x)
CC
CC      Compute X:=((D+L)^-1)*X, inner loop moved into mah
CC
        dimension laij(*),ipaij(*),idiag(neq)
        double precision aij(*),x(neq),m(neq)
 

        x(1)=x(1)*m(1)

        do 100 i=2,neq
        k=idiag(i)-ipaij(i)
        j=ipaij(i)
        call mah(x(i),aij(j),k,laij(j),x(laij(j)))
        x(i)=x(i)*m(i)
100     continue

        return
        end

        subroutine mah(x0,a,k,laij,x)
C
C  mah is an expression used in Italy to mean 'It's so, don't ask why'
C
        dimension laij(*)
        double precision a(*),x(*),x0

        l=laij(1)-1
        do 10 j=1,k
        x0=x0-a(j)*x(laij(j)-l)
10      continue

        return
        end
 

Now the two routines take almost the same CPU time (dlsolve is still slightly slower). The
same trick (calling mah) doesn't work on dusolve, and I don't understand why. Is this
behaviour peculiar to g77, or can it be encountered with other compilers? Is there any
document on f77 (or g77) optimization?
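
To make the storage scheme concrete, here is a minimal sketch of the upper solve on a
tiny matrix. It is only an illustration under assumptions: the 3x3 matrix and the program
name crsdem are made up, and I assume here that aij holds only the off-diagonal entries,
that idiag(i) points to the first upper-triangular entry of row i, and that m(i) holds
1/A(i,i). The loop itself has the same structure as dusolve.

```fortran
      program crsdem
C     Sketch only: hypothetical 3x3 matrix, assumed storage layout.
C         | 2 1 0 |
C     A = | 1 4 2 |   off-diagonals in aij, diagonal kept apart
C         | 0 2 8 |
      integer neq, nnz
      parameter (neq=3, nnz=4)
      integer ipaij(neq+1), laij(nnz), idiag(neq), i, j
      double precision aij(nnz), m(neq), x(neq)
C     Off-diagonal values row by row, lower part before upper part
      data aij   /1.0d0, 1.0d0, 2.0d0, 2.0d0/
C     Column index of each entry of aij
      data laij  /2, 1, 3, 2/
C     ipaij(i) = position in aij where row i starts
      data ipaij /1, 2, 4, 5/
C     idiag(i) = position of first upper entry of row i (assumed)
      data idiag /1, 3, 5/
C     m(i) = 1/A(i,i) (assumed meaning of m)
      data m     /0.5d0, 0.25d0, 0.125d0/
C     Right-hand side b = (D+U)*(1,2,3), overwritten by the solution
      data x     /4.0d0, 14.0d0, 24.0d0/

C     Backward substitution, same loop structure as dusolve
      x(neq) = x(neq)*m(neq)
      do 100 i = neq-1, 1, -1
         do 20 j = idiag(i), ipaij(i+1)-1
            x(i) = x(i) - aij(j)*x(laij(j))
 20      continue
         x(i) = x(i)*m(i)
 100  continue

      write(*,*) (x(i), i=1,neq)
      end
```

With the data above the program should recover x = (1, 2, 3). In the real code the vectors
are of course much longer and the gather x(laij(j)) jumps around in memory, which is where
I suspect the time goes.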

Best regards.
 

-- 
Giuseppe Borzi'
http://dfmtfa.unime.it/profs/giuseppeborzi.html
Assistant Professor at the Univ. of Messina - Italia