This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Serious performance regression -- some tree optimizer questions


Ulrich Weigand wrote:

The reason for this appears to be a combination of the new Fortran
front end generating different code, and the new loop optimizers not
quite being able to handle that code.  The hot spot of the test case
is this single loop:

      DO 600 I3=2,N-1
      DO 600 I2=2,N-1
      DO 600 I1=2,N-1
 600  R(I1,I2,I3)=V(I1,I2,I3)
     >      -A(0)*( U(I1,  I2,  I3  ) )
     >      -A(1)*( U(I1-1,I2,  I3  ) + U(I1+1,I2,  I3  )
     >                 +  U(I1,  I2-1,I3  ) + U(I1,  I2+1,I3  )
     >                 +  U(I1,  I2,  I3-1) + U(I1,  I2,  I3+1) )
     >      -A(2)*( U(I1-1,I2-1,I3  ) + U(I1+1,I2-1,I3  )
     >                 +  U(I1-1,I2+1,I3  ) + U(I1+1,I2+1,I3  )
     >                 +  U(I1,  I2-1,I3-1) + U(I1,  I2+1,I3-1)
     >                 +  U(I1,  I2-1,I3+1) + U(I1,  I2+1,I3+1)
     >                 +  U(I1-1,I2,  I3-1) + U(I1-1,I2,  I3+1)
     >                 +  U(I1+1,I2,  I3-1) + U(I1+1,I2,  I3+1) )
     >      -A(3)*( U(I1-1,I2-1,I3-1) + U(I1+1,I2-1,I3-1)
     >                 +  U(I1-1,I2+1,I3-1) + U(I1+1,I2+1,I3-1)
     >                 +  U(I1-1,I2-1,I3+1) + U(I1+1,I2-1,I3+1)
     >                 +  U(I1-1,I2+1,I3+1) + U(I1+1,I2+1,I3+1) )

Any suggestions how to further investigate this?  Is this just
the way things are with the current tree optimizers, or is this
supposed to work and we just need to find the bug?

[ Sorry for my late reply ]


Well, I hope the original authors of the lno-branch know how to code their way out of this hole, because this sort of programming is *very* common in Fortran.

The point is that induction variable analysis has to (and *should* be able to) prove that in the end, only 12 integer/address registers are necessary to code the inner loop, if the target architecture supports register+offset addressing.

- All A's: one register with different offsets.
- R: one register
- V: one register
- All U's: one register for every unique second/third index pair,
  of which there are nine.

--
Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
A maintainer of GNU Fortran 95: http://gcc.gnu.org/fortran/


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]