This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



GCC Summit 2010 topic (potentially).


L.S.,

This year I'm unable to attend the GCC Summit (both due to time and money constraints).

In 2008, I considered giving a talk about the effect of link-time optimization on typical Fortran programs. That is, until my attention got hijacked by the geo-politically more pressing question of Coarrays in Fortran.

However, the issue still stands. So I'm thinking ahead to next year (assuming LTO will work by that time for most front-end languages):

What will LTO bring for Fortran?

Here's a run-of-the-mill example from our code:

      SUBROUTINE VERINT (
     I   KLON   , KLAT   , KLEV   , KINT  , KHALO
     I , KLON1  , KLON2  , KLAT1  , KLAT2
     I , KP     , KQ     , KR
     R , PARG   , PRES
     R , PALFH  , PBETH
     R , PALFA  , PBETA  , PGAMA   )
...
      DO JY = KLAT1,KLAT2
      DO JX = KLON1,KLON2
         IDX  = KP(JX,JY)
         IDY  = KQ(JX,JY)
         ILEV = KR(JX,JY)
C
         PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
C    +
     +               + PGAMA(JX,JY,2)*(
C    +
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
      ENDDO
      ENDDO
...
      RETURN
      END

There are several things a link-time optimization pass could determine:

1. Whether or not the arrays PALFA, PARG, ... are suitably aligned for
   vectorization (forgoing a run time check for that).

2. Whether KLON{1,2}, KLAT{1,2} are actually invariant throughout an
   invocation of the executable (as they are in our case)
   (CSE of vectorization criteria).

However, with a little bit of extra effort (instrumentation outside the program), the following can be determined:

3. KLON{1,2}, KLAT{1,2} are in fact known constants, which only happen
   to be variables because the executable is built to accommodate
   arbitrary grid sizes.

Would it help to provide GCC with knowledge about KLON, KLAT (and thereby, KLON{1,2}, KLAT{1,2})?

Note that this question is less academic than it seems. We often run on the same grid for years without changing an executable, so this optimization makes sense.
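
To make point 3 concrete, here is a minimal sketch of what supplying that knowledge by hand would look like. It is not taken from our code: the routine names AVGGEN and AVGFIX, the two-point average in the loop body, and the 8 x 4 grid are all hypothetical stand-ins. AVGGEN receives its bounds as dummy arguments, as VERINT does; AVGFIX has the same loop with the bounds fixed as PARAMETER constants, which is the form a compiler could arrive at itself if it were told the grid size. The driver only checks that both variants give the same result; it makes no claim about how much any particular GCC version gains from the second form.

```fortran
C     Hypothetical sketch: AVGGEN, AVGFIX and the 8 x 4 grid are
C     illustrative assumptions, not part of the original code.
      PROGRAM SPCDEM
      IMPLICIT NONE
      INTEGER NLON, NLAT
      PARAMETER (NLON = 8, NLAT = 4)
      REAL PARG(NLON,NLAT), PGEN(NLON,NLAT), PFIX(NLON,NLAT)
      INTEGER JX, JY
      DO JY = 1, NLAT
      DO JX = 1, NLON
         PARG(JX,JY) = REAL(JX + 10*JY)
         PGEN(JX,JY) = 0.0
         PFIX(JX,JY) = 0.0
      ENDDO
      ENDDO
C     Same work, once with run-time bounds, once with fixed bounds.
      CALL AVGGEN (NLON, NLAT, 2, NLON, 1, NLAT, PARG, PGEN)
      CALL AVGFIX (PARG, PFIX)
      DO JY = 1, NLAT
      DO JX = 1, NLON
         IF (PGEN(JX,JY) .NE. PFIX(JX,JY)) STOP 1
      ENDDO
      ENDDO
      PRINT *, 'generic and specialized results agree'
      END
C
C     Generic variant: array extents and loop bounds are dummy
C     arguments, so the compiler must treat them as unknown.
      SUBROUTINE AVGGEN (KLON, KLAT, KLON1, KLON2, KLAT1, KLAT2,
     +                   PARG, PRES)
      IMPLICIT NONE
      INTEGER KLON, KLAT, KLON1, KLON2, KLAT1, KLAT2
      REAL PARG(KLON,KLAT), PRES(KLON,KLAT)
      INTEGER JX, JY
      DO JY = KLAT1, KLAT2
      DO JX = KLON1, KLON2
         PRES(JX,JY) = 0.5*(PARG(JX-1,JY) + PARG(JX,JY))
      ENDDO
      ENDDO
      RETURN
      END
C
C     Specialized variant: the same loop with the grid size and the
C     loop bounds known at compile time.
      SUBROUTINE AVGFIX (PARG, PRES)
      IMPLICIT NONE
      INTEGER NLON, NLAT
      PARAMETER (NLON = 8, NLAT = 4)
      REAL PARG(NLON,NLAT), PRES(NLON,NLAT)
      INTEGER JX, JY
      DO JY = 1, NLAT
      DO JX = 2, NLON
         PRES(JX,JY) = 0.5*(PARG(JX-1,JY) + PARG(JX,JY))
      ENDDO
      ENDDO
      RETURN
      END
```

Built as fixed-form source, the program prints the agreement message, or stops with a nonzero code if the two variants differ. The drawback of doing this by hand is the one noted above: the executable no longer accommodates arbitrary grid sizes, which is why it would be preferable for the compiler to be given this knowledge instead.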

Kind regards,

--
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.4/changes.html

