This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.



Re: gfortran: the good news and the good news


On Thu, Nov 5, 2009 at 19:45, Hans Horn <hannes@2horns.com> wrote:
> Folks,
>
> I've made the jump from gcc3/g77 to gcc4/gfortran on a large C/f77 quantum
> chemistry project and like to report great success. Not only does the code
> compile & link (after a few initial burps) but the resultant binaries are on
> average 10% faster while producing identical results (that is on my Intel
> core duo laptop under cygwin/WinXP).
>
> gcc (GCC) 4.3.4 20090804
> GNU Fortran (GCC) 4.5.0 20091014
>
> Congrats to y'all!!!
>
> These are the optimization flags I used for fortran (basically carried over
> from g77):
>
> -O3

> -fno-strict-aliasing

Fix the code to not rely on broken aliasing assumptions, and you can
remove this option and potentially go faster.
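
To illustrate the kind of fix I mean, here is a minimal sketch for the C side of a mixed C/f77 project. The function name is made up for the example and I'm assuming a 32-bit IEEE float; it is not taken from your code. Reading an object through a pointer of an unrelated type is the classic violation. Copying the bytes with memcpy is well-defined, and compilers usually turn it into a plain move.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 *
 * The broken variant would be
 *     return *(uint32_t *)&x;
 * which reads a float object through a uint32_t lvalue. That violates
 * the C aliasing rules and may only appear to work when compiled with
 * -fno-strict-aliasing.
 *
 * The version below copies the object representation instead.
 * Assumption: float is 32 bits wide. */
uint32_t float_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}
```

Once all such casts are gone, the option can be dropped and the result re-checked against the reference output.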

> -Winline

I'm not sure this does anything with gfortran. Generally, gfortran's
inlining is pretty limited at the moment; the hope is that the
maturing of the whole-file and LTO stuff will improve this area.

> -fexpensive-optimizations

Enabled at levels -O2, -O3, -Os.

No need to specify it explicitly.

> -finline-functions

Enabled at level -O3.

No need to specify it explicitly.

> -finline-limit=100000

This _might_ help, or it might not, because the larger code increases
instruction cache usage. Unless I have benchmarks that say otherwise
for a particular app with a particular compiler version, I wouldn't
use it. Also note the above remark about deficiencies in gfortran's
inlining functionality.

> -fstrength-reduce

I can't find this in the manual, actually. Perhaps it has been removed.

> -fgcse
> -fgcse-lm

Enabled at levels -O2, -O3, -Os.

No need to specify them explicitly.

> -fgcse-sm

Not enabled at any optimization level.

This suggests that, again, without specific benchmarks telling
otherwise, it might not be a good idea.

> -funroll-loops

This is commonly used for numerical code. OTOH, like excessive
inlining, it might also make the code larger for little gain.

> -fforce-addr

Not found in the manual; does it do anything?

> -fomit-frame-pointer

Enabled at levels -O, -O2, -O3, -Os.

No need to specify it explicitly.

> -malign-double

As already mentioned, should not be used since it breaks the ABI.

> Did I omit any important one?

Generally, you shouldn't be using any of these "exotic" non-default
options unless you have benchmarked them. They might, or might not,
help. And of course, the further you stray from the normal -O<n>
options, the greater the chance of hitting compiler bugs.
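
By "benchmarked" I mean something as simple as the sketch below: time the same kernel several times, take the best run, rebuild with one flag changed, and compare. Both best_time and sample_kernel are placeholder names for this example. clock() measures CPU time and is coarse, so the kernel should run long enough to dwarf its resolution.

```c
#include <time.h>

/* Placeholder workload; substitute the real hot routine.
 * The volatile sink keeps the loop from being optimized away. */
static volatile double sink;

void sample_kernel(void)
{
    double s = 0.0;
    for (int i = 1; i <= 1000000; ++i)
        s += 1.0 / i;
    sink = s;
}

/* Runs `kernel` `reps` times and returns the smallest CPU time in
 * seconds as reported by clock(). Returns a negative value if
 * reps <= 0, i.e. when nothing was measured. */
double best_time(void (*kernel)(void), int reps)
{
    double best = -1.0;
    for (int r = 0; r < reps; ++r) {
        clock_t t0 = clock();
        kernel();
        clock_t t1 = clock();
        double s = (double)(t1 - t0) / CLOCKS_PER_SEC;
        if (best < 0.0 || s < best)
            best = s;
    }
    return best;
}
```

Taking the minimum rather than the mean filters out most of the noise from other processes on the machine.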

You could try

-march=core2 -mfpmath=sse

to see if using SSE math and vectorization (enabled by default at -O3) helps.

-ffast-math

will often make numerical code substantially faster, but _unless_ you
understand what the options it enables do, and how they affect your
program, stay away.
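
One concrete example of how it can affect a program, as a sketch in C with names of my own choosing: compensated (Kahan) summation relies on the compiler evaluating the expressions exactly as written. -ffast-math lets the compiler treat floating-point arithmetic as associative. The correction term below is algebraically zero, so it may then be simplified away, which silently turns the routine back into naive summation.

```c
#include <stddef.h>

/* Plain left-to-right summation. */
double naive_sum(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += x[i];
    return sum;
}

/* Kahan summation: c carries the rounding error of the previous add.
 * Correct only if the compiler does not reassociate the arithmetic. */
double kahan_sum(const double *x, size_t n)
{
    double sum = 0.0, c = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double y = x[i] - c;
        double t = sum + y;
        c = (t - sum) - y;  /* zero in exact arithmetic, not in floating point */
        sum = t;
    }
    return sum;
}
```

If the results of your program change with -ffast-math, code of this kind is one of the first places to look.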

-mrecip

Same issues as with -ffast-math really.

If profiling shows a hot loop, check with

-ftree-vectorizer-verbose=N

that the loop is vectorized, and if not, see if there is a simple way to fix it.
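
As a rough guide to what to look for, here are two loops in C (the function names are made up for the example). The first has independent iterations and non-overlapping arrays, which is the typical case a vectorizer handles. The second has a loop-carried dependency and normally won't be vectorized as written. Whether a given compiler version actually vectorizes either one is something the report has to confirm.

```c
#include <stddef.h>

/* Independent iterations; `restrict` promises that x and y do not
 * overlap, which removes one common obstacle to vectorization. */
void axpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

/* Loop-carried dependency: iteration i needs the result of
 * iteration i - 1, so the iterations cannot simply run side by side. */
void prefix_sum(size_t n, double *y)
{
    for (size_t i = 1; i < n; ++i)
        y[i] += y[i - 1];
}
```

Fortran dummy arguments are already assumed not to alias, so the restrict part of this applies to the C side only.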

Finally, as you mentioned this is a quantum chemistry program, it's
likely heavily dependent on linear algebra. An optimized BLAS library
can make a huge difference here, if you for some strange reason aren't
using such a thing already.


-- 
Janne Blomqvist

