This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.
Re: gfortran: the good news and the good news
- From: Janne Blomqvist <blomqvist dot janne at gmail dot com>
- To: Hans Horn <hannes at 2horns dot com>
- Cc: fortran at gcc dot gnu dot org
- Date: Thu, 5 Nov 2009 23:17:25 +0200
- Subject: Re: gfortran: the good news and the good news
- References: <hcv2vo$djn$1@ger.gmane.org>
On Thu, Nov 5, 2009 at 19:45, Hans Horn <hannes@2horns.com> wrote:
> Folks,
>
> I've made the jump from gcc3/g77 to gcc4/gfortran on a large C/f77 quantum
> chemistry project and like to report great success. Not only does the code
> compile & link (after a few initial burps) but the resultant binaries are on
> average 10% faster while producing identical results (that is on my Intel
> core duo laptop under cygwin/WinXP).
>
> gcc (GCC) 4.3.4 20090804
> GNU Fortran (GCC) 4.5.0 20091014
>
> Congrats to y'all!!!
>
> These are the optimization flags I used for fortran (basically carried over
> from g77):
>
> -O3
> -fno-strict-aliasing
Fix the code to not rely on broken aliasing assumptions, and you can
remove this option and potentially go faster.
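A minimal sketch of the kind of fix meant here, assuming the offending code sits in the C part of the project and does type punning through pointer casts. The function name float_bits is made up for illustration, and float is assumed to be 32 bits wide. On the Fortran side the analogous problem is usually the same array being passed as two dummy arguments, one of which is modified.

```c
#include <stdint.h>
#include <string.h>

/* The pattern that needs -fno-strict-aliasing to be "safe" reads a float
   object through an lvalue of an unrelated type:

       uint32_t bits = *(uint32_t *)&x;    (undefined behaviour)

   Copying the object representation is the well-defined alternative.
   Assumption: float is 32 bits wide, as on the x86 targets in this thread. */
uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);  /* optimizing compilers usually turn this into a plain move */
    return bits;
}
```

With an IEEE 754 single-precision float, float_bits(1.0f) is 0x3f800000.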
> -Winline
I'm not sure this does anything with gfortran. Generally, gfortran's
inlining is pretty limited at the moment; the hope is that the
maturing of the whole-file and LTO stuff will improve this area.
> -fexpensive-optimizations
Enabled at levels -O2, -O3, -Os.
No need to specify it explicitly.
> -finline-functions
Enabled at level -O3.
No need to specify it explicitly.
> -finline-limit=100000
This _might_ help, or it might hurt, since the larger code puts more
pressure on the instruction cache. Unless I have benchmarks that say
otherwise for a particular app with a particular compiler version, I
wouldn't use it. Also note the above remark about deficiencies in
gfortran's inlining functionality.
> -fstrength-reduce
I can't find this in the manual, actually. Perhaps it has been removed.
> -fgcse
> -fgcse-lm
Both are enabled at levels -O2, -O3, -Os.
No need to specify them explicitly.
> -fgcse-sm
Not enabled at any optimization level, which suggests that, again,
without specific benchmarks showing otherwise, it might not be a good
idea.
> -funroll-loops
This is commonly used for numerical code; OTOH, like excessive inlining,
it might also make the code larger for little gain.
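To make the trade-off concrete, here is a hand-written sketch of roughly what unrolling by 4 does to a dot-product loop. The function names and the unroll factor are chosen for illustration only; the code the compiler actually generates will differ.

```c
#include <stddef.h>

/* Plain loop: one multiply-add and one loop test per element. */
double dot_plain(const double *x, const double *y, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i] * y[i];
    return s;
}

/* Unrolled by 4 by hand: one loop test per four elements, at the cost of
   a larger loop body plus a remainder loop. The additions are performed
   in the same order as in dot_plain. */
double dot_unrolled4(const double *x, const double *y, size_t n)
{
    double s = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += x[i]     * y[i];
        s += x[i + 1] * y[i + 1];
        s += x[i + 2] * y[i + 2];
        s += x[i + 3] * y[i + 3];
    }
    for (; i < n; i++)          /* remainder: at most 3 elements */
        s += x[i] * y[i];
    return s;
}
```

The unrolled version saves branches but is several times larger, which is why only a benchmark can tell whether it pays off for a given loop.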
> -fforce-addr
Not found in the manual; does it do anything?
> -fomit-frame-pointer
Enabled at levels -O, -O2, -O3, -Os.
No need to specify it explicitly.
> -malign-double
As already mentioned, should not be used since it breaks the ABI.
> Did I omit any important one?
Generally, you shouldn't be using any of these "exotic" non-default
options unless you have benchmarked them. They might, or might not,
help. And of course, the further you stray from the normal -O<N>
levels, the greater the chance of hitting compiler bugs.
You could try
-march=core2 -mfpmath=sse
to see if using SSE math and vectorization (enabled by default at -O3) helps.
-ffast-math
will often make numerical code substantially faster, but _unless_ you
understand what these options do, and how they affect your program,
stay away.
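One concrete effect, as a small sketch: among other things, -ffast-math lets the compiler reassociate floating-point operations, and floating-point addition is not associative. The two hypothetical helpers below compute the same mathematical sum in a different order; the volatile temporaries are only there to force the intermediate result to be rounded to double.

```c
/* (a + b) + c */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;   /* volatile: round the intermediate to double */
    return t + c;
}

/* a + (b + c) */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;   /* for |b| >> |c| the contribution of c is lost here */
    return a + t;
}
```

With a = 1e20, b = -1e20, c = 1.0, sum_left returns 1.0 and sum_right returns 0.0. Code that depends on a particular evaluation order (compensated summation, for instance) can therefore break once the compiler is allowed to reorder.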
-mrecip
Same issues as with -ffast-math really.
If profiling shows a hot loop, check with
-ftree-vectorizer-verbose=N
that the loop is vectorized, and if not, see if there is a simple way to fix it.
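As a rough illustration of what tends to separate the two cases, here is a sketch in C with made-up function names. Whether a given compiler version actually vectorizes the first loop is exactly what the vectorizer report has to confirm. Note that in Fortran, dummy array arguments are already assumed not to overlap, so nothing like restrict is needed there.

```c
#include <stddef.h>

/* Each iteration is independent, and restrict promises that x and y do
   not overlap: a typical candidate for the vectorizer. */
void axpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Iteration i needs the value written in iteration i - 1. This
   loop-carried dependence typically prevents straightforward
   vectorization. */
void prefix_sum(size_t n, double *x)
{
    for (size_t i = 1; i < n; i++)
        x[i] += x[i - 1];
}
```

If a hot loop looks like the second kind, a "simple fix" may not exist; if it looks like the first kind but is still not vectorized, the report usually names the obstacle.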
Finally, as you mentioned this is a quantum chemistry program, it's
likely heavily dependent on linear algebra. An optimized BLAS library
can make a huge difference here, if you for some strange reason aren't
using such a thing already.
--
Janne Blomqvist