This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
Re: Enabling sse2 support on pentium4 makes my fortran-program slower.
- From: Miguel Ramírez <mramirez at iua dot upf dot es>
- To: "Mikko Ranta" <ranta_mikko at hotmail dot com>,<gcc-help at gcc dot gnu dot org>
- Date: Tue, 15 Oct 2002 09:57:06 +0200
- Subject: Re: Enabling sse2 support on pentium4 makes my fortran-program slower.
- References: <F151IQMaYITbg5vQ1ig00011245@hotmail.com>
> Hello!
>
> I have a Fortran 77 code which calculates the three-body problem of
> celestial mechanics, so it has lots of floating-point arithmetic in it.
> I'm using MinGW on Windows and g77 from the newest version, 3.2. My
> computer is a 1.8 GHz Pentium 4. On your homepage you say that enabling
> SSE2 support with the options -msse2 (or -march=pentium4) and -mfpmath=sse
> should increase the speed of floating-point arithmetic considerably.
> Examining my computer with SiSoft Sandra's arithmetic benchmark reveals
> that using SSE2 should increase floating-point performance by over 200%.
> To the point: when I enable the options mentioned earlier, my program
> doesn't run any faster. In fact it runs even a little slower. Am I making
> some mistake? Or is it because not all SIMD features are implemented in
> GCC yet? The Pentium 4 doesn't really shine at floating-point arithmetic
> without SSE2 (in fact, according to SiSoft Sandra, it's slower than my
> older computer, a Duron 800!), so enabling full SSE2 is crucial for using
> this program, which can calculate orbits for weeks. Any help or
> suggestions would be highly appreciated.
>
Hi,
SSE2 optimizations don't come for free: if you want a real performance
improvement, you have to ensure that your program is written in a way that
makes it possible to take advantage of SIMD instructions:
+ Memory has to be aligned on 16-byte boundaries.
+ Your data structures must expose parallelism.
+ Keep conditionals inside inner loops to a minimum.
+ Avoid data dependences between vector elements.
+ etc.
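
As a rough illustration of the points above, here is a minimal sketch in C
(not taken from the original program, which is Fortran 77). The names x, y,
is_aligned16, scaled_add and running_sum are made up for this example, and
the aligned attribute is a GCC extension. The first loop has the shape that
suits SIMD: aligned arrays, no branches, independent iterations. The second
loop is a counter-example, because each iteration needs the result of the
previous one.

```c
#include <stddef.h>
#include <stdint.h>

#define N 1024

/* GCC-specific attribute: place each array on a 16-byte boundary,
   which is what aligned SSE2 packed loads and stores require. */
static double x[N] __attribute__((aligned(16)));
static double y[N] __attribute__((aligned(16)));

/* Hypothetical helper: returns 1 if p sits on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* SIMD-friendly shape: ys[i] += a * xs[i].
   Every iteration is independent of the others and the loop body has
   no conditionals. restrict promises that xs and ys do not overlap. */
void scaled_add(double a, const double *restrict xs,
                double *restrict ys, size_t n)
{
    for (size_t i = 0; i < n; i++)
        ys[i] += a * xs[i];
}

/* Counter-example: a running sum. Iteration i reads the value that
   iteration i - 1 just wrote, so neighbouring elements cannot simply
   be processed side by side in one vector register. */
void running_sum(double *ys, size_t n)
{
    for (size_t i = 1; i < n; i++)
        ys[i] += ys[i - 1];
}
```

Whether a particular compiler version actually emits packed SSE2
instructions for the first loop depends on the compiler and the flags used;
the sketch only shows the code shape that makes it possible.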
There are no free lunches out there!
Miguel.