This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.



Re: Enabling sse2 support on pentium4 makes my fortran-program slower.


> Hello!
>
> I have a Fortran 77 code that computes the three-body problem of celestial
> mechanics, so it does a lot of floating-point arithmetic. I'm using
> MinGW on Windows with g77 from the newest version, 3.2. My computer is a
> 1.8 GHz Pentium 4. Your homepage says that enabling SSE2 support with the
> options -msse2 (or -march=pentium4) and -mfpmath=sse should increase the
> speed of floating-point arithmetic considerably. Examining my computer with
> SiSoft Sandra's arithmetic benchmark suggests that using SSE2 should speed
> up floating-point arithmetic by over 200%. To the point: when I enable the
> options mentioned earlier, my program doesn't run any faster. In fact it
> runs even a little slower. Am I making a mistake? Or is it because not all
> SIMD features are implemented in GCC yet? The Pentium 4 doesn't really
> shine at floating-point arithmetic without SSE2 (in fact, according to
> SiSoft Sandra it's slower than my older computer, a Duron 800!), so
> enabling full SSE2 is crucial for using this program, which can calculate
> orbits for weeks. Any help or suggestions would be highly appreciated.
>

Hi,

SSE2 optimizations don't come for free: if you want a real performance
improvement, you have to ensure that your program is written in a way that
makes it possible to take advantage of SIMD instructions:

       + Memory has to be aligned on 16-byte boundaries.
       + Your data structures must expose parallelism.
       + Keep conditionals inside inner loops to a minimum.
       + Avoid data dependences between vector elements.
       + etc.
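
To make the points above concrete, here is a minimal sketch in C (not the
poster's Fortran 77 code). The function names is_aligned16, scale_add and
running_sum are hypothetical and chosen only for illustration. The first
loop has independent elements and no conditionals, so it is the kind of
loop that maps onto SIMD instructions. The second has a dependence from
one element to the next, so its iterations cannot be computed side by
side. Whether a particular compiler version actually emits packed SSE2
instructions for the first loop is a separate question that the sketch
does not settle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* SIMD-friendly: out[i] depends only on a[i] and b[i], and the loop body
 * has no branches. The assert documents the alignment assumption. */
void scale_add(double *out, const double *a, const double *b,
               double k, size_t n)
{
    assert(is_aligned16(out) && is_aligned16(a) && is_aligned16(b));
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + k * b[i];
}

/* Not SIMD-friendly as written: x[i] needs the updated x[i - 1], so each
 * iteration has to wait for the previous one. */
void running_sum(double *x, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        x[i] += x[i - 1];
}
```

In a C11 program the arrays passed to scale_add could be declared with
_Alignas(16) to satisfy the alignment check. An orbit integrator often
looks more like running_sum than scale_add, because each time step depends
on the previous one, which would be one plausible reason for seeing no
speedup.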

There are no free lunches out there!

Miguel.


