complex<double>::norm() -- huge slowdown from egcs-2.91.66 to 3.0.3

Jeroen Nijhof Jeroen.Nijhof@marconi.com
Wed Feb 20 09:16:00 GMT 2002


>Originator:   Jeroen Nijhof
>Organization: Marconi
>Confidential: no
>Synopsis:     complex<double>::norm() -- huge slowdown from egcs-2.91.66 to 3.0.3
>Severity:     non-critical
>Priority:     low
>Class:        pessimizes-code
>Release: 3.0.3
>Environment:
System: Linux sfup03.stratford 2.2.17-mosix #7 SMP Tue Oct 24 14:33:47 BST 2000 i686 unknown
Architecture: i686

host: i686-pc-linux-gnu (2 x Pentium III 667 MHz)
build: i686-pc-linux-gnu
target: i686-pc-linux-gnu
configured with: ../gcc-3.0.3/configure --prefix=/usr/local

>Description:

gcc-3.0.3 (or rather its libstdc++) generates code for complex<double>::norm
that might be more accurate, but that is a lot slower than 're * re + im * im'.

Context: in the split-step Fourier algorithm, I need to calculate
u[j] <- u[j] * exp(I * cst * |u[j]|^2) (in the "time domain", interleaved with FFTs and multiplications in the "frequency domain").
With gcc-3.0.3, my program is 40 % slower than with egcs-2.91.66. The example below is even twice as slow.

I do not know how much more accurate the 3.0.3 calculation is, but for my purposes it is
certainly not worth the 40 % extra running time -- I guess I could override the norm() definition
with a non-templatized one, but I'd rather have the fast definition in the library.
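
The workaround mentioned above could look roughly like the following. This is only a sketch: fast_norm is a hypothetical name, not something libstdc++ provides, and it simply computes re * re + im * im with no scaling, so it can overflow or underflow for extreme magnitudes.

```cpp
#include <complex>

// Hypothetical helper (not part of libstdc++): squared magnitude
// computed directly as re*re + im*im, with no scaling step.
inline double fast_norm(const std::complex<double>& z) {
  const double re = z.real();
  const double im = z.imag();
  return re * re + im * im;
}
```

In the inner loop of the example, 'double u2 = fast_norm(*p);' would then take the place of 'double u2 = norm(*p);'.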

The current implementation in __Norm_helper<true> boils down to:  (z = x + I * y)
s = max (abs(x), abs(y)); a = s * sqrt( (x/s)^2 + (y/s)^2); norm = a * a.
Would the increased accuracy disappear if the sqrt() is eliminated, by returning (s*s) * ( (x/s)^2 + (y/s)^2)?
The max() doesn't seem to get inlined with the default -finline-limit at -O3, by the way.
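
To make the two formulas concrete, here is a sketch of both written as free functions. The names norm_scaled_sqrt and norm_scaled_nosqrt are hypothetical, and the early return for s == 0 is an assumption of this sketch; this is a paraphrase of the formula above, not the actual libstdc++ source.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>

// Formula as described above: scale by s, take the sqrt, then square.
inline double norm_scaled_sqrt(const std::complex<double>& z) {
  const double x = z.real(), y = z.imag();
  const double s = std::max(std::fabs(x), std::fabs(y));
  if (s == 0.0) return 0.0;  // assumption of this sketch: avoid 0/0
  const double xs = x / s, ys = y / s;
  const double a = s * std::sqrt(xs * xs + ys * ys);
  return a * a;
}

// Proposed variant: same scaling, but the sqrt is eliminated.
inline double norm_scaled_nosqrt(const std::complex<double>& z) {
  const double x = z.real(), y = z.imag();
  const double s = std::max(std::fabs(x), std::fabs(y));
  if (s == 0.0) return 0.0;  // assumption of this sketch: avoid 0/0
  const double xs = x / s, ys = y / s;
  return (s * s) * (xs * xs + ys * ys);
}
```

Running both on inputs of interest and comparing against re * re + im * im would show whether the sqrt contributes anything to the accuracy.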


>How-To-Repeat:
example.cc is given below; g++ is the egcs-2.91.66 g++ from Red Hat 6.2.

g++  -O3 -mcpu=pentiumpro -DUSE_NORM -o old_norm example.cc
g++-3.0.3 -O3 -mcpu=pentiumpro -DUSE_NORM -o new_norm example.cc  -finline-limit=9999
g++  -O3 -mcpu=pentiumpro -o old_separate example.cc
g++-3.0.3 -O3 -mcpu=pentiumpro -o new_separate example.cc -finline-limit=9999

times:
//              elapsed user    system
// old_norm     1.10    1.10    0.01
// new_norm     2.20    2.19    0.01
// old_separate 1.10    1.10    0.00
// new_separate 1.13    1.13    0.00

// example.cc
#include <cmath>    // cos(), sin()
#include <complex>
typedef std::complex<double> Complex;

int main(int argc, char *argv[]) {

  Complex u[2048];
  for (int i = 0; i < 2048; ++i)
    u[i] = 1.0;

  for (int i = 0; i < 2000; ++i) {
    Complex * p = u;
    for (unsigned int j = 0; j < 2048; ++j) {  // j: avoid shadowing the outer i
#ifdef USE_NORM
      double u2 = norm(*p);
#else
      double ur = real(*p); double ui = imag(*p);
      double u2 = ur * ur + ui * ui;
#endif
      double t = u2 * 0.1;
      *p *= Complex(cos(t), sin(t));
// In my real program, I define _GNU_SOURCE and use glibc's sincos() instead of sin() and cos().
      ++p;
    }
  }
}
// end of example.cc



