This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: State of C99 _Complex?


Gabriel Dos Reis wrote:
I find that very surprising because std::complex<double> is more or
less a wrapper around __complex__ double, which is supposed to be
_Complex double.

That was why I spent a couple of hours experimenting with the code, trying to root out any outside influences that might be adding overhead.


Here's the C loop being timed. For this message, I've removed the initialization and timing code that come before and after this fragment. The stopwatch encloses only this loop, so initialization code isn't being timed:

for (int x = 0; x < IMAGE_SIZE; ++x)
{
    for (int y = 0; y < IMAGE_SIZE; ++y)
    {
        double complex c = (START_X+x*step) + (START_Y-y*step) * I;
        double complex z = 0.0;

        for (n = 0; n < MAX_ITER; ++n)
        {
            z = z * z + c;

            if (cabs(z) >= ESCAPE)
                break;
        }
    }
}

Yes, it's a Mandelbrot calculation -- a piece of example code from a book I'm writing (no, it's not a fractal book -- but fractals make pretty examples!)

I've stripped quite a bit of code from the original program, trying to find the essence.

Rewritten for C++, the loop looks like this:

for (int x = 0; x < IMAGE_SIZE; ++x)
{
    for (int y = 0; y < IMAGE_SIZE; ++y)
    {
        complex<double> c(START_X+x*step,START_Y-y*step);
        complex<double> z = 0.0;

        for (n = 0; n < MAX_ITER; ++n)
        {
            z = z * z + c;

            if (abs(z) >= ESCAPE)
                break;
        }
    }
}

Pretty similar, eh? Now, here are some fun numbers (just to show you that I don't just pick on GCC!):

Compiler   Language  Time
---------  --------  ----
Intel 7.1    C         8
Intel 7.1    C++      45 (!!)
gcc   3.3    C        26 (!)
g++   3.3    C++      15

Because the "complete" program generates a PNG, I can (and have) verified that all four executables produce the correct image. I've also commented out the absolute value call and the bit manipulation -- which makes no difference in relative run times.

Here are the command lines (note that other switches don't seem to make a significant difference).

gcc -lrt -lm -march=pentium4 -O3 -o man_gcc manbench.c
g++ -lrt -lm -march=pentium4 -O3 -o man_cpp_gcc manbench.cpp
icc -lrt -lm -i_dynamic -tpp7 -xW -O3 -o man_icc manbench.c
icc -lrt -lm -i_dynamic -tpp7 -xW -O3 -o man_cpp_icc manbench.cpp

I am rather baffled; the problem isn't cabs/abs, and I don't see how the C++ constructor implies different overhead from the calculated initialization in the C code. If anything, I'd expect the C++ to be SLOWER, not FASTER (at least in gcc's case).

If this isn't a "known" problem, I'll study the generated assembly code some more. Nothing leaped out at me in my first glance at the .s files...

Such a simple piece of code, and it's giving both icc and gcc some really weird indigestion!

..Scott

--
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Professional programming for science and engineering;
Interesting and unusual bits of very free code.

