This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug other/37367] New: gcc-4.4 speed regression


Using a small piece of code from a digital filter, I was trying to benchmark
several looping constructs. On x86_64, the following code runs about 5% faster
when compiled with g++-4.3 than with g++-4.4:

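#include <cmath>

float __attribute__ ((noinline)) bench_5(float * out_sample, int n)
{
    float b1 = std::cos(0.01);  // filter coefficient
    float y1 = 0;
    float y2 = 1;
    float y0;                   // last computed sample, returned below

    // n must be positive: the body runs once before n is tested
    do
    {
        y0 = b1 * y1 - y2;
        *out_sample++ = y0;
        --n;
    }
    while (__builtin_expect(n!=0, 1));

    return y0;  // a non-void function must return a value
}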

tim@thinkpad:~$ g++-4.3 -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.3.2-0ubuntu3'
--with-bugurl=file:///usr/share/doc/gcc-4.3/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.3
--program-suffix=-4.3 --enable-clocale=gnu --enable-libstdcxx-debug
--enable-objc-gc --enable-mpfr --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.3.2 (Ubuntu 4.3.2-0ubuntu3) 

tim@thinkpad:~$ g++-4.4 -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../gcc-4.4-20080815/configure
--enable-languages=c,c++,fortran,objc,obj-c++ --enable-shared
--with-system-zlib --enable-mpfr --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
--without-included-gettext --enable-threads=posix --enable-nls
--with-gxx-include-dir=/usr/local/include/c++/4.4 --program-suffix=-4.4
--enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc
Thread model: posix
gcc version 4.4.0 20080815 (experimental) (GCC) 

The command line used to compile the code was:
g++ benchmarks/loop_benchmark.cpp -O3 -lrt -march=core2
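
For readers without access to the linked file, the following is a minimal
sketch of how such a loop could be timed. It is not the harness from
loop_benchmark.cpp: the helper name time_bench, the buffer size and the
best-of-N strategy are assumptions made for illustration, and the use of
clock_gettime is only a guess based on the -lrt flag above. bench_5 is
repeated from the report, with a return value added so that the function is
well-formed.

```cpp
#include <cassert>
#include <cmath>
#include <time.h>
#include <vector>

// Loop from the report; y0 is hoisted so the last sample can be returned.
float __attribute__ ((noinline)) bench_5(float * out_sample, int n)
{
    assert(n > 0);  // the do/while body runs once before n is tested
    float b1 = std::cos(0.01);
    float y1 = 0;
    float y2 = 1;
    float y0;

    do
    {
        y0 = b1 * y1 - y2;
        *out_sample++ = y0;
        --n;
    }
    while (__builtin_expect(n != 0, 1));

    return y0;
}

// Hypothetical helper: current monotonic time in seconds (POSIX clock_gettime).
static double now_seconds()
{
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

// Hypothetical helper: shortest wall time, in seconds, over `runs` calls of
// fn on a buffer of n samples. Taking the minimum reduces scheduling noise.
double time_bench(float (*fn)(float *, int), int n, int runs)
{
    std::vector<float> buffer(n);
    double best = -1.0;

    for (int i = 0; i < runs; ++i)
    {
        double start = now_seconds();
        fn(&buffer[0], n);
        double elapsed = now_seconds() - start;
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Calling time_bench(bench_5, 1 << 20, 100) from a binary built by each compiler
with the flags above would give two numbers to compare. A difference of a few
percent is close to typical measurement noise, so many runs on an otherwise
idle machine are advisable.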

The only difference in the machine code is the order of the subl and leal
instructions:
*** 340,347 ****
        addq    $16, %rax
        cmpl    %r8d, %edx
        jb      .L61
-       subl    %r9d, %esi
        leal    0(,%r9,4), %eax
        mov     %eax, %eax
        addq    %rax, %rcx
        cmpl    %r9d, %r10d
--- 340,347 ----
        addq    $16, %rax
        cmpl    %r8d, %edx
        jb      .L61
        leal    0(,%r9,4), %eax
+       subl    %r9d, %esi
        mov     %eax, %eax
        addq    %rax, %rcx
        cmpl    %r9d, %r10d
***************

Since I read that gcc-4.4 is supposed to focus on code optimization, I
thought it might be worth reporting.
The complete code can be found at http://tinyurl.com/5socts


-- 
           Summary: gcc-4.4 speed regression
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: other
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: tim at klingt dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37367

