This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



PR 15492: floating-point arguments are loaded too early to x87 stack


Hello!

I would like to bring this PR ( http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15492 ) to the attention of gcc developers. The problem described in this PR has a big impact on FP calculations, because the fp-stack is wasted on register copies and a lot of unnecessary fxch instructions are generated.

A simple testcase:
double test (double a, double b) {
        return a*a + b*b;
}

Current (Aug. 19) mainline CVS gcc generates:

with "gcc -O2 -fomit-frame-pointer":
test:
       fldl    4(%esp)
       fldl    12(%esp)
       fxch    %st(1)
       fmul    %st(0), %st
       fxch    %st(1)
       fmul    %st(0), %st
       faddp   %st, %st(1)
       ret

and without optimization, "gcc -fomit-frame-pointer":
test:
       fldl    4(%esp)
       fmull   4(%esp)
       fldl    12(%esp)
       fmull   12(%esp)
       faddp   %st, %st(1)
       ret

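According to "How to optimize for the Pentium family of microprocessors" by Agner Fog, "fld r/m32/m64" takes one clock cycle in all its forms on P1, PMMX, PPRO, P2, P3 and P4. As the listings above show, gcc actually de-optimizes this code with "-O2": it emits seven instructions before the ret, two of them fxch, where the unoptimized code needs five instructions and no fxch at all.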
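
To make the comparison concrete, here is a minimal sketch that replays both listings on a toy model of the x87 register stack. It is only an illustration: the X87 struct, the helper functions and the names run_O2 and run_O0 are hypothetical and not part of gcc, and the model counts instructions, not clock cycles.

```c
#include <assert.h>

/* Toy model of the x87 register stack; st[0] stands for %st(0). */
typedef struct {
    double st[8];   /* the eight x87 stack registers */
    int depth;      /* number of live entries */
    int insns;      /* instructions replayed so far (ret is not counted) */
    int swaps;      /* how many of those were fxch */
} X87;

/* fldl mem: push a value loaded from memory */
static void fld(X87 *f, double v) {
    assert(f->depth < 8);
    for (int i = f->depth; i > 0; i--)
        f->st[i] = f->st[i - 1];
    f->st[0] = v;
    f->depth++;
    f->insns++;
}

/* fxch %st(i): swap %st(0) with %st(i) */
static void fxch(X87 *f, int i) {
    assert(i < f->depth);
    double t = f->st[0];
    f->st[0] = f->st[i];
    f->st[i] = t;
    f->insns++;
    f->swaps++;
}

/* fmul %st(0), %st: square the top of the stack */
static void fmul_self(X87 *f) {
    assert(f->depth >= 1);
    f->st[0] *= f->st[0];
    f->insns++;
}

/* fmull mem: multiply the top of the stack by a memory operand */
static void fmul_mem(X87 *f, double v) {
    assert(f->depth >= 1);
    f->st[0] *= v;
    f->insns++;
}

/* faddp %st, %st(1): add %st(0) into %st(1), then pop */
static void faddp(X87 *f) {
    assert(f->depth >= 2);
    f->st[1] += f->st[0];
    for (int i = 0; i + 1 < f->depth; i++)
        f->st[i] = f->st[i + 1];
    f->depth--;
    f->insns++;
}

/* Replay of the "-O2 -fomit-frame-pointer" listing. */
X87 run_O2(double a, double b) {
    X87 f = {0};
    fld(&f, a);        /* fldl 4(%esp)  */
    fld(&f, b);        /* fldl 12(%esp) */
    fxch(&f, 1);
    fmul_self(&f);
    fxch(&f, 1);
    fmul_self(&f);
    faddp(&f);
    return f;
}

/* Replay of the unoptimized "-fomit-frame-pointer" listing. */
X87 run_O0(double a, double b) {
    X87 f = {0};
    fld(&f, a);        /* fldl 4(%esp)   */
    fmul_mem(&f, a);   /* fmull 4(%esp)  */
    fld(&f, b);        /* fldl 12(%esp)  */
    fmul_mem(&f, b);   /* fmull 12(%esp) */
    faddp(&f);
    return f;
}
```

Calling run_O2(3.0, 4.0) and run_O0(3.0, 4.0) from a small main() leaves the same value in st[0] for both, while the insns and swaps fields show how many more instructions the "-O2" sequence spends getting there.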

In PR 15492, a couple of other examples are shown.

The following shows how serious the problem can be:
gcc -ffast-math -S -O2 almabench.c
grep fxch almabench.s | wc -l
   114

gcc -ffast-math -S almabench.c
grep fxch almabench.s | wc -l
     5

I believe that this problem also affects PR 13712: "Executable runs 25% slower than when compiled with INTEL compiler" ( http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13712 ).

I have tried to look into this problem and found that -fno-schedule-insns produces slightly better code (though nowhere near as good as the code generated without -O), but it looks like the problem is inside the RTL generator.

Could somebody with more knowledge of gcc help to solve this problem?

Uros.

