This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



[BENCHMARK] 3.4.4-pre Povray official benchmark scores for 387 and SSE insn sets


Giovanni Bajo wrote:

>> Following are the scores for a Povray official benchmark on P4-3.2GHz,
>> 800MHz FSB.
>> gcc version 4.0.0 20050124 (experimental)
>
> Is it possible to have a comparison with 3.4.3?


Here are the results for gcc version 3.4.4 20050125 (prerelease):

-mfpmath=sse
Time For Parse:    0 hours  0 minutes   1.0 seconds (1 seconds)
Time For Photon:   0 hours  0 minutes  38.0 seconds (38 seconds)
Time For Trace:    0 hours 27 minutes  16.0 seconds (1636 seconds)
   Total Time:    0 hours 27 minutes  55.0 seconds (1675 seconds)

-mfpmath=387
Time For Parse:    0 hours  0 minutes   2.0 seconds (2 seconds)
Time For Photon:   0 hours  0 minutes  40.0 seconds (40 seconds)
Time For Trace:    0 hours 29 minutes  40.0 seconds (1780 seconds)
   Total Time:    0 hours 30 minutes  22.0 seconds (1822 seconds)
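
The relative difference between the two runs can be worked out from the totals above. A minimal sketch of that arithmetic, assuming only the second counts printed by the benchmark; `percent_saved` and `print_savings` are illustrative names, not part of POV-Ray or GCC:

```c
#include <stdio.h>

/* Percentage of run time saved by the faster run relative to the slower one. */
double percent_saved(double slow_seconds, double fast_seconds)
{
    return (slow_seconds - fast_seconds) / slow_seconds * 100.0;
}

/* Print per-phase and total savings for the 3.4.4 timings quoted above. */
void print_savings(void)
{
    static const char *phase[] = { "Parse", "Photon", "Trace", "Total" };
    static const double sse[]  = { 1.0, 38.0, 1636.0, 1675.0 }; /* -mfpmath=sse */
    static const double x87[]  = { 2.0, 40.0, 1780.0, 1822.0 }; /* -mfpmath=387 */

    for (int i = 0; i < 4; i++)
        printf("%-6s  sse saves %.1f%% over 387\n",
               phase[i], percent_saved(x87[i], sse[i]));
}
```

Calling `print_savings()` from a small `main` shows a total saving of roughly 8% for -mfpmath=sse. The parse figure is not meaningful, since the timer only has one-second resolution there.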

However, it should be noted that in the -mfpmath=sse case we are comparing apples to oranges. In 3.4.4, the math builtins (sin, cos, atan, log, exp) are _enabled_, but in 4.0 these builtins were disabled for -mfpmath=sse, because it was shown that on *x86_64* the optimized SSE math libraries are faster and more accurate, and that the builtins interfere with SSE math in an unwanted way (values have to be shuffled between x87 and SSE registers). We are talking about a significant number of math function uses here, as a rough count of matches in the 3.4.4 assembly output shows:

grep sin povray_asm_34.sse | wc -l
    98
grep cos povray_asm_34.sse | wc -l
    76
grep fscale povray_asm_34.sse | wc -l
    65
grep fpatan povray_asm_34.sse | wc -l
    38
... etc ...

So in this case, 3.4.4 took the best from both "worlds": SSE for ordinary FP arithmetic and inline x87 instructions for the math functions.

Unfortunately, on 32 bits we are stuck with an old ABI, where FP parameters have to be dragged to and from memory (instead of being passed in SSE regs) and where the return value comes back in an x87 reg. So on 32 bits, every function call in fact _forces_ the register shuffling that we are trying to avoid by disabling the x87 math builtins.
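
A minimal sketch of the situation described above, assuming a 32-bit x86 target with the usual i386 calling convention. `half` and `sum_halves` are hypothetical names used only for illustration; they stand in for any out-of-line function that takes and returns a double:

```c
/* Kept out of line so that the call (and its calling convention)
   survives optimization. The name is hypothetical. */
__attribute__((noinline)) double half(double x)
{
    return x * 0.5;
}

/* Each iteration makes one call. With -mfpmath=sse the accumulator
   would normally live in an SSE register, but on 32 bits the argument
   is passed on the stack and the result comes back in x87 st(0), so
   the value has to travel through memory on every call. */
double sum_halves(const double *v, int n)
{
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc += half(v[i]);
    return acc;
}
```

Compiling this with something like `gcc -m32 -O2 -msse2 -mfpmath=sse -S` and reading the assembly should show the store before the call and an x87 store/SSE load pair after it. The exact instructions depend on the compiler version.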

Uros.


