This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: [BENCHMARK] 3.4.4-pre Povray official benchmark scores for 387 and SSE insn sets


On Tue, 25 Jan 2005 16:05:14 +0100, Uros Bizjak <uros@kss-loka.si> wrote:
> Giovanni Bajo wrote:
> 
> >>Following are the scores for a Povray official benchmark on P4-3.2GHz,
> >>800MHz FSB.
> >>gcc version 4.0.0 20050124 (experimental)
> >>
> >>
> >
> >Is it possible to have a comparison with 3.4.3?
> >
> >
> Here are the results for gcc version 3.4.4 20050125 (prerelease):
> 
> -mfpmath=sse
> Time For Parse:    0 hours  0 minutes   1.0 seconds (1 seconds)
> Time For Photon:   0 hours  0 minutes  38.0 seconds (38 seconds)
> Time For Trace:    0 hours 27 minutes  16.0 seconds (1636 seconds)
>     Total Time:    0 hours 27 minutes  55.0 seconds (1675 seconds)
> 
> -mfpmath=387
> Time For Parse:    0 hours  0 minutes   2.0 seconds (2 seconds)
> Time For Photon:   0 hours  0 minutes  40.0 seconds (40 seconds)
> Time For Trace:    0 hours 29 minutes  40.0 seconds (1780 seconds)
>     Total Time:    0 hours 30 minutes  22.0 seconds (1822 seconds)
> 
> However, it should be noted that in the -mfpmath=sse case we are comparing
> apples to oranges. In 3.4.4, the math builtins (sin, cos, atan, log, exp)
> are _enabled_, but in 4.0 these builtins were disabled for
> -mfpmath=sse, because it was shown that on *x86_64* the optimized SSE math
> libraries are faster and more accurate, and that these builtins interfere
> with SSE math in an unwanted way (x87 <-> SSE register shuffling). We
> are talking about a significant number of math function uses here:
> 
> grep sin povray_asm_34.sse | wc -l
>      98
> grep cos povray_asm_34.sse | wc -l
>      76
> grep fscale povray_asm_34.sse | wc -l
>      65
> grep fpatan povray_asm_34.sse | wc -l
>      38
> ... etc ...
> 
> So in this case, gcc 3.4 got the best of both worlds: SSE arithmetic
> plus inline x87 math builtins.
> 
> Unfortunately, on 32-bit targets we are stuck with an old ABI, where FP
> parameters have to be dragged to and from memory (instead of being
> passed in SSE regs) and where the return value comes back in an x87 reg.
> So on 32-bit targets, every function call in fact _forces_ the register
> shuffling that we are trying to avoid by disabling the x87 math builtins.
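
The 32-bit calling-convention point above can be seen with a small test case. The following is only a sketch: the function name sketch_dot2, the file name in the comment, and the exact flag combination are made up for illustration and are not taken from the Povray sources. The noinline attribute is there so that a real call, with its argument and return-value traffic, survives optimization.

```c
/*
 * Hypothetical test case for inspecting the 32-bit x86 calling convention.
 * One possible way to look at the generated code (file name is made up):
 *
 *   gcc -m32 -O2 -msse2 -mfpmath=sse -S abi_sketch.c
 *
 * In the usual 32-bit convention the two double arguments are passed on
 * the stack and the double result is returned in the x87 register st(0),
 * even though the arithmetic itself is done in SSE registers.
 */

/* noinline keeps the call (and its stack/x87 traffic) visible in the asm */
__attribute__((noinline))
double sketch_dot2(double a, double b)
{
    /* with -mfpmath=sse these multiplies and the add use SSE2 scalar ops */
    return a * a + b * b;
}

/* A caller: the result arrives in st(0) and has to be stored to memory
   and reloaded into an SSE register before it can be used further. */
double sketch_caller(double x)
{
    return sketch_dot2(x, x + 1.0) * 0.5;
}
```

Compiling the same file without -m32 on an x86_64 machine should show the arguments and the result staying in SSE registers, which is the contrast the quoted paragraph is drawing.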

Couldn't we (in theory) ship SSE implementations of these functions for
-mfpmath=sse, either expanded inline or bundled with libgcc?  I believe the
Intel compiler does this, or something similar.
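
As a rough illustration of what an SSE-side routine might look like (a sketch only, not what GCC, libgcc or the Intel compiler actually ship), here is a sine approximation written with SSE2 scalar intrinsics. The name sketch_sin_small is hypothetical, it does no range reduction, and it is only meant for |x| <= pi/4, where a truncated Taylor polynomial is already accurate to roughly 1e-11. A real library routine would need range reduction, tuned coefficients and special-case handling.

```c
#include <assert.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 scalar double intrinsics */
#endif

/* Hypothetical helper: sin(x) for |x| <= pi/4 only, no range reduction. */
double sketch_sin_small(double x)
{
    /* Taylor coefficients of sin(x)/x - 1 in powers of x^2 */
    static const double c[5] = {
        -1.0 / 6.0, 1.0 / 120.0, -1.0 / 5040.0,
        1.0 / 362880.0, -1.0 / 39916800.0
    };
    int i;

    assert(x >= -0.7853981633974483 && x <= 0.7853981633974483);

#if defined(__SSE2__)
    {
        __m128d vx = _mm_set_sd(x);
        __m128d x2 = _mm_mul_sd(vx, vx);
        __m128d p  = _mm_set_sd(c[4]);

        /* Horner evaluation in x^2, entirely in SSE registers */
        for (i = 3; i >= 0; --i)
            p = _mm_add_sd(_mm_mul_sd(p, x2), _mm_set_sd(c[i]));

        /* sin(x) ~= x * (1 + x^2 * p(x^2)) */
        p = _mm_add_sd(_mm_mul_sd(p, x2), _mm_set_sd(1.0));
        return _mm_cvtsd_f64(_mm_mul_sd(p, vx));
    }
#else
    {
        /* portable fallback with the same polynomial */
        double x2 = x * x, p = c[4];
        for (i = 3; i >= 0; --i)
            p = p * x2 + c[i];
        return x * (1.0 + x2 * p);
    }
#endif
}
```

Note that an out-of-line routine like this would still return its result in st(0) on 32-bit targets, so only the inline variant would avoid the x87/SSE shuffling entirely.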

Richard.

