This is the mail archive of the
mailing list for the GCC project.
Re: [BENCHMARK] 3.4.4-pre Povray official benchmark scores for 387 and SSE insn sets
- From: Richard Guenther <richard dot guenther at gmail dot com>
- To: Uros Bizjak <uros at kss-loka dot si>
- Cc: Giovanni Bajo <rasky at develer dot com>, gcc at gcc dot gnu dot org
- Date: Tue, 25 Jan 2005 16:12:31 +0100
- Subject: Re: [BENCHMARK] 3.4.4-pre Povray official benchmark scores for 387 and SSE insn sets
- References: <41F5260D.email@example.com> <015b01c502d5$bdd8f890$bf03030a@trilan> <41F6602A.firstname.lastname@example.org>
- Reply-to: Richard Guenther <richard dot guenther at gmail dot com>
On Tue, 25 Jan 2005 16:05:14 +0100, Uros Bizjak <email@example.com> wrote:
> Giovanni Bajo wrote:
> >>Following are the scores for a Povray official benchmark on P4-3.2GHz,
> >>800MHz FSB.
> >>gcc version 4.0.0 20050124 (experimental)
> >Is it possible to have a comparison with 3.4.3?
> Here are the results for gcc version 3.4.4 20050125 (prerelease):
> Time For Parse: 0 hours 0 minutes 1.0 seconds (1 seconds)
> Time For Photon: 0 hours 0 minutes 38.0 seconds (38 seconds)
> Time For Trace: 0 hours 27 minutes 16.0 seconds (1636 seconds)
> Total Time: 0 hours 27 minutes 55.0 seconds (1675 seconds)
> Time For Parse: 0 hours 0 minutes 2.0 seconds (2 seconds)
> Time For Photon: 0 hours 0 minutes 40.0 seconds (40 seconds)
> Time For Trace: 0 hours 29 minutes 40.0 seconds (1780 seconds)
> Total Time: 0 hours 30 minutes 22.0 seconds (1822 seconds)
> However, it should be noted that in the -mfpmath=sse case, we are comparing
> apples to oranges. In 3.4.4, the math builtins (sin, cos, atan, log, exp)
> are _enabled_, but in 4.0 these builtins were disabled for
> -mfpmath=sse, because it was shown that on *x86_64* optimized SSE math
> libraries are faster and more accurate, and that these builtins interfere
> with SSE math in an unwanted way (x87 <-> SSE register shuffling). We
> are talking about a significant portion of the math functions here:
> grep sin povray_asm_34.sse | wc -l
> grep cos povray_asm_34.sse | wc -l
> grep fscale povray_asm_34.sse | wc -l
> grep fpatan povray_asm_34.sse | wc -l
> ... etc ...
> So in this case, gcc_34 took the best from both "worlds".
> Unfortunately, on 32-bit targets we are stuck with an old ABI, where FP
> parameters have to be dragged to and from memory (instead of being
> passed in SSE regs) and where the return value is returned in an x87 reg.
> So on 32-bit targets, every function call in fact _forces_ the register
> shuffling that we are trying to avoid by disabling the x87 math builtins.
Couldn't we (in theory) ship SSE implementations of these for -mfpmath=sse,
either inlined or bundled with libgcc? I believe the Intel compiler does this,
or something similar.