This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


RE: SSE vs. x87 povray deathmatch [was: Re: [RFC PATCH, x86_64] Use -mno-sse[,2] to fall back to x87 FP ...]

From: "Menezes, Evandro" <evandro dot menezes at amd dot com>
To: "Uros Bizjak" <ubizjak at gmail dot com>
Cc: "Roger Sayle" <roger at eyesopen dot com>, "Michael Matz" <matz at suse dot de>, "Jan Hubicka" <hubicka at ucw dot cz>, "GCC Patches" <gcc-patches at gcc dot gnu dot org>, "Richard Guenther" <rguenther at suse dot de>
Date: Tue, 10 Oct 2006 16:58:01 -0500
Subject: RE: SSE vs. x87 povray deathmatch [was: Re: [RFC PATCH, x86_64] Use -mno-sse[,2] to fall back to x87 FP ...]

Uros, 

> I have re-run official povray-3.6.1 benchmark on
> 
> vendor_id       : AuthenticAMD
> cpu family      : 15
> model           : 47
> model name      : AMD Athlon(tm) 64 Processor 3000+
> stepping        : 2
> cpu MHz         : 1809.276
> cache size      : 512 KB
> 
> On Fedora Core 4 (2.6.11-1.1369_FC4 #1 Thu Jun 2 22:56:33 EDT 2005
> x86_64 x86_64 x86_64 GNU/Linux) using out of the box glibc:
> 
> GNU C Library development release version 2.3.5, by Roland McGrath et al.
> ...
> Compiled by GNU CC version 4.0.0 20050525 (Red Hat 4.0.0-9).
> Compiled on a Linux 2.4.20 system on 2005-05-30.

As I said, stock GLIBC doesn't have fast math routines for x86-64.  SUSE and others ship them, but neither the FSF release nor RH's does.

I'll run the SPEC CPU2006 Povray, which I have handy, on SUSE 10.0 and post the results later.  Then I'll run 3.6.1 as well.

> I have speculated that the slowdown was due to costly SSE->mem->x87
> moves. These moves should be avoided as much as possible, and this fact
> was already proved some time ago (this is actually the reason why x87
> intrinsics are disabled for SSE math). To prove this speculation, -msse3
> was added to compile flags to enable generation of fisttp instruction.

Could -msse3 have removed the changes to the rounding mode as well?
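
To make the rounding-mode question concrete, here is a minimal sketch. A C cast from double to int must truncate toward zero, while x87 fistp rounds according to the control word. Without SSE3, x87 code for such a cast is therefore typically bracketed by fnstcw/fldcw to switch to truncation and back, whereas fisttp truncates directly. The name to_int is hypothetical and not taken from povray or GCC, and the exact instruction sequence depends on the compiler version and flags.

```c
/* Hypothetical helper, for illustration only (not from povray or GCC).
 * A plain double-to-int cast: C requires truncation toward zero,
 * whatever the current floating-point rounding mode is. */
int to_int(double x)
{
    return (int)x;
}
```

Comparing the assembly from something like "gcc -O2 -m32 -mfpmath=387 -S" with and without -msse3 should show whether the control-word loads around the store disappear; treat that as an experiment to run, not as a measured result.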

> So, at this point x87 code of a real world application (which is BTW the
> part of a SPEC suite) beats x86_64 SSE, despite the fact that SSE has
> two times as many non-stacked FP registers and implements register passing
> convention (thus avoiding memory moves).

That's not the correct conclusion.  As I said, you haven't isolated x87 microcode vs. GLIBC math functions...
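
One way to separate the two effects would be to time the math routines on their own, outside povray. The following is only a sketch under assumptions: time_calls and identity are hypothetical names, the argument range is arbitrary, and clock() is a coarse timer.

```c
#include <time.h>

/* Hypothetical harness: time n calls of f in isolation and
 * return the elapsed CPU time in seconds. */
double time_calls(double (*f)(double), long n)
{
    volatile double sink = 0.0;   /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (long i = 0; i < n; i++)
        sink += f((double)i * 1e-3);
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Trivial baseline, used to estimate loop and call overhead. */
double identity(double x)
{
    return x;
}
```

Passing sin or another function from <math.h> (linked with -lm) and subtracting the identity baseline would give a rough per-call cost for that routine under each build configuration, independent of the rest of the application.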

Thanks,

-- 
_______________________________________________________
Evandro Menezes               AMD            Austin, TX

References:
- SSE vs. x87 povray deathmatch [was: Re: [RFC PATCH, x86_64] Use -mno-sse[,2] to fall back to x87 FP ...]
  - From: Uros Bizjak
