This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



RE: GCC and Floating-Point


Uros, 

> > Actually, in many cases, SSE did help x86 performance as well.  That
> > happens in FP-intensive applications which spend a lot of time in
> > loops when the XMM register set can be used more efficiently than the
> > x87 stack.
> 
>   This code could be a perfect example how XMM register file beats x87
> reg stack.
> However, contrary to all expectations, x87 code is 20% faster(!!) /on
> p4, but it would be interesting to see this comparison on x86_64, or
> perhaps on 32bit AMD/.
> The code structure, produced with -mfpmath=sse, is the same as the code
> structure produced with -mfpmath=x87, so IMO there is no register
> allocator effects in play.

I'll look into it and share what I see.
 
>   I was trying to look into this problem, but on first sight, 
> code seems optimal to me...

FWIW, here's some old data I got almost 2 years ago (run times in seconds, where lower is better, and geometric means of the SPEC ratios computed against SPEC's reference times, where higher is better):

CPU2000        A (SSE)   B (x87)
164.gzip       205s      203s
175.vpr        185s      188s
176.gcc        117s      116s
181.mcf        313s      314s
186.crafty     112s      112s
197.parser     268s      268s
252.eon        147s      167s
253.perlbmk    175s      180s
254.gap        148s      148s
255.vortex     178s      178s
256.bzip2      211s      202s
300.twolf      313s      328s
Int Geomean    812       801
177.mesa       173s      187s
179.art        346s      690s
183.equake     163s      162s
188.ammp       325s      336s
FP Geomean     757       620

Both sets of runs used GCC 3.3.3 from the 3_3-hammer branch.  The options for the runs in column B were "-m32 -O3 -march=k8 -ffast-math -fomit-frame-pointer -malign-double +FDO" (FDO being feedback-directed optimization); for column A, the same ones plus "-mfpmath=sse".  The system was a 1.4GHz Athlon 64 with PC2100 RAM.
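
For readers who want to check how the geomean rows relate to the run times, here is a minimal Python sketch that recomputes the FP Geomean row. It assumes the usual SPEC ratio of 100 * reference time / run time. The reference times are not part of the data above: they are the SPEC CPU2000 values quoted from memory, so treat them as an assumption and verify them against SPEC's published numbers. The function names are made up for the illustration.

```python
from math import prod

# ASSUMPTION: SPEC CPU2000 reference times in seconds, quoted from memory.
# They do not appear in the message; verify against SPEC's published values.
REFERENCE_SECONDS = {
    "177.mesa": 1400,
    "179.art": 2600,
    "183.equake": 1300,
    "188.ammp": 2200,
}

# Run times in seconds, copied from the FP rows of the table above.
RUN_SECONDS = {
    "A": {"177.mesa": 173, "179.art": 346, "183.equake": 163, "188.ammp": 325},
    "B": {"177.mesa": 187, "179.art": 690, "183.equake": 162, "188.ammp": 336},
}


def spec_ratio(benchmark, seconds):
    """Return 100 * reference time / measured time (higher is better)."""
    return 100.0 * REFERENCE_SECONDS[benchmark] / seconds


def geomean(values):
    """Geometric mean: the n-th root of the product of n values."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


def fp_geomean(column):
    """Geometric mean of the SPEC ratios for one column of the table."""
    runs = RUN_SECONDS[column]
    return geomean(spec_ratio(name, secs) for name, secs in runs.items())


if __name__ == "__main__":
    for column in ("A", "B"):
        print(column, round(fp_geomean(column), 1))
```

With these assumed reference times the sketch gives roughly 757 for column A and 620 for column B, in line with the FP Geomean row.  Most of the gap comes from 179.art, whose run time doubles without -mfpmath=sse.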

Because things were so much better with SSE, I haven't run with x87 lately...

-- 
Evandro

