This is the mail archive of the mailing list for the GCC project.


Re: [patch] tuning gcc for Intel Core2

> On Wed, Nov 15, 2006 at 09:56:24PM +0100, Jan Hubicka wrote:
> > Actually, on Opteron the gcc benchmark also has problems with library
> > calls to string functions (but other benchmarks improve), though those
> > are not as high as in your scores.  I am not quite sure whether this is
> > not an anomaly. SPECint is not really a good benchmark for string
> > functions, as the only benchmarks that depend on them are GCC and
> > VORTEX. In SPECfp, mesa/apsi.  By effectively making the compiler
> > optimize out any library calls you throw away any possibility of an
> > optimized library version handling large blocks nicely, so programs
> > that do use them (such as X) slow down.
> Agreed. That is the reason why we excluded it in the first place.
> > 
> > Do you have any understanding of what really happens to GCC to make it
> > slow down so much?  (Sorry, I didn't have a chance to read the full
> > thread, but will do so tomorrow, so I am sorry if this was already
> > explained.)
> > 
> The x86-64 memory functions in glibc aren't very good. AMD contributed
> a set of x86-64 memory functions. I have been working on an Intel
> version. It is a long process :-(.

Ah, I forgot that there are still the two variants. The SUSE distro uses
the AMD version (in a somewhat older form than what was contributed); that
is also the reason why my benchmarks are not that strict.  We used to see
the same regression on GCC that you measure back in the early x86-64 days,
so it is probably the fault of the current glibc implementation.

It would be nice to have something committed to glibc that solves the
current situation even if it is not completely perfect (ideally glibc
should probably be able to query cache sizes for large blocks and do
similar tricks, which might need time to push into the mainline version).
Are there major algorithmic differences between what Opteron, Core and
Nocona need, or does it all just depend on the thresholds for the various
algorithms?  (I found out that those chips have quite conflicting
requirements regarding the rep/movs sequence.)
And I really do hope to have my memcpy patches finally ready this
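
To illustrate the idea above of choosing a copy algorithm by block size
relative to a queried cache size, here is a minimal sketch. It is not
glibc's implementation and not any of the patches discussed in this thread.
The names choose_strategy, copy_block and l2_cache_bytes, the 64-byte
small-block limit, the half-of-cache cutoff and the 1 MiB fallback are all
hypothetical. The _SC_LEVEL2_CACHE_SIZE name for sysconf() is a glibc
extension and may be missing or report 0 on some systems, so the sketch
guards for both cases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical strategy names; not taken from glibc. */
enum copy_strategy {
    COPY_SMALL,     /* short block: plain byte loop, no setup cost */
    COPY_IN_CACHE,  /* block fits comfortably in cache */
    COPY_LARGE      /* block exceeds cache; a tuned library might
                       switch to a different algorithm here */
};

/* Query the L2 cache size, falling back to an assumed 1 MiB when the
   value is unavailable (sysconf may return 0 or -1). */
size_t l2_cache_bytes(void)
{
    long v = -1;
#ifdef _SC_LEVEL2_CACHE_SIZE
    v = sysconf(_SC_LEVEL2_CACHE_SIZE);
#endif
    return v > 0 ? (size_t)v : (size_t)1 << 20;
}

/* Pick a strategy purely from thresholds; the thresholds are the
   per-chip tunables the discussion refers to. */
enum copy_strategy choose_strategy(size_t n, size_t small_limit,
                                   size_t cache_bytes)
{
    if (n < small_limit)
        return COPY_SMALL;
    if (n <= cache_bytes / 2)
        return COPY_IN_CACHE;
    return COPY_LARGE;
}

/* Copy n bytes from src to dst (regions must not overlap). */
void *copy_block(void *dst, const void *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));

    switch (choose_strategy(n, 64, l2_cache_bytes())) {
    case COPY_SMALL: {
        unsigned char *d = dst;
        const unsigned char *s = src;
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
        break;
    }
    case COPY_IN_CACHE:
    case COPY_LARGE:
        /* Both defer to memcpy in this sketch; only the dispatch
           structure is being illustrated. */
        memcpy(dst, src, n);
        break;
    }
    return dst;
}
```

In this shape, differences between chips would show up only as different
values for the small-block limit and the cache cutoff, unless a chip needs a
genuinely different algorithm in one of the branches.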

> H.J.
