This is the mail archive of the mailing list for the GCC project.

Re: [patch] tuning gcc for Intel Core2

> On Mon, Nov 13, 2006 at 12:01:17PM -0500, Vladimir Makarov wrote:
> >  Here is the patch for tuning gcc for the Intel Core2 processor.  I did
> > about 30 SPEC2000 runs to find good parameters, which are practically
> > the same as what Intel gave and recommended in their optimization guide
> > made public a few days ago.
> > 
> >  The patch increases the SPECINT2000 score to 1963 from 1925 (for
> > generic) or 1901 (for nocona).  The SPECFP2000 score is the same as for
> > generic, 1875 (nocona has 1856).  One benchmark (gcc) did particularly
> > well -- about 20% improvement (1788 for generic tuning vs 2210 for
> > core2).  The code generated for Core2 is smaller (by 0.46% for
> > SPECInt and 0.54% for SPECFp) than the code for generic.
> > 
> This patch is the first step for Core 2 optimization. But I am afraid
> that it isn't very useful. I compared -mtune=generic and -mtune=core2
> on Core 2. The main difference is in 176.gcc. What this patch does is
> turn on x86_rep_movl_optimal for Core 2, which avoids external calls
> to memset. There are known serious performance problems with the x86-64
> memory functions, especially on Core 2. We are working on improving
> them. A better external memset can improve gcc in SPEC CPU 2K by
> more than 20%.
BTW I still have a patch for memcpy/memset generation that lets you
choose between the basic algorithms (rep/movq, rep/movl, loop, unrolled
loop, library call) based on the -mtune switch and the expected size of
the copied block.  It also has a simple benchmark utility that lets you
set proper limits.  I am happy to see -mtune=core2 in place, as that
patch also contained a basic -mtune=core2 switch.  I was basically
holding it back because I didn't have time to play with the other
arguments carefully enough and because the profile-driven memcpy/memset
infrastructure is not in place yet.
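
To make the idea concrete, here is a rough sketch of what "choose the
algorithm from the tuning target and the expected block size" could look
like.  This is not the code from the patch: the type and function names
(copy_alg, size_rule, tune_table, choose_copy_alg) and every size limit
in example_tune are made up for illustration, and real limits would have
to come from measurements on the target CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the basic algorithms listed above. */
enum copy_alg {
    ALG_LOOP,
    ALG_UNROLLED_LOOP,
    ALG_REP_MOVL,
    ALG_REP_MOVQ,
    ALG_LIBCALL
};

/* Use `alg` when the expected block size is <= max_size. */
struct size_rule {
    size_t max_size;
    enum copy_alg alg;
};

#define MAX_RULES 4

/* One table per -mtune target (hypothetical layout). */
struct tune_table {
    const char *tune_name;
    struct size_rule rules[MAX_RULES];  /* ascending; last max_size is SIZE_MAX */
    enum copy_alg unknown_size_alg;     /* used when no size estimate exists */
};

/* Illustrative limits only; they are assumptions, not measured values. */
static const struct tune_table example_tune = {
    "example-cpu",
    {
        { 16,       ALG_LOOP },
        { 64,       ALG_UNROLLED_LOOP },
        { 8192,     ALG_REP_MOVQ },
        { SIZE_MAX, ALG_LIBCALL }
    },
    ALG_LIBCALL
};

/* Pick the first rule whose limit covers the expected size. */
enum copy_alg choose_copy_alg(const struct tune_table *t,
                              int size_known, size_t expected_size)
{
    assert(t != NULL);
    if (!size_known)
        return t->unknown_size_alg;
    for (size_t i = 0; i < MAX_RULES; i++) {
        if (expected_size <= t->rules[i].max_size)
            return t->rules[i].alg;
    }
    /* Not reached when the last rule uses SIZE_MAX as its limit. */
    return t->unknown_size_alg;
}
```

With this table, choose_copy_alg(&example_tune, 1, 8) picks the plain
loop and a 4096-byte block picks rep/movq.  A benchmark utility would
then only need to adjust the max_size values for each target.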

I will be sending it shortly (I have a non-GCC deadline to meet on the
23rd, so probably after that).

What is the particular problem in the x86-64 library string functions
that makes core2 unhappy about them? 20% sounds quite serious and I don't
remember anything particularly crazy about the implementation.

