This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.



Re: Effect of -mtune=* on x86 or x86-64 systems?


On Thu, Jan 08, 2009 at 04:53:37PM -0800, Tim Prince wrote:
> Michael Meissner wrote:
> > On Wed, Jan 07, 2009 at 10:21:28AM -0500, Wirawan Purwanto wrote:
> >> Hi Michael,
> >>
> >> Thanks for the answer. I would like to know if someone has investigated
> >> this issue for some benchmark or real-world cases. Is there any
> >> write-up/report/paper on this?
> 
> > 
> > I suspect many people have done tests, but often not published the
> > results.  For example, when I worked for AMD, I sometimes did SPEC runs with
> > -mtune=generic, -mtune=athlon, -mtune=barcelona, or -mtune=core2 to see how the
> > tunings affected the real hardware.  I recall that there were a few benchmarks
> > which saw noticeable differences (how integer to fp conversions are done was
> > one that I looked at for a bit).
> > 
> -mtune=barcelona frequently speeds up vectorized loops on Core i7 by more
> than a factor of 2, compared with generic.   On Core 2, of course, it's
> not clear cut: it speeds up more of my gfortran cases than it slows down,
> with the reverse being true of g++.
> There's not much mystery in this, as the major differences have to do with
> the alignment requirements of various CPU models.
> I thought integer to fp conversion would be more affected by -msse/-msse2
> than by -mtune.

There are about 4 different methods to convert int to float if the integer
value is in a GPR (among them, if memory serves: direct GPR -> XMM conversion,
store -> convert from memory, and store -> load -> parallel convert).  At a
micro-level, AMD K8 is different from AMD Barcelona, which is different from
Intel Core2, which is different from Intel P4 (I imagine Intel i7 may be
different as well).  When you get into benchmarks, some sequences might be
faster even if the optimization guides say otherwise, due to the effect of
writing a value from a GPR to memory and reading the same value back into an
XMM register.
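
A minimal sketch for looking at this yourself, assuming an x86 or x86-64 GCC
toolchain.  The file name conv.c and the function names are hypothetical, not
taken from any benchmark.  Which instruction sequence GCC actually picks for
each -mtune value depends on the GCC version and the other flags in use, so
treat this only as a starting point for diffing the generated assembly.

```c
/*
 * conv.c (hypothetical file name): small int -> fp conversion kernels.
 *
 * Compile to assembly with different tunings and compare the output, e.g.:
 *   gcc -O2 -S -mtune=generic   conv.c -o conv-generic.s
 *   gcc -O2 -S -mtune=barcelona conv.c -o conv-barcelona.s
 *   gcc -O2 -S -mtune=core2     conv.c -o conv-core2.s
 *   diff conv-generic.s conv-barcelona.s
 */
#include <stddef.h>

/* Scalar conversion: the integer arrives in a general-purpose register. */
float int_to_float(int x)
{
    return (float)x;
}

/* Same idea, converting to double precision instead. */
double int_to_double(int x)
{
    return (double)x;
}

/* Array conversion: with vectorization enabled (for example -O3), the
 * compiler may use packed (parallel) conversions here instead of
 * converting one element at a time. */
void convert_array(const int *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (float)src[i];
}
```

In the .s files, the direct GPR -> XMM form shows up as a cvtsi2ss or cvtsi2sd
with a register source, while the store-based forms go through a stack slot
first.  Timing the variants on the actual target CPU is still the only way to
know which one wins for a given workload.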

-- 
Michael Meissner, IBM
4 Technology Place Drive, MS 2203A, Westford, MA, 01886, USA
meissner@linux.vnet.ibm.com

