This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Performance of Integer Multiplication on PIII



----- Original Message -----
From: <pete@ltoi.iap.physik.tu-darmstadt.de>
To: <kevin@atkinson.dhs.org>
Cc: <gcc@gcc.gnu.org>
Sent: Saturday, November 03, 2001 7:36 AM
Subject: Re: Performance of Integer Multiplication on PIII


> Hi!
>
> A slight step off-topic.
>
> -Regarding architectural differences between the (now) 4 different P6
>  incarnations, P6 Model 1 to 4 (aka PPro, PII, PIII and PIII-Tualatin):
>  There are no significant differences between them that justify special
>  -march sub-options (at least in 32-bit mode) for gcc.
>  Within a small margin, they all perform equally, if you "divide out"
>  the differences of the environments (chipsets, RAM & RAM speed, CPU
>  internal speed, ...)
>
> -With respect to the PIV (as public relations calls it): This is just a
>  different CPU. So much different that I suggest we are better off not
>  biasing -march=i686 for its various quirks. I.e.: For this CPU, one
>  should indeed use a new arch sub-option.
>  (I am not informed about the extent to which this is done in gcc-3.1)
>
gcc-3.1 has -march=pentium4 for the P4.  Of course, subsequent NetBurst
versions should reduce a few of the excessive operation costs.
>
> -On topic:
>  (See below)
>
> Hope that helps,
>
>  Peter Schorsch
>
> > When running these same tests on a Mobile Pentium MMX (using -march=i586),
> > Gcc code does outperform mine.  I do not have anything in between to run
> > these tests on, so I would appreciate it if someone with a Pentium Pro and
> > PII (or is that the same thing as a Pentium Pro?) could run them and post
> > the results.
>
>  From Agner Fog (http://www.agner.org/assem/) pentopt.zip:
>
>                  PPlain      PMMX    PPro    PII   PIII
>  IMUL latency       9          9       4      4      4
>  IMUL throughput   1/9        1/9     1/1    1/1    1/1
>
>  That means, imul is pipelined on i686 ...
>
> > So I guess the lesson here is that on PIII integer multiplication is fast
> > enough that doing special tricks to avoid integer multiplication will hurt
> > performance instead of helping it.
>
Even on the P4, code which permits full pipelining will run well with
imul, while the add and shift sequences are preferable in contexts where
that is not possible.  I haven't seen any compiler which is able to
distinguish those situations.
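
To make the distinction above concrete, here is a minimal C sketch. It is
not from the original thread: the function names and the constant 10 are
assumptions chosen for illustration, and which instructions a given
compiler actually emits for them is not guaranteed. The first loop performs
independent multiplies that a pipelined multiplier can overlap; the second
forms a dependent chain, where each step waits on the previous result and
a short shift-and-add sequence may be the better choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by 10 with a plain multiply; a compiler may emit imul here. */
static uint32_t mul10_imul(uint32_t x)
{
    return x * 10u;
}

/* Same result using shifts and an add: 10*x = 8*x + 2*x. */
static uint32_t mul10_shift_add(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* Independent multiplies: no iteration needs the result of another,
 * so a pipelined multiplier can start a new one before the last ends. */
static void scale_all(uint32_t *out, const uint32_t *in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = mul10_imul(in[i]);
}

/* Dependent chain: every multiply consumes the previous accumulator,
 * so the loop is bound by the latency of each step, not by throughput. */
static uint32_t digits_to_number(const uint32_t *digits, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = mul10_shift_add(acc) + digits[i];
    return acc;
}

/* Hypothetical self-check: both forms must agree on every input tried. */
static void self_check(void)
{
    const uint32_t digits[] = {1, 2, 3, 4};
    uint32_t scaled[4];

    for (uint32_t x = 0; x < 1000; x++)
        assert(mul10_imul(x) == mul10_shift_add(x));

    scale_all(scaled, digits, 4);
    assert(scaled[3] == 40);
    assert(digits_to_number(digits, 4) == 1234);
}
```

Both forms compute the same value for unsigned operands, so the choice
between them is purely a question of scheduling on the target CPU. Calling
self_check() from a test program confirms the equivalence; timing the two
loops on the CPU of interest would be needed to see which form wins there.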

