This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: long long performance
On Wed, Dec 13, 2000 at 05:21:40PM -0500, Alan Lehotsky wrote:
> At 2:02 PM -0800 12/13/00, Rob Willis wrote:
> >Thanks for the info. Assuming the long long multiply isn't too much more
> >complex, it still does not explain the 13x slower performance over that
> >of just a normal long multiply. The timing on my P3/600 gave the
> >following (I've already subtracted out the overhead for the loop and
> >variable assignment):
> >
> >All times are for 1 billion operations to complete.
> >
> >long long mult: 33 secs
> >long mult: 2.4 secs
>
>
> Unless I'm missing something, if you double the number of
> bits you are multiplying, you have 4x as many multiplies to
> do; e.g.
>
> ab x cd => ac + ad + bc + bd
> (with appropriate shifting of the addends)
>
Actually it is only 3 multiplies:
ab x cd => ((a*c) << 64) + (((a*d) + (b*c)) << 32) + (b*d)
=> (((a*d) + (b*c)) << 32) + (b*d)
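since you can eliminate the ((a*c) << 64) term: with a, b, c, d as 32-bit
halves, that term lies entirely above bit 63 and so contributes nothing to a
64-bit result.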
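
A minimal C sketch of the three-multiply scheme above, assuming 32-bit halves
and a result truncated to 64 bits. The function name is made up for
illustration; this is not the code GCC actually emits or the libgcc routine.
Note that only b*d needs a full 64-bit product; for the two cross terms only
the low 32 bits survive the shift.

```c
#include <stdint.h>

/* Illustrative only: multiply two 64-bit values from their 32-bit halves,
   keeping the low 64 bits of the product. a and c are the high halves,
   b and d the low halves, matching the "ab x cd" notation above. */
uint64_t mul64_three_multiplies(uint64_t x, uint64_t y)
{
    uint32_t a = (uint32_t)(x >> 32);   /* high half of x */
    uint32_t b = (uint32_t)x;           /* low half of x  */
    uint32_t c = (uint32_t)(y >> 32);   /* high half of y */
    uint32_t d = (uint32_t)y;           /* low half of y  */

    /* Multiply 1: b*d, the only term that needs a widening 32x32->64 multiply. */
    uint64_t low = (uint64_t)b * d;

    /* Multiplies 2 and 3: the cross terms. Their sum is shifted left by 32,
       so only its low 32 bits matter; unsigned 32-bit wraparound is fine. */
    uint32_t cross = a * d + b * c;

    /* a*c would be shifted left by 64 and is dropped entirely. */
    return low + ((uint64_t)cross << 32);
}
```

For any inputs this should agree with the compiler's own 64-bit x * y, which
makes it easy to check against the native multiply.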
Note that on x86 the widening (32x32 -> 64 bit) multiply always takes one
operand in EAX and leaves its result in EDX:EAX, so you have to do data
movement, etc. Also, I imagine that unlike most RISC chips, the multiplies are
not pipelined, so you probably have to wait for one to finish before starting
the next one (multiplies are multi-cycle instructions).
--
Michael Meissner, Red Hat, Inc. (GCC group)
PMB 198, 174 Littleton Road #3, Westford, Massachusetts 01886, USA
Work: meissner@redhat.com phone: +1 978-486-9304
Non-work: meissner@spectacle-pond.org fax: +1 978-692-4482