This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: Performance of Integer Multiplication on PIII


> On Mon, 5 Nov 2001, Jan Hubicka wrote:
> 
> > The attached patch should fix all three problems.  Your testcase still
> > uses some unwound multiplies, but it runs faster on the Celeron machines
> > here in the lab than the assembly version you supplied.
> 
> Ok.  Here are some more results, including ones using your code.
> 
> $ gcc-3.0.2 -O2 -march=i686 read.c read-empty.c t.c && a.out
> Loop: 1.33, Code: 4.72
> Clocks: 35.16
> $ gcc -O2 -march=i686 read.c read-empty.c t.c && a.out
> Loop: 1.32, Code: 3.59
> Clocks: 26.74
> $ gcc -O2 -march=i686 read.hand.s read-empty.c t.c && a.out
> Loop: 1.30, Code: 1.95
> Clocks: 14.53
> $ gcc -O2 -march=i686 read.new.s read-empty.c t.c && a.out
> Loop: 1.32, Code: 2.32
> Clocks: 17.28
> 
> So, my code still does better on my machine; however, the new assembly
> output is certainly acceptable, especially since you say it outperforms my
> code on your machine.  A few clock cycles won't make that much difference.

I guess it is because I used -fomit-frame-pointer in my tests.
Your assembly code does not use ebp either, so I guess that is a fair
comparison.  I hope that accounts for the few percent difference.

Honza

