This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


long long / long long

To: Jan Hubicka <jh at suse dot cz>
Subject: long long / long long
From: Frank Klemm <pfk at fuchs dot offl dot uni-jena dot de>
Date: Sun, 9 Sep 2001 04:02:34 +0200
>Received: (from pfk@localhost)by fuchs.offl.uni-jena.de (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) id EAA05925;Sun, 9 Sep 2001 04:02:34 +0200
Cc: gcc at gcc dot gnu dot org
References: <20010908153112.8DAF1F2B62@nile.gnat.com> <20010908181701.K8451@atrey.karlin.mff.cuni.cz>

---- Code ----------------------------------------------------

.text
.type   __divdi3,@function
.global __divdi3

__divdi3:
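        fildll  12(%esp)        # push divisor (second argument)
        fildll   4(%esp)        # push dividend (first argument)
        subl    $12,%esp        # scratch: 2 control words + 64-bit result
        movl    %esp,%ecx
        movw    $0x0C00,%ax     # RC bits = 11: round toward zero
        fnstcw  (%ecx)          # save the current control word
        orw     0(%ecx),%ax
        movw    %ax,2(%ecx)
        fldcw   2(%ecx)         # switch to truncation
        fdivp                   # dividend / divisor, pop
        fistpll 4(%ecx)         # store the truncated 64-bit quotient
        fldcw   0(%ecx)         # restore the caller's control word
        movl    4(%esp),%eax    # return the quotient in %edx:%eax
        movl    8(%esp),%edx
        addl    $12,%esp
        ret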
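
For readers who prefer C, the routine above can be modelled roughly as follows. This is only a sketch, not part of the posted code: the name divdi3_via_long_double is made up for illustration, and it assumes that long double is the x87 extended format with a 64-bit mantissa (checked through LDBL_MANT_DIG). The C cast stands in for fistpll under truncating rounding control.

```c
#include <float.h>

/* Hypothetical C model of the x87 routine; the name is illustrative only.
   Like integer division itself, the result is undefined for b == 0 and
   for LLONG_MIN / -1. */
long long divdi3_via_long_double(long long a, long long b)
{
#if LDBL_MANT_DIG >= 64
    /* With a 64-bit mantissa both conversions are exact. */
    long double q = (long double)a / (long double)b;
    /* The cast truncates toward zero, like fistpll with RC = 11. */
    return (long long)q;
#else
    /* long double is too narrow on this target: use integer division. */
    return a / b;
#endif
}
```

On targets where long double is narrower, the sketch simply falls back to the ordinary integer division, so it never returns a wrong quotient.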



---- "Benchmark": Duration of a loop of --------------------------

    long long  x [1000];
    long long  y [1000];
    long long  s = 0;
    int        i;

    for (i = 0; i < 1000; i++)
        s += x[i] / y[i];


---- results ---------------------------------------------------- 
Old routine on Athlon:
	106 clocks including the outer loop and storing the arguments on the stack.
	
This routine on Athlon:
	79 clocks including the outer loop and storing the arguments on the stack.

  + shorter
  + can be inlined
  + sometimes the rounding control switch can be avoided by moving it outside a loop
  + faster for a lot of data
  - slower for trivial data (?)
  - does not work with SSE2 (needs a 63 or 64 bit mantissa)
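
The mantissa requirement in the last point can be seen in a small C sketch. SSE2 arithmetic works on doubles with a 53-bit mantissa, so not every 64-bit integer survives the conversion. The helper name round_trips_through_double is hypothetical, and the sketch assumes |v| <= 2^62 so that the cast back stays in range.

```c
/* Hypothetical helper: returns 1 if v survives a round trip through a
   53-bit double, 0 if the conversion had to round.
   Assumes |v| <= 2^62 so the cast back to long long is well defined. */
int round_trips_through_double(long long v)
{
    /* volatile forces the value to be rounded to double precision,
       even where the compiler keeps intermediates in x87 registers. */
    volatile double d = (double)v;
    return (long long)d == v;
}
```

For example, 2^53 + 1 = 9007199254740993 is the smallest positive integer that does not round trip.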

---- optimization -----------------------------------------------
This routine on Athlon after inlining and moving fnstcw/fldcw outside the loop:
	21 clocks including the outer loop
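
A rough sketch of what the hoisted form might look like in GNU C is given below. It is not the code that was measured: the function name sum_quotients_hoisted is invented for illustration, the inline assembly relies on GCC extensions, and it assumes the x87 precision control is left at extended precision (the usual Linux default). On other compilers or targets the sketch falls back to plain integer division.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Hypothetical helper: sums x[i] / y[i], switching the x87 rounding
   control once before the loop and restoring it once afterwards. */
long long sum_quotients_hoisted(const long long *x, const long long *y,
                                size_t n)
{
    unsigned short old_cw, trunc_cw;
    long long s = 0;
    size_t i;

    /* Save the control word, then set RC = 11 (round toward zero). */
    __asm__ __volatile__ ("fnstcw %0" : "=m" (old_cw));
    trunc_cw = old_cw | 0x0C00;
    __asm__ __volatile__ ("fldcw %0" : : "m" (trunc_cw));

    for (i = 0; i < n; i++) {
        long long q;
        long double t = (long double)x[i] / (long double)y[i];
        /* fistpll pops st(0), hence the "st" clobber. */
        __asm__ __volatile__ ("fistpll %0" : "=m" (q) : "t" (t) : "st");
        s += q;
    }

    /* Restore the caller's control word. */
    __asm__ __volatile__ ("fldcw %0" : : "m" (old_cw));
    return s;
}
#else
/* Fallback for other compilers and targets: plain integer division. */
long long sum_quotients_hoisted(const long long *x, const long long *y,
                                size_t n)
{
    long long s = 0;
    size_t i;

    for (i = 0; i < n; i++)
        s += x[i] / y[i];
    return s;
}
#endif
```

The point of the design is that the two fldcw instructions are executed once per loop rather than once per division.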


Interested? Or are 64 bit operations uninteresting for benchmarks?

-- 
Frank Klemm


Still remaining:
	long long % long long
	long long / long
	long long % long
	long long / const
	long long % const

Follow-Ups:
- Re: long long / long long
  - From: Joe Buck

References:
- Re: Multiplications on Pentium 4
  - From: dewar
- Re: Multiplications on Pentium 4
  - From: Jan Hubicka
