This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: mul + div with 64 bit signed ints on IA32
- To: "Frank Klemm" <pfk at fuchs dot offl dot uni-jena dot de>, "Jan Hubicka" <jh at suse dot cz>
- Subject: Re: mul + div with 64 bit signed ints on IA32
- From: "Tim Prince" <tprince at computer dot org>
- Date: Tue, 4 Sep 2001 15:10:37 -0700
- Cc: <gcc at gcc dot gnu dot org>
- References: <20010826202953.E2544@fuchs.offl.uni-jena.de> <20010826233634.A6693@atrey.karlin.mff.cuni.cz> <20010827004731.G2544@fuchs.offl.uni-jena.de> <20010827121624.D8568@atrey.karlin.mff.cuni.cz> <20010827143032.C636@fuchs.offl.uni-jena.de> <20010827173025.F11402@atrey.karlin.mff.cuni.cz> <20010901202854.A7713@fuchs.offl.uni-jena.de> <20010902000000.C27182@atrey.karlin.mff.cuni.cz> <20010902024104.F7713@fuchs.offl.uni-jena.de> <20010903171717.E13574@atrey.karlin.mff.cuni.cz> <20010904215156.C438@fuchs.offl.uni-jena.de>
----- Original Message -----
From: "Frank Klemm" <pfk@fuchs.offl.uni-jena.de>
To: "Jan Hubicka" <jh@suse.cz>
Cc: <gcc@gcc.gnu.org>
Sent: Tuesday, September 04, 2001 12:51 PM
Subject: mul + div with 64 bit signed ints on IA32
>
> A rounding of a 'double' to an 'int' with ANSI-C took 160 clocks on a K6-2.
> In the same time it was possible to calculate the scalar of _two_ 1200 byte
> long vectors of float values. This is brain dead!
You haven't said which gcc version you are using. The 3.0 series
chooses by far the slowest method for this conversion. I
don't attempt to run gcc on my K6-2 anyway. Honza has done an
excellent job of correcting this in gcc-3.1, without taking any
unusual shortcuts, implementing both x87 and SSE2. I hope you
don't advocate undoing his work in favor of something totally
strange.
>
> Here we have problems with the design of C and with the design of the FPU
> of the iA32 architecture.
No doubt an unusually slow code generation choice could be made
for almost any architecture. It's true that the idea of a single
instruction to store float to integer with truncation toward zero
came late in the IA32 series, but the gcc-3.0 code sequence is
contrary to any recommendation ever published. Compaq just
corrected a similar problem in their compilers, before the Intel
acquisition.
>
>
> Option proposals:
>
> -fsaverc
> -ffastrc
> -fsavecld ; the same for the cld flag
> -ffastcld
>
glibc includes implementations of lrint() and the like. I'd like
to see something like -ffast-rint to support g77 (Fortran
spelling -ffast-nint), which has a precedent in the MIPSpro
compilers. Such an option would only modify the code to accept
IEEE style round-to-nearest (ties to even) in place of Fortran
style round-to-nearest (ties away from zero).