This is the mail archive of the mailing list for the GCC project.

Re: Floating point to int casts

----- Original Message -----
From: "Erik de Castro Lopo" <>
To: <>
Sent: Thursday, October 18, 2001 2:19 PM
Subject: Floating point to int casts

> ============
> In many applications such as audio, video and graphics processing,
> calculations are done in floating point values but the final results
> need to be converted to integers.
> Unfortunately, I've noticed that casting from float/double to int on
> i386 can cause large performance hits when this operation is used
> I have also found that, if the programmer is willing to give up a small
> amount of accuracy and deviate slightly from the rounding behaviour
> defined by the C standards, execution speed improvements of 6 to 12
> times can be achieved on Pentium III and Athlon CPUs on float to int
> cast intensive code. (If needed, I can supply benchmarking code to
> prove this assertion).
> ========
> I've had a look at the code generated by a cast, which looks something
> like this:
> fldcw -12(%ebp)
> fistpl -16(%ebp)
> movl -16(%ebp),%eax
> fldcw -10(%ebp)
> The first and last instructions in this group modify the FPU control
> word (specifically, the FPU rounding mode). It is this instruction
> which causes the pain, as an FPU pipeline flush is required each time
> it is executed.
> Removing both instances of "fldcw" can result in significant execution
> speed increases. The downside is a slight loss in accuracy. The maximum
> absolute error between what is obtained with the cast and assembler
> code without the "fldcw" instructions is 1, which is perfectly
> acceptable for audio and video/graphics processing applications.
> At the moment, the only solution to this is a rather ugly assembler
> macro:
> #define FLOAT_TO_INT(in,out) \
> __asm__ __volatile__ ("fistpl %0" : "=m" (out) : "t" (in) : "st") ;
> called as follows:
>    float in = 1233.45 ;
>    int out ;
>    FLOAT_TO_INT(in, out) ;
> This macro does fix the problem but is far from an optimal solution.
> ==================
> One possible solution would be to provide a command line switch like,
> say, -ffast-float-cast. Unfortunately, this would operate on a file-wide
> basis, and it would be easy to imagine that one C file could contain
> float to int casts which could benefit from this operation and others
> where the non-standard rounding mode might adversely affect
> computational accuracy.
> It would be far nicer if the modified float cast behaviour could be
> switched on and off within a single C file. I therefore propose the use
> of a new __attribute__ argument, fastfloatcast, which can be applied to
> functions or any code block enclosed in { }.
> __attribute__ ((fastfloatcast))
> { /* All float to int casts here have fast but non-standard rounding. */
> } ;
> ====================
> With all this in mind, I would like to canvass the thoughts and
> opinions of the gcc developers.
>    1) Is this a good idea?
>    2) Is this the best way to do it?
>    3) Would this be accepted into the gcc code base if I came up with
>       a clean patch against the latest CVS snapshot?
> Regards,
> Erik
Since there need be no performance loss in sticking to standard
C99, as discussed here within the last month, you should use
lrint(), which already has a macro version in glibc
<bits/mathinline.h> looking much like yours.  I think Honza
mentioned the possibility of building it into gcc rather than
depending on the library, and I don't recall any objections.  Now
it may be too late for gcc-3.1.

You are correct that supplying a compiler switch which has the
effect of converting all (int) casts into lrint() and the like
is a bad solution to a problem whose fix is already available for
a small price.  lrint() doesn't give up accuracy; it does exactly
what the programmer requests, exactly where it is requested.

You must be using gcc-3.0.x to see as large a performance penalty
as you quote; the correction which Honza put into gcc-3.1 is
unlikely to be carried back to 3.0.x.
