This is the mail archive of the
mailing list for the GCC project.
Re: GCC 4.0, Fast Math, and Acovea
- From: Uros Bizjak <uros at kss-loka dot si>
- To: scott dot ladd at coyotegulch dot com, gcc at gcc dot gnu dot org
- Date: Fri, 29 Apr 2005 21:35:53 +0200
- Subject: Re: GCC 4.0, Fast Math, and Acovea
> Specifically, the -funsafe-math-optimizations flag doesn't work
> correctly on AMD64 because the default on that platform is
> -mfpmath=sse. Without specifying -mfpmath=387,
> -funsafe-math-optimizations does not generate inline processor
> instructions for most floating-point functions.
>
> Let's put it another way: Manually selecting -mfpmath=387 cuts
> run-times by 50% for programs dependent on functions like sin() and
> sqrt(), as compared to -funsafe-math-optimizations by itself.
It was found that moving data from SSE registers to x87 registers (and
back) only to call an x87 builtin degrades performance. Because of this,
x87 builtins are disabled for -mfpmath=sse, and a normal libcall is
issued for sin() and similar functions. If someone wants to use x87
builtins, then _all_ math operations should be done in x87 registers to
avoid costly SSE->x87 moves.
BTW: Does adding -D__NO_MATH_INLINES improve performance for
-mfpmath=sse? That would be PR19602.