This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [BENCHMARK]-mfpmath=sse should disable x387 intrinsics
- From: Uros Bizjak <uros dot bizjak at kss-loka dot si>
- To: Roger Sayle <roger at eyesopen dot com>
- Cc: Richard Guenther <richard dot guenther at gmail dot com>,gcc-patches at gcc dot gnu dot org
- Date: Sat, 27 Nov 2004 15:49:53 +0100
- Subject: Re: [BENCHMARK]-mfpmath=sse should disable x387 intrinsics
- References: <Pine.LNX.4.44.0411260709560.3274-100000@www.eyesopen.com>
Quoting Roger Sayle <roger@eyesopen.com>:
> But now the best bit, for which I'll thank you in advance. In looking
> at so much floating point code, it's become apparent that GCC's
> reg-stack.c pass can do a much better job at shuffling floating point
> registers. I was up late last night working on an improvement/rewrite
> of change_stack that should reduce the number of fxch instructions we
> generate, and replace more uses of "fstp %st(x)" with "ffreep %st(0)"
> (which is faster on AMD processors). I know there are PRs in this area,
> so these changes might even make it into GCC v4.0.
Perhaps many of these fxch instructions could be avoided by loading the x87
registers at the appropriate time, rather than at the beginning of the function
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15492).
At least this situation could easily be resolved:
fldl 4(%esp)
fldl 12(%esp)
fxch %st(1)
fpatan
ret
Instead of inserting an fxch instruction, the order of the two fldl instructions could be swapped.
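As a rough illustration of that reordering, the following Python sketch models only the three x87 instructions involved (fldl, fxch, fpatan) on a toy register stack. The run_x87 helper and the tuple encoding of instructions are hypothetical, made up for this example; they are not part of GCC or reg-stack.c. The sketch assumes that fpatan replaces %st(1) with atan2(%st(1), %st(0)) and then pops the stack, and it leaves out ret.

```python
import math

def run_x87(program, memory):
    """Toy model of the x87 register stack; stack[-1] plays the role of %st(0)."""
    stack = []
    for op, *args in program:
        if op == "fldl":        # push a double loaded from memory
            stack.append(memory[args[0]])
        elif op == "fxch":      # swap %st(0) with %st(i)
            i = args[0]
            stack[-1], stack[-1 - i] = stack[-1 - i], stack[-1]
        elif op == "fpatan":    # %st(1) = atan2(%st(1), %st(0)), then pop
            st0 = stack.pop()
            stack[-1] = math.atan2(stack[-1], st0)
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack

# Sequence from the message: loads in argument order, then a fix-up fxch.
WITH_FXCH = [("fldl", "4(%esp)"), ("fldl", "12(%esp)"), ("fxch", 1), ("fpatan",)]

# Suggested alternative: swap the two loads so that no fxch is needed.
SWAPPED_LOADS = [("fldl", "12(%esp)"), ("fldl", "4(%esp)"), ("fpatan",)]
```

Running both sequences on the same memory contents, for example mem = {"4(%esp)": 3.0, "12(%esp)": 4.0}, should leave the same single value on the modeled stack, with the second sequence being one instruction shorter.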
Uros.