This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Fldcw, rounding and optimizations
- From: Sebastien Loisel <loisel at math dot mcgill dot ca>
- To: Jamie Lokier <jamie at shareable dot org>
- Cc: "Loisel, Sebastien" <loisel at amazon dot com>, Fergus Henderson <fjh at cs dot mu dot OZ dot AU>, Andrew Pinski <pinskia at physics dot uc dot edu>, <gcc at gnu dot org>
- Date: Sat, 16 Aug 2003 04:57:54 -0400 (EDT)
- Subject: Re: Fldcw, rounding and optimizations
On Sat, 16 Aug 2003, Jamie Lokier wrote:
> Loisel, Sebastien wrote:
> > > 2) Is there a way to do this already? I know you can already specify
> > > dependencies of asm blocks on variables and such, and I presume this
> > > is used in the optimizer to prevent foul-ups. What's the best way of
> > > convincing gcc not to move my asm blocks around?
> >
> > Nothing to see here, move along. I just found out about asm volatile. Sorry for the noise.
>
> Does it fix your problem?
Well, it helped, but it seemed to generate really poor code. My best
solution thus far is this:
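// Control word with rounding control set to round toward +infinity (0x800).
// DEFAULT_CW is defined elsewhere; assumed to have its rounding bits clear.
int __roundceil=DEFAULT_CW|0x800;
#define RU "fldcw __roundceil\n"
// Upper bound of a+b: the fldcw lives in the same asm block as the fadd,
// so the optimizer cannot separate the two.
template <class T>
inline T add_hi(T a, T b)
{ asm (RU "fadd %1,%2" : "=&t"(a) : "f"(b), "0"(a)); return a; }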
This way, the optimizer is still allowed to reshuffle fpu operations, but
keeps the rounding modes straight as much as I need them to be.
The generated code is still not super good (only every fourth or
fifth instruction is an fadd.)
I'm still not 100% certain the generated code is correct; I'll have to
write extensive testing software to determine that.
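For reference, the 0x800 in the snippet above selects "round toward +infinity" in the rounding-control field of the x87 control word (bits 10 and 11). A plain OR only works if those two bits are clear in DEFAULT_CW, which the post does not show. The sketch below clears the field first; the constant names and the with_rounding helper are hypothetical and only illustrate the bit layout.

```cpp
#include <cassert>
#include <cstdint>

// x87 control word: bits 10-11 form the rounding-control (RC) field.
const std::uint16_t kRcMask    = 0x0C00;
const std::uint16_t kRcNearest = 0x0000;  // round to nearest (power-on default)
const std::uint16_t kRcDown    = 0x0400;  // round toward -infinity
const std::uint16_t kRcUp      = 0x0800;  // round toward +infinity
const std::uint16_t kRcChop    = 0x0C00;  // round toward zero

// Hypothetical helper: return cw with its RC field replaced by rc.
// Clearing the field first makes the result independent of the old mode.
inline std::uint16_t with_rounding(std::uint16_t cw, std::uint16_t rc)
{
    assert((rc & ~kRcMask) == 0);  // rc must be one of the RC constants above
    return static_cast<std::uint16_t>((cw & ~kRcMask) | rc);
}
```

With the usual default control word 0x037F, with_rounding(0x037F, kRcUp) and 0x037F|0x800 agree; they differ only when the starting word already has a non-default rounding mode.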
What I fear, though, is that it may be very difficult to write efficient
interval arithmetic code with gcc. The portable code (using fenv.h) is
bound to be slow, with one call per rounding mode change
(bits/fenvinline.h really ought to have asm volatile inlines -- but even
if it did, the generated code would suck.) The nonportable approach I'm
using above seems to do a bit better (I'll be exercising it later to make
sure), but I think it still falls short of what one might want.
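For comparison, a minimal sketch of the portable fenv.h route mentioned above might look like the following. The name add_hi_portable is hypothetical, and the volatile temporaries are an assumption on my part: they are there to discourage the compiler from constant-folding the addition or moving it across the rounding-mode calls. It also assumes the platform defines FE_UPWARD.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical portable counterpart of add_hi (not from the original post).
// Returns a + b rounded toward +infinity, then restores the previous mode.
template <class T>
T add_hi_portable(T a, T b)
{
    const int saved = std::fegetround();        // remember the caller's mode
    const int rc = std::fesetround(FE_UPWARD);  // returns 0 on success
    assert(rc == 0);
    (void)rc;                                   // silence unused warning under NDEBUG

    // volatile keeps the loads, the add, and the store between the two calls
    volatile T x = a, y = b;
    volatile T sum = x + y;

    std::fesetround(saved);                     // one more mode change per operation
    return sum;
}
```

Each call pays for two rounding-mode changes around a single addition, which is the overhead the paragraph above is worried about.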
Cheers,
Sebastien Loisel