This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Floating point trouble with x86's extended precision
- From: Jim Wilson <wilson at tuliptree dot org>
- To: Volker Reichelt <reichelt at igpm dot rwth-aachen dot de>
- Cc: lucier at math dot purdue dot edu, gcc at gcc dot gnu dot org
- Date: Thu, 21 Aug 2003 11:50:59 -0700
- Subject: Re: Floating point trouble with x86's extended precision
- References: <200308211350.h7LDoVaX014090@relay.rwth-aachen.de>
Volker Reichelt wrote:
> Just to make sure I get this right: The register is spilled with
> the last bits truncated (cut off) instead of rounded, right?
It is just an fst/fstp instruction. According to the manual I have, it
does a rounding operation before the store.
> And one question out of curiosity: What happens to values in the FPU
> that finally get written into memory *not* because the floating point
> stack is full, but for other reasons (like the variable y in my
> example)? Do they also suffer from being truncated instead of rounded?
> I'd think so, right?
It is the same fst/fstp instruction, so it is the same rounding.
> But I think the difference between rounding and truncating is not the
> problem for the users.
The problem isn't that the value is rounded. The problem is that the
value is rounded at unpredictable locations, making it impossible to
compensate for the rounding.
> You'll get much better results with the extended precision than without
> (since I get 1 rounding to 64 bits instead of 10000).
No one here is saying that extended precision is bad. It is very
important for correct results for some algorithms. That is why the IEEE
FP standards require its presence.
However, the problem with the x86 fp reg-stack is that you can't control
when the extended precision is used. It is always used, whether you
want it or not. This makes it very hard for a compiler to get correct
results without sacrificing performance, accuracy, or access to long
double. There is really no good solution to this problem that makes
everyone happy.
This is a design from the 1980's. It seemed like a good idea at the
time, but experience has shown it was a mistake, and no one designs FP
hardware like that anymore. People who did design FP hardware like that
have since fixed the mistake. The original m68k FPU (68881) had this
mistake, and Motorola fixed it in the 68040/68060. As I mentioned
before, Intel and AMD have both fixed the x86 mistake with Itanium and
AMD64 respectively.
> i) You could say that accurate results are obtained if you do the
> rounding to 64 bits after each computation.
> (That has the consequence that the result does not change with
> optimization, as long as computations aren't rearranged using
> commutativity or associativity etc.)
This is the ideal situation. All gcc targets work this way except x86
(when using the reg-stack rather than the SSE registers) and m68k (pre-68040).
> ii) You could say that accurate results are obtained if the final
> result is close to the exact arithmetic result. This means that you
> should postpone rounding to 64 bits as long as possible.
> (In my understanding this is what GCC currently does - if one
> ignores the fact that it doesn't round but truncates in several
> cases, right?)
Modulo the gcc bugs, yes, this is what we get. This isn't very useful
unless we fix the bugs though.
> But you said it yourself: "In this case, I think we have to admit that both
> viewpoints are valid, and then agree to disagree."
Yes, I think that is a good way to state my position.
> Your proposed fixes all try to enforce definition i) of accuracy, but
> I think definition ii) is also a valid position.
That is a valid criticism.
> The only thing GCC can really be blamed for is that there's no option to turn
> on the workaround in c). Therefore, I'd rather call this a missing feature.
There are two things missing: the ability to turn on the workaround in
c) (i.e. emit an FP rounding instruction after every operation), and a bug
fix for the register spills.
Even with those things, I think we are still in trouble. In the first
case, having explicit rounding instructions eliminates the excess
precision problem, but it introduces a double-rounding problem. So we
have lost again there. This is probably not fixable without changing
the hardware. In the second case, fixing the problem with reload spills
eliminates one source of unpredictable rounding, however, we still have
the problem that local variables get rounded if they are allocated to
the stack and not rounded if they are allocated to an FP reg-stack
register. Thus we still get different results at different optimization
levels, so we lose there as well.
This might be fixable by promoting all stack locals to long double to
avoid unexpected rounding, but that will probably cause other problems
in turn. It will break all programs that expect that assignments to doubles
round to double, for instance. If we don't promote stack locals,
then we need explicit rounding for them, and then we have
double-rounding again.
I really see no way to fix this problem other than by fixing the
hardware. The hardware has to have explicit float and double operations.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com