This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86


Oh well.  And Joe Buck writes:
 - What would the performance cost be if we spilled ix86 FP registers
 - as 80 bits?

FYI, this is Dr. Kahan's suggested fix whenever any of us (us == grad 
students at Berkeley who ask him about it) mention the truncation
problem.

Dr. Kahan's original intent was to have the on-chip stack be just the
top few cells in the total FP stack.  It's outlined in a paper from 1989
(with a 1990 prefix (modified in 1994) and a 1998 addendum) titled
``How Intel 80x87 Stack Over/Underflow Should Have Been Handled.''
It's in the FP98 notes, on the off chance any of y'all have them.
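
To make the idea concrete, here is a minimal software model of it, not the mechanism from the paper itself. It assumes 8 on-chip cells (the x87 register stack depth) and the 128-cell memory extension that the paper forecasts as ample. The names fp_stack, fp_push, and fp_pop are made up for illustration, and long double stands in for an 80-bit cell, which only holds on targets where gcc maps long double to the x87 extended format. In the real design the spill and refill would happen in overflow/underflow trap handlers rather than in ordinary function calls.

```c
#include <assert.h>
#include <stddef.h>

#define ONCHIP_CELLS 8    /* depth of the x87 register stack */
#define SPILL_CELLS  128  /* hypothetical memory extension, 128 cells */

typedef struct {
    long double onchip[ONCHIP_CELLS]; /* onchip[0] is the deepest on-chip cell */
    size_t      onchip_count;
    long double spill[SPILL_CELLS];   /* cells pushed out of the chip, oldest first */
    size_t      spill_count;
} fp_stack;

static void fp_stack_init(fp_stack *s)
{
    s->onchip_count = 0;
    s->spill_count = 0;
}

/* Push a value.  If the on-chip part is full, simulate the overflow
   trap: move the deepest on-chip cell to memory, then retry.
   Returns 0 on success, -1 if the extension area is also full. */
static int fp_push(fp_stack *s, long double v)
{
    if (s->onchip_count == ONCHIP_CELLS) {
        if (s->spill_count == SPILL_CELLS)
            return -1;
        s->spill[s->spill_count++] = s->onchip[0];
        for (size_t i = 1; i < ONCHIP_CELLS; i++)
            s->onchip[i - 1] = s->onchip[i];
        s->onchip_count--;
    }
    assert(s->onchip_count < ONCHIP_CELLS);
    s->onchip[s->onchip_count++] = v;
    return 0;
}

/* Pop a value.  If the on-chip part is empty, simulate the underflow
   trap: reload the most recently spilled cell from memory.
   Returns 0 on success, -1 if the whole stack is empty. */
static int fp_pop(fp_stack *s, long double *out)
{
    if (s->onchip_count == 0) {
        if (s->spill_count == 0)
            return -1;
        s->onchip[s->onchip_count++] = s->spill[--s->spill_count];
    }
    *out = s->onchip[--s->onchip_count];
    return 0;
}
```

Code that only calls fp_push and fp_pop sees one deep stack; whether a given cell currently lives on the chip or in memory is invisible to it, which is the property the original design was after.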

Quick summary of points from that paper (which I've seen mentioned 
here, I think, but not with details):
	* He forecasts that only 1280 bytes (128 80-bit words) of memory 
	for stack extension would be ``almost always ample.''

	* Differences in the 80x87 family make engineering the intended
	behavior nasty.  It also involves OS help for the trap handlers.
		* The 80287 has two major variants.
		* Not all opcodes are recorded the same way through the
		80x87 family.
		* The 80387 has undocumented anomalies.
		* Pre-80387 FPUs don't support many IEEE 754 operations,
		which would need to be emulated.

	* Other co-processors (namely the Weiteks) don't have 80-bit
	precision.  (Like I said, 1989.  Not so much an issue now.)

	* Some IEEE functions aren't in pre-80387 FPUs, making drivers
	more difficult to implement.

In this paper, part of the problem he mentions is that programs would
need to determine which FPU exists at run-time.  When he wrote the paper,
that wasn't a common operation.  The diversification of the 80x86 family 
(MMX, 3DNow!, etc.) has made it quite common.
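
As a rough illustration of what that run-time check looks like today, here is a sketch using the CPUID instruction through gcc's <cpuid.h> helper __get_cpuid. The struct and function names (fpu_features, decode_features, query_features) are made up for this example. Note the limits: CPUID reports that an on-chip x87 FPU, MMX, or 3DNow! is present, but it does not exist on the old chips the paper worries about, so this does not tell an 80287 from an 80387. On non-x86 targets the sketch simply reports nothing.

```c
#include <assert.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* gcc/clang helper header providing __get_cpuid */
#endif

typedef struct {
    int x87;        /* on-chip x87 FPU present */
    int mmx;        /* MMX present */
    int amd_3dnow;  /* 3DNow! present (AMD extended leaf) */
} fpu_features;

/* Decode the feature bits from raw CPUID register values:
   leaf 1 EDX bit 0 = FPU, leaf 1 EDX bit 23 = MMX,
   leaf 0x80000001 EDX bit 31 = 3DNow!. */
static fpu_features decode_features(unsigned edx_leaf1, unsigned edx_ext)
{
    fpu_features f;
    f.x87       = (int)((edx_leaf1 >> 0)  & 1u);
    f.mmx       = (int)((edx_leaf1 >> 23) & 1u);
    f.amd_3dnow = (int)((edx_ext   >> 31) & 1u);
    return f;
}

/* Ask the running CPU.  If a leaf is unsupported (or this is not an
   x86 build), the corresponding bits are treated as zero. */
static fpu_features query_features(void)
{
    unsigned eax = 0, ebx = 0, ecx = 0, edx1 = 0, edx_ext = 0;
#if defined(__i386__) || defined(__x86_64__)
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx1))
        edx1 = 0;
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx_ext))
        edx_ext = 0;
#endif
    (void)eax; (void)ebx; (void)ecx;
    return decode_features(edx1, edx_ext);
}
```

Keeping the bit decoding separate from the CPUID call makes the decoding easy to check with fixed inputs, independent of whatever machine the code happens to run on.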

It would be _really, really, really_ nice if someone could make the 
whole thing work as intended.  It's quite possible with the free Unices 
and gcc, especially since Linux / *BSD don't bother too much about pre-
80386 chips.  I want to look at it, but I won't be able to start for 
a few months due to other commitments (and lack of understanding of the
relevant gcc / Linux / glibc code, but I'm working on it).

If there's interest, I'll try to convince Dr. Kahan to post this paper 
on-line.  It's a nice outline of the issues involved.

Jason

