This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: SH optimized software floating point routines


Quoting Christian Bruel <christian.bruel@st.com>:

> About the other part of your answer, not supporting SNaNs in
> fp-bit.c, it is a possibility that I didn't consider in my fix. This
> restriction is quite a surprise to me because, related to NaNs, it is
> not what I guess from the implementation of the fp-bit.c's isnan
> function that does check for CLASS_SNAN, and CLASS_QNAN.

Well, it looks like a classic top-down implementation, carving up the problem into little sub-problems, and then not implementing some of these, so that the case distinction between CLASS_SNAN and CLASS_QNAN becomes pointless.

> See for example the result of
>
> static int misnanf(float v)
> {
>   return (v != v);
> }
>
> called with either a QNaN or a SNaN. IMO the assembly model should have
> the same semantics as the C model, which is not the case today.

I would consider the exact bit patterns used for NaNs an implementation detail, which the user should not need to care about. We only implement QNaNs. fp-bit.c recognizes all NaN patterns, but treats them all as QNaNs.
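To make the test case concrete, here is a minimal sketch in C, assuming a host with IEEE 754 single precision and a compiler that is not run with -ffast-math. The helper float_from_bits and the two pattern macros are hypothetical names introduced for illustration; they are not taken from fp-bit.c or the SH sources. Which of the two patterns is the quiet and which the signaling encoding depends on the target's convention, so the macro names refer only to the state of the top mantissa bit.

```c
#include <stdint.h>
#include <string.h>

/* The comparison-based NaN test from the quoted message. */
static int misnanf(float v)
{
  return (v != v);
}

/* Hypothetical helper: reinterpret a 32-bit pattern as a float
   without violating aliasing rules. */
static float float_from_bits(uint32_t bits)
{
  float f;
  memcpy(&f, &bits, sizeof f);
  return f;
}

/* Two NaN encodings: exponent all ones, mantissa non-zero.
   They differ only in the top mantissa bit (bit 22). */
#define NAN_TOP_BIT_SET   0x7fc00000u
#define NAN_TOP_BIT_CLEAR 0x7fa00000u
```

Under these assumptions misnanf(float_from_bits(NAN_TOP_BIT_SET)) and misnanf(float_from_bits(NAN_TOP_BIT_CLEAR)) should both be non-zero, since a comparison involving any NaN is unordered. A software comparison routine that recognizes only one of the two encodings would give different answers for the two calls.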

> Using -fsignaling-nans and eventually putting #ifdef __SUPPORT_SNAN__
> around the checking doesn't change anything since the same call is done
> to the floating point comparison function, that really needs to check
> for both formats.

Considering that the signals don't work, wouldn't a better implementation of -fsignaling-nans be to issue a diagnostic, in sh.h OVERRIDE_OPTIONS, when the option is used with a software floating point ABI? And somehow make using __builtin_nans / __builtin_nansf give a diagnostic, too.

Unless you want to go further and really implement the signals.
I suppose you could use config/soft-fp for that.

> If you are concerned about the extra cycles needed

Both cycles and bytes.


> in the nesf2f implementation (which is nothing anyway compared to the C
> model),

fp-bit is so slow that it can't be taken seriously as a benchmark for software floating point emulation speed. The point of having a hand-optimized assembly version is that you actually can show reasonable performance for codes with light fpu usage, compared to a processor with hardware floating point (which needs more die space and power, and might not clock as high as the fpu-less version). IIRC some EEMBC benchmarks are in that class, i.e. with the hand-optimized software floating point they run several times faster than with fp-bit, but going all the way to hardware floating point then gives diminishing returns.

> we could certainly provide a specialized one just for
> -fsignaling-nans.

You'd also have to handle the other comparisons. grep for F_NAN_MASK in ieee-754-sf.S / ieee-754-df.S.

The original intent was that the faster & more compact NaN check would
be available for all the software emulation code, although I used a more
inclusive check if I saw it could be done with the same cycle count.
I can't remember if I ended up using the mask check anywhere but in
ieee-754-sf.S / ieee-754-df.S .

If you want all possible IEEE NaN patterns to be honoured, someone should
review all the NaN checks in the config/sh/IEEE-754/m3 directory...
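As an illustration of the trade-off between a compact mask check and a fully inclusive one, here is a sketch in C operating on raw single-precision bit patterns. The constant HYP_NAN_MASK and both function names are hypothetical; the actual value and use of F_NAN_MASK in ieee-754-sf.S / ieee-754-df.S are not shown in this thread, so this should not be read as a description of that code.

```c
#include <stdint.h>

/* Hypothetical mask: exponent all ones plus the top mantissa bit. */
#define HYP_NAN_MASK 0x7fc00000u

/* Compact check: a single AND and compare. It accepts only NaN
   patterns whose top mantissa bit is set. */
static int nan_by_mask(uint32_t bits)
{
  return (bits & HYP_NAN_MASK) == HYP_NAN_MASK;
}

/* Inclusive check: drop the sign bit, then anything above the
   infinity pattern 0x7f800000 is a NaN, whatever its mantissa. */
static int nan_full(uint32_t bits)
{
  return (bits & 0x7fffffffu) > 0x7f800000u;
}
```

With these definitions both functions accept 0x7fc00000 and reject the infinity pattern 0x7f800000, but only nan_full accepts 0x7fa00000, a NaN whose top mantissa bit is clear. Patterns of that kind are the ones a mask-style check would miss.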

