This is the mail archive of the mailing list for the GCC project.


Re: The Alpha and denormalized numbers...

Iain McClatchie wrote:

> I don't think there is a problem here, unless you use a single
> result of erfc() a lot.

I'm actually doing Monte Carlo simulations, and therefore am using a
whole range of parameters (generated from random numbers).  As a result,
I keep running into occasional SIGFPE crashes.  I would imagine this
sort of thing is going to eventually sneak up on anyone who is doing
computations on the Alpha using the GNU math library, as its
software-simulated functions all seem to be able to return denormals.
For an exp() example:

#include <stdio.h>
#include <math.h>

int main(void){
  double lValue1,lValue2;

  //Okay -- normal
  printf("First test...\n");
  lValue2 = -708.0;
  lValue1 = exp(lValue2)*0.5;

  //Okay -- zero
  printf("Second test...\n");
  lValue2 = -746.0;
  lValue1 = exp(lValue2)*0.5;

  //Crashes -- denormalized
  printf("Third test...\n");
  lValue2 = -709.0;
  lValue1 = exp(lValue2)*0.5;

  return 0;
}
> You compile without -mieee.  GCC sets the FPCR to flush outputs
> to zero.  You make a call to erfc().  It generates a denormal.
> That denormal is used in subsequent calculations.  Each time it
> is used, an exception is taken, which causes the kernel to run
> the FP instruction with the input flushed to zero.

That's what my idea was (i.e. set the appropriate register to zero and
restart the instruction).  That's not what is happening now.  Right now
the program just crashes with a SIGFPE (try the above program).

> The exception takes time, but the erfc() call takes lots of
> time too.  Unless you take a bunch of exceptions, isn't the erfc()
> call going to dominate?

Yup.  But if -mieee is set, then all subsequent operations are going to
generate exceptions as well (i.e. in the above code, everything that
uses lValue1 after the last statement will generate an exception and a
software-simulated FPU operation).  This would accumulate into a
reasonable amount of time spent doing software emulation.  If what I was
talking about was implemented, only one exception would happen (due to
the *0.5), the input would be zeroed, and the FPU could do the rest
(since it never generates denormals).

> Remember that GCC-generated code never actually sees the
> exception.  That's the kernel.  Is the idea here that GCC
> should somehow note that an exception was taken, and zero
> the argument permanently so that future operations with that
> argument will not take an exception?
> That's a kernel change, not a GCC change.

I guess I'm misunderstanding something here then.  How does GCC
implement -mieee if it doesn't add some startup code to the program to
hook the FP exception and simulate FPU operations on denormal numbers
through the hooked exceptions?  Is there an Alpha specific system call
to tell the kernel to do all this work?  Some bits in the ELF headers of
the executable?  Something else?

> GCC, or you, could instead check the output of every erfc() call,
> and flush to zero explicitly.
>   t = erfc(); if( t < min_float ) t = 0;

I just remembered that the Alpha defaults to imprecise exceptions in
order to execute code faster.  As a result, my idea for a GCC solution
would still require precise exception handling to be enabled, and thus
still slow the entire program down.  So, yup, this is actually the best
option.

However, I think glibc should do this if it detects that the code was
compiled without -mieee (examine an exported symbol?).  Otherwise,
anyone who is doing numerics has to either put in hack code to check the
returned values of all the floating point math functions that are
implemented in software (so much for architecture independent code), or
accept much slower performance with -mieee...

Too bad the glibc guy(s) (only one responded, and he shot me down) don't
think this is a good idea.  Somehow crashing on such things as
exp(-722.0)*0.5, but not on exp(-708.0)*0.5 or exp(-746.0)*0.5 just does
not seem to be acceptable default behavior to me (even if the Alpha's
FPU is not 100% IEEE 754 compliant).  Especially on an architecture that
is widely believed to be well supported, and heavily utilized in
computationally expensive floating point applications (I ran into this
on our local Beowulf cluster) *sigh*.


 Tyson Whitehead  ( -- WSC 140-)
 Computer Engineer                          Dept. of Applied Mathematics,
 Graduate Student- Applied Mathematics      University of Western Ontario,
 GnuPG Key ID# 0x8A2AB5D8                   London, Ontario, Canada
