code optimizations and numerical research
Peter Jay Salzman
p@dirac.org
Mon May 16 14:47:00 GMT 2005
Hi all,
I'm a physicist doing research on quantum gravity. Although I'm good at
writing programs to solve all kinds of non-linear PDEs, ODEs, and integral
equations, I'm not a computer scientist and not savvy with a lot of lingo
and warnings in the gcc man/info pages. I'd like to ask for help with code
speed optimization options in gcc.
The most effective optimizations for the code I'm currently writing are:
-O3 -funroll-loops -march=athlon-xp -ffast-math
That last option, -ffast-math, is what I'd like to ask about. According to
the man page:
can result in incorrect output for programs which depend on an exact
implementation of IEEE or ISO rules/specifications for math functions.
and according to the GCC Complete Reference:
Certain calculations are made faster by violating some of the ISO and
IEEE rules. For example, with this option set it is assumed that no
negative values are passed to sqrt() and that all floating point values
are valid.
First and foremost, my code must generate correct output --- this is science
that's helping to lay the groundwork for a future theory of quantum gravity,
not a LucasArts SCUMM emulator. :)
So what happens when the assumptions mentioned above are NOT true? For
example, in this code:
#include <math.h>
#include <stdio.h>

int main(void)
{
    double d = sqrt(-1.0);
    printf("%f\n", d);
    return 0;
}
the behavior appears to be the same whether it's compiled with -ffast-math
or not: it simply prints "nan".
I've Googled and Googled, but everything I've found on GCC code
optimizations (like "-fno-signaling-nans") appears to simply quote the man
and info pages. I'm not finding a "dummy's" guide to picking code
optimization. Even the GCC Complete Reference book is very skimpy on the
details of code optimization.
The man page claims that "-ffast-math" may produce wrong results for
programs that depend on "an exact implementation of IEEE or ISO
rules/specifications for math functions."
What exactly does this vague sentence mean?
In my code, I do use errno, but it's for my own personal "die" function,
like when the program attempts to open a non-existent parameter file. I can
easily write errno out of my code.
I also enable and catch certain floating point exceptions which appear to be
disabled by default(!) like
* FE_DIVBYZERO division by zero
* FE_UNDERFLOW result not representable due to underflow
* FE_OVERFLOW result not representable due to overflow
* FE_INVALID invalid operation
Unlike errno, which is expendable, I would definitely like to be able to
catch these FPEs. There are a lot of powers of 10^-34 and 10^-11 and 10^-31
in my code, and even when the equations are scaled, I really do need the
program to come to a grinding halt when anything becomes inf or nan, or
underflows/overflows.
Lastly, after some experimentation, I found that it's actually the
combination of "-fno-math-errno -funsafe-math-optimizations" (both enabled
by -ffast-math) that really makes my code fly. I'm talking about a speedup
of almost 400%!!!
But, here again, the documentation describes what
-funsafe-math-optimizations does (violates IEEE and ANSI standards, assumes
values are valid, drives the hardware FPU in non-standard ways) but doesn't
tell me what I really care about: under what circumstances (that an educated
layman like me will understand) will this option generate incorrect output?
The number of optimization options is dizzying. Any help would be greatly
appreciated!
Thanks!!!
Pete
--
Every theory is killed sooner or later, but if the theory has good in it,
that good is embodied and continued in the next theory. -- Albert Einstein
GPG Fingerprint: B9F1 6CF3 47C4 7CD8 D33E 70A9 A3B9 1945 67EA 951D
More information about the Gcc-help mailing list