This is the mail archive of the mailing list for the GCC project.


Re: What is acceptable for -ffast-math? A numerical viewpoint

Wolfgang Bangerth wrote:

> Due to repeated demand: here is the opinion of a numerical analyst.

And here's a second opinion from a numerical analyst (me).

> To further define my position: there is some kind of schism in the numerics
> society, the "traditionalists" doing Fortran with highly tuned codes and
> relatively simple data structures, and the "modernists" using extremely
> large C++ programs on complex data structures. I belong to the latter
> group and can only speak for them. I know that the former group has
> different views on these topics.

I'm also a member of the latter group.

> My opinion is that -ffast-math could well include slight deviations from
> IEEE modes for denormals, rounding modes, associative redistribution, etc.

For me this is true also.

> The reasons being:
> - denormals: if you run into the risk of getting NaNs one way but not
>   the other, you most certainly already have gone beyond the limits of
>   your program, i.e. your linear solver diverged, you have chosen an
>   inherently unstable formulation for your problem, etc. In practice, if
>   you don't get your NaN on this operation, you will get it some operations
>   later, and if not then the result is spoilt anyway because of the
>   instability. I cannot imagine someone writing a code which runs on
>   a well-defined side of the boundary between normal and denormalized
>   values and still returns reasonable results. And even if such a program
>   existed, then that person must have known exactly what he was doing and
>   will refrain from -ffast-math and use -mieee instead.

Agreed.

> - rounding modes: if it makes a difference, then again you are already
>   in an unstable regime.
>   Then why bother: I know that I only solve an
>   approximation of the true laws of nature (e.g. after discretizing a
>   partial differential equation), why solve that to 16 digits of accuracy?

I think you are mixing two different things here.

To make things a little more precise, you have to look at the different
errors you're introducing when numerically solving a problem:

(1) modeling error (difference between the (continuous) model and reality,
    including noisy signals etc.),
(2) discretization error (difference between the continuous problem and its
    finite-dimensional, discretized approximation),
(3) round-off error (all kinds of errors that stem from the fact that you
    have to approximate real numbers by floating point numbers with finite
    precision).

The point is: If the errors caused by (3) are smaller than the errors
in (1) and (2) then you're fine with -ffast-math.

With ill-conditioned problems this is not always the case.
You might get 2 or 3 correct digits out of your computation
if you are very careful with rounding, but you might lose
them if you don't do it right. But then: use -mieee.

> - association: Fortran codes differ significantly from C++ codes in
>   some ways, not the least are that C++ codes are often split into many
>   very small parts. In these cases, I have no control over how the
>   operands associate, if they come from different functions. For example,
>   if I have (a very much simplified situation)
>     double x, y;
>     double f () { return x; };
>     double g () { return y; };
>     double h () { return x; };
>     ...
>     double z = f() * g() / h();
>   then I'd be happy if I get z==y, although that might not be right
>   from an IEEE viewpoint, or if f()+g()+h() would be turned into
>   2*x+y. If you think that this example is unrealistic,
>   think about using the C++ valarray<> class with its small functions
>   combined with a case where some of the function's arguments are constant
>   and known so that most of the operations in the body of the functions
>   can be optimized away. If the compiler can help optimize such cases
>   across the boundaries of single functions, then that would be great, but
>   it will often require violating associative laws.
>   Note that this is a case where C++ differs significantly from Fortran:
>   there, you usually have large functions and optimizations take place
>   within these.

Agreed.

In C++ you often don't want to tune your code for speed but write it
for readability instead. It would be nice if the compiler did the tuning.
Deciding whether the compiler may do this in an aggressive manner or not
should be left to the user. He is the only one who can make that decision,
depending on whether the additional errors are tolerable or not.
Those who cannot make this decision beforehand could stick to harmless
optimizations or just try the effects of -ffast-math. (They probably
wouldn't write code that produced tolerable results with IEEE and
intolerable ones without, so strictly sticking to IEEE is of no use
to them anyway.)

I opt for quite aggressive (and well documented) optimizations with
-ffast-math (with a big warning in the documentation). But -ffast-math
should not be turned on with -O2 by default. Maybe with -O3, but
I think it's best to leave it to the user completely.

Probably even better: make different levels of optimization where one
can decide how much 'sticking to IEEE' (I don't want to say 'safety'
here) one is willing to trade for speed.
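GCC already exposes some of this granularity. The flags below are real GCC options (-ffast-math is an umbrella that enables several finer-grained ones); the file name solver.cc is of course just a placeholder, and which sub-flags a given release supports should be checked against its manual:

```shell
# Strict IEEE-conforming behaviour (the default for the math options):
g++ -O2 solver.cc -o solver_ieee

# Everything at once, with the documented caveats:
g++ -O2 -ffast-math solver.cc -o solver_fast

# A middle ground, enabling only selected relaxations:
g++ -O2 -fno-math-errno -fno-trapping-math solver.cc -o solver_mid
```

A user who cannot tolerate -ffast-math wholesale can thus already pick individual relaxations, which is close to the graded scheme proposed above.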

I see that this might be difficult if you want to compile a library:
some people might want the fastest version, others the one that
sticks to IEEE. If the source code of the library is available, you
can compile two versions if you need/want to. So even in that case
it's better to have the choice instead of having to stick to IEEE.
If you don't have the source code, then you are dependent on what
the vendor chooses. But hopefully he will make the right decision
(you trusted him with other decisions, didn't you?).

My conclusion is: Give the user the option to switch on/off quite
aggressive optimizations/transformations whenever he pleases.

> So, concluding: if you have programs that run for several days, you'd be
> happy if you could cut that time by some hours using optimizations that
> - still do what common sense would dictate (IEEE is not pure common sense,
>   but rather a definition), i.e. swapping operands would be allowed, but
>   replacing NaNs by zero would not
> - can be switched off in case I know I have to treat boundary cases
>   exactly.
> I myself would certainly appreciate a compiler turning a*c+b*c into
> (a+b)*c, I'd definitely not be shocked about that.

Agreed.

Volker Reichelt
