This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: What is acceptable for -ffast-math? (Was: associative law in combine)



> The _only_ person who can judge whether it's wrong to do so is the person
> who wrote the program. He may be happy with the faster code. He's almost
> certainly happy if he used "-ffast-math".

Well, it sure would be nice to hear from some of these mythical numerical
programmers (I don't care if they are writing games or nuclear reactor codes)
who would be happier. So far we haven't heard from any! And in my experience,
even quite inexperienced floating-point numerical programmers are very
disturbed when optimization changes the results of their programs.

> Your arguments about "numerical computation" are just silly, as you don't
> seem to realize that there are tons of problems where your theoretical
> issues are nothing more than noise.

If you think the arguments are silly, then I really fear you lack the full
context for this discussion, a discussion that has, as you should know, raged
for well over thirty years.

> Go back and look at why it was added. Right - exactly so that people could
> get _fast_ math, when they do not care about the inexactness that isn't
> even an issue for most people.

Sure, -ffast-math is precisely intended to allow transformations that would
not otherwise be allowed (let's not call them optimizations, that's just
too controversial a word in the context of this argument).

The question is what is the boundary of allowable transformations. No one
agrees that there should be no boundaries (even you don't like changing
results to zero, though note that abandoning denormals has exactly this
effect, and might be considered acceptable).

So, what is the boundary? Can one, for instance, forget about denormals and
flush to zero to save a bit of time? Can one truncate instead of round?
Can one ignore negative zeroes, or infinity semantics? Can one ignore
intermediate overflow? (Note: nearly all the discussed transformations are
implicitly saying yes to this last question.)

I have not seen anyone writing from the point of view of serious numerical
coding saying "sure, go ahead and transform a*b + a*c, I write that all
the time and would be happy if the compiler would speed it up in the
obvious manner". So there is real doubt as to whether this particular
transformation (which can introduce overflow where none existed before)
is one that we can agree to include.

It seems to me that the criteria for including a transformation are

  a) it is well understood and clear
  b) it really makes a difference in performance
  c) it does not introduce surprises

Unfortunately, these are often in conflict. For instance, in C, we usually
degrade the precision of the x86 floating-point unit to minimize surprises,
even though operating in full precision would be more efficient in some
cases. Should -ffast-math allow full precision operation? I would think so,
since it definitely improves performance, and reduces surprises.

Similarly, I would go for ignoring accuracy in negative zero handling, since
that so clearly meets criteria a) and c). 

But really it would be very valuable to have quantitative data that show that
a particular optimization is worthwhile. I will say again that the particular
case that has been discussed, which is associative law redistribution, has not
carried that burden for me AT ALL. It is a dubious transformation, violating
condition c) above, and you can see that it is dubious, given the reaction of
many, not just me, on this list. Furthermore, promoting something like this
gives an impression to serious numerical analysts that "those GCC guys" simply
don't understand the requirements of floating-point arithmetic, and that's a
cost one does not want to pay freely unless one is sure of getting
something in return.

We can't look to the standard here, because the question is one of taste. 
Even in Ada, in relaxed mode (the equivalent of -ffast-math, but acknowledged
and defined somewhat in the standard), the standard leaves these kinds of
decisions to the implementor. However, I cannot imagine any Ada implementor
being as cavalier as you would suggest.

By the way, I said I would be shocked to find a Fortran compiler that did
associative redistribution in the absence of parens. I am somewhat surprised
that no one stepped forward with a counter-example, but I suspect in fact that
there may not be any shocking Fortran implementations around.

It is an old argument, the one that says that floating-point is approximate, so
why bother to be persnickety about it. Seymour Cray always took this viewpoint, and it
did not bother him that 81.0/3.0 did not give exactly 27.0 on the CDC 6000
class machines.

But the advent, and essentially universal adoption, of IEEE arithmetic really
signifies that Seymour's viewpoint (and yours in this discussion) has lost the
argument and that indeed we want floating-point to be well behaved even though
we know many programmers won't know what this means. Our experience is that
even these inexperienced programmers gain from a better floating-point model,
and that, surprise! the cost in hardware to give such a model turns out to be
minimal.

Now compiler writers always get carried away with the supposed importance
of miscellaneous optimizations, and every compiler writer has the experience
of working hard on some optimization, only to find that it has little or
no effect.

If indeed we could show that associative redistribution had a large effect
on important programs (be they games, SPEC tests, or nuclear reactor codes),
then we would have a different situation from the hardware folks. But I suspect
that careful measurement would show that we are in the same boat, and that the
cost of providing decent floating-point semantics is too small for it to be
worth worrying about very much.


Robert

