[Bug rtl-optimization/50904] [4.7 regression] pessimization when -fno-protect-parens is enabled by -Ofast

Fri Dec 2 17:07:00 GMT 2011

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50904

--- Comment #34 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-12-02 17:06:57 UTC ---
(In reply to comment #31)
> Ok, which is, I suppose, a bug in both compilers.

Kind of, though -ffast-math by itself is already on the verge of violating the
standard. I think -fno-protect-parens could be enabled by -ffast-math, as that
flag means that one does not really care about the exact value.

However, there are users who want to have one or the other. Namely, most
users are happy with a default -fno-protect-parens but don't dare to use
-ffast-math, while others want to have -ffast-math optimizations but with
honored parentheses.

If we want to add extra protection for
     tem = x + 2.d0**52
     x = tem - 2.d0**52
we probably need to add yet another flag, as there are surely users who want
to have protected parentheses but allow for optimizations in the 'tem' case.
[Even if, as this PR shows, the extra optimization opportunity might lead to a
missed opportunity.] In any case, handling that well for function calls,
inlining and the scalarizer seems to be difficult. And frankly, I am not sure
whether there is any user; -ffast-math plus -fprotect-parens is already special
(cf. comment 32 for one user). Having -ffast-math plus parentheses plus
protected assignments might have even fewer users.
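
For reference, a rough sketch of why the 'tem' pair above is value-sensitive.
It is written in Python only because Python floats are IEEE 754 doubles and are
never reassociated; it says nothing about what GCC actually emits. The function
names are made up for this illustration. Evaluated in strict order, the pair
rounds a nonnegative x below 2**52 to the nearest integer (ties to even); if
the two constants are folded together, x comes back unchanged.

```python
# Illustration only: shows the arithmetic effect of evaluating
#     tem = x + 2.d0**52
#     x   = tem - 2.d0**52
# in strict order versus after a value-unsafe reassociation.
# Function names are hypothetical; this does not model GCC itself.

BIG = 2.0 ** 52  # corresponds to 2.d0**52 in the Fortran snippet


def round_via_temp(x):
    """Strict evaluation: the sum is rounded to a double before subtracting."""
    tem = x + BIG   # spacing of doubles in [2**52, 2**53) is 1.0
    return tem - BIG


def reassociated(x):
    """What the pair collapses to if treated as x + (BIG - BIG)."""
    return x + (BIG - BIG)


if __name__ == "__main__":
    for x in (0.3, 1.5, 2.5, 1234.75):
        print(x, round_via_temp(x), reassociated(x))
```

The two functions are algebraically identical, so whether the assignment to
'tem' acts as a barrier decides which of the two results the user gets.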

I believe most users simply use -O2, -O3 [-ffast-math], or -Ofast without
thinking (very) much about the options. [I also typically use either -O2, -O3
or -Ofast.]

* * *

Back to the comment 0 issue: I still do not quite understand what the double
evaluation (on tree level) of __builtin_pow in
  D.1959_82 = ((D.2115_81));
  D.1960_83 = __builtin_pow (D.1959_82, 2.0e+0);
  D.1978_168 = __builtin_pow (D.2115_81, 2.0e+0);
has to do with the -Ofast slow-down. If I have understood it correctly, there
is no reason for it at the tree level, while the slow-down happens at the RTL
level. That -fprotect-parens makes it faster is a mere coincidence. Is that a
roughly correct summary?