The attached code is compiled incorrectly with every GCC version I could try. The code tries to do a floating point multiplication with a non-default rounding mode and then switches the rounding mode back to normal before checking the result. The result is wrong even if -frounding-math is specified on the command line. I tracked it down to tree-ssa-ter used during RTL expansion. The multiplication is detected to be a replaceable statement and is thus emitted after the floating point rounding mode has been reset back to normal. That simple hack fixes it : --- a/gcc/tree-ssa-ter.c +++ b/gcc/tree-ssa-ter.c @@ -429,8 +429,10 @@ is_replaceable_p (gimple stmt, bool ter) && !is_gimple_val (gimple_assign_rhs1 (stmt))) return false; - /* Float expressions must go through memory if float-store is on. */ - if (flag_float_store + /* Float expressions must go through memory if float-store is on. + Also, don't move float operations around if we're not sure the + rounding mode never changes. */ + if ((flag_float_store || flag_rounding_math) && FLOAT_TYPE_P (gimple_expr_type (stmt))) return false; However it's clearly not the optimal solution, as we might be able to detect that the rounding didn't change. AFAIU, tree-ssa-ter doesn't allow to replacements spanning calls, so it might be better to extend that logic to consider asm and builtins as replacement barriers when in flag_rounding_math mode. What do you think?
Created attachment 23779 [details] Failing code
Code motion optimizations (or even CSE) have no idea about dynamic rounding mode changes, -frounding-math does not change this (this switch is only guarding expression simplifications that usually assume round-to-nearest behavior). The only mode for which I'd accept such band-aid patches is -O0. Anyway, confirmed, but long-time known with no idea when or even how this is going to be fixed. Btw, your testcase would be kindof invalid as you are not using the documented standard way of accessing fenv but using inline-asm (and we don't have a special clobber that tells GCC you touched the FP control word). As for CSE, try /* Round to upward. */ tem1 = a + b; /* Round to nearest. */ tem2 = a + b; diff = tem1 - tem2; and see it optimized to zero at compile-time. I suppose there is a duplicate bug somewhere about the general issue.
In bug #34678 I thought you were agreeing with Joseph that -frounding-math had a more general meaning than the one you are expressing here, but I get your point that a more general solution should be implemented. Regarding the validity of the testcase, I took glibc's implementation of fesetround that I inlined. I did that because I saw that /usr/include/fenv.h has provisions for including inline versions of the fenv routines, although it's not the case yet for x86. So I guess it's possible that someday you get exactly that situation even though people are using the standard routines to switch rounding mode. In fact my private port does exactly this. Can you think of other places than CSE that would exhibit these issues? Unfortunately, I have to put some band-aids into place for our numerical analysts to be able to do their jobs.
On Mon, 28 Mar 2011, rguenth at gcc dot gnu.org wrote: > Btw, your testcase would be kindof invalid as you are not using the > documented standard way of accessing fenv but using inline-asm (and > we don't have a special clobber that tells GCC you touched the FP > control word). You mark the asm as reading and clobbering fpcr - that register name already exists in GCC, and it's a bug that glibc's fpu_control.h doesn't use it, though I suppose you'd need to add the "mxcsr" name as well. When -frounding-math is actually properly implemented it ought to be taught about target-specific registers representing rounding modes (and likewise exception status) so it knows how asms interact with the floating-point state. (Non-const, non-pure function calls will also need to be presumed to interact with the state in unknown ways since they may end up calling fenv.h functions.)
. *** This bug has been marked as a duplicate of bug 34678 ***