In 482.sphinx3 we have code like

float foo (float x, float y)
{
  return ((int)(x/y + 0.5)) * y;
}

where the addition of 0.5 is performed in double precision. With -funsafe-math-optimizations we can demote 0.5 to single precision (it's exactly representable), also because the result of the addition does not take part in further floating-point computation but is immediately converted to int.

The unsafe part of this optimization occurs when x/y is FLT_MAX and we'd truncate to a 64-bit integer type. For 32-bit integers it would probably be safe to do this optimization unconditionally.

On Fri, 4 Mar 2011, rguenth at gcc dot gnu.org wrote:

> In 482.sphinx3 we have code like
>
> float foo (float x, float y)
> {
>   return ((int)(x/y + 0.5)) * y;
> }
>
> where the addition of 0.5 is performed in double precision. With
> -funsafe-math-optimizations we can demote 0.5 to single precision
> (it's exactly representable), also because the result of the addition
> does not take part in further floating-point computation but is
> immediately converted to int.
>
> The unsafe part of this optimization occurs when x/y is FLT_MAX
> and we'd truncate to a 64-bit integer type. For 32-bit integers
> it would probably be safe to do this optimization unconditionally.

No, that's not safe unconditionally; consider x/y == 0x800001p0f (8388609), for example. Adding 0.5f in single precision gives 8388609.5, which is not representable as a float, so in the default round-to-nearest mode the sum rounds up to the next integer, 8388610. In double precision the sum is exact, and conversion from floating point to integer in C always rounds towards zero, giving 8388609. The two forms therefore produce different integers even though the value is well within the range of a 32-bit int.
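To make the counterexample concrete, here is a small sketch of the two variants. The helper names round_via_double and round_via_float are made up for illustration and are not from sphinx3 or GCC. It assumes an IEEE 754 target in the default round-to-nearest mode; the volatile store is only there to force the sum to be rounded to float on targets that evaluate with excess precision.

```c
#include <stdio.h>

/* Original pattern: the addition is carried out in double precision,
   then the result is truncated towards zero by the conversion to int. */
static int round_via_double(float q)
{
    return (int)((double)q + 0.5);
}

/* Pattern after demoting 0.5 to float: the sum is rounded to float
   (round-to-nearest by default) before the truncating conversion. */
static int round_via_float(float q)
{
    volatile float sum = q + 0.5f; /* force rounding to single precision */
    return (int)sum;
}

/* Print both results for one input so they can be compared. */
static void show(float q)
{
    printf("%a: double=%d float=%d\n",
           (double)q, round_via_double(q), round_via_float(q));
}
```

Under those assumptions, calling show(0x800001p0f) should report 8388609 for the double variant and 8388610 for the float variant, while an input such as 2.25f gives 2 in both.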

The Intel compiler does not perform this optimization even at -fast. It does perform the demotion on

float foo (float x, float y)
{
  return (int)((float)(x/y + 0.5)) * y;
}

though, even with default optimization (also with the conversion to int removed, or associated to apply to the first operand of the multiplication only). So they leave alone what looks like a usual "rounding" pattern.

My original idea was to fold (int)((double)(x/y) + 0.5) to (int)(x/y + 0.5f), similar to the folding of (float)((double)(x/y) + 0.5) to (x/y + 0.5f), which we already do (at -O0, in convert_to_real).
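For contrast, a rough sketch of the fold that is already done when the result is converted back to float. The function names and the sample values are invented for this illustration. It assumes IEEE 754 float and double, where rounding the sum first to double and then to float is expected to match a single rounding to float; the sketch only spot-checks a few inputs and proves nothing in general.

```c
#include <stddef.h>

/* Unfolded form: add in double, then round the result to float. */
static float add_half_via_double(float q)
{
    return (float)((double)q + 0.5);
}

/* Folded form: add directly in single precision. */
static float add_half_in_float(float q)
{
    volatile float sum = q + 0.5f; /* force rounding to single precision */
    return sum;
}

/* Count how many of a few hand-picked inputs give different results. */
static int count_mismatches(void)
{
    static const float samples[] = {
        0.0f, 0.25f, 1.5f, -3.75f, 0x7fffffp0f, 0x800001p0f, 1e30f
    };
    int mismatches = 0;
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        if (add_half_via_double(samples[i]) != add_half_in_float(samples[i]))
            mismatches++;
    return mismatches;
}
```

The difference from the (int) case is that here both forms end with a round-to-nearest conversion to float, whereas the (int) conversion truncates, so the intermediate rounding to float becomes visible.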