[Bug middle-end/38960] Wrong floating point reorder

rguenther at suse dot de gcc-bugzilla@gcc.gnu.org
Tue May 8 07:52:00 GMT 2018


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38960

--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 4 May 2018, joseph at codesourcery dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38960
> 
> --- Comment #5 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
> Since any non-const function can examine floating-point state, I'd expect 
> significant effects on code generation.  (Whether this also applies to 
> asms depends on the architecture; some architectures have a register name 
> you can use in asm operands to refer to floating-point state, and in those 
> cases asms reading or writing that state "should" say explicitly that they 
> do so, but I don't think all architectures have such a name supported by 
> GCC in asms.)

That's true.  GCC's job would then be to prove and IPA-propagate knowledge
of which functions actually do access FP state.

If it actually works (still needs to be proven by experiment) it is
still the simplest approach for "fixing" the issue.  If it works,
a first enhancement would be to not re-use returns_twice but to invent
a new attribute so we can do more careful abnormal edge creation.

An alternative fix could involve forcing all FP computation results
to (addressable aka aliasable) memory and making FP state accesses
also access all (FP?) memory.

Alternatively all FP ops could be "lowered" to internal functions
and thus basically hidden from the optimizers.  Dependences to
FP state accessors can be handled as memory dependence then.  This
lowering would be similar to what is proposed for a -ftrapv replacement.
The issue then remains on the RTL side though (but maybe we're lucky
and re-ordering doesn't happen there and/or we could expand suitable
barriers before and after possible FP state accesses).

Another alternative would be to try to model the FP state explicitly.
With the right infrastructure this would allow modeling other CPU
state (CC flags) in a similar way.

I think that the force-to-memory variant isn't really worth exploring
since it involves a lot of engineering with questionable benefit
over the "simple" solution(s).

