This is the mail archive of the gcc-patches at gcc dot gnu dot org
mailing list for the GCC project.
Re: patch to fix rtl documentation for new floating point comparisons
- From: Kenneth Zadeck <zadeck at naturalbridge dot com>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: Richard Earnshaw <Richard dot Earnshaw at foss dot arm dot com>, Paolo Bonzini <bonzini at gnu dot org>, gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 18 Feb 2015 07:59:23 -0700
- Subject: Re: patch to fix rtl documentation for new floating point comparisons
- Authentication-results: sourceware.org; auth=none
- References: <54D9063D dot 1000102 at naturalbridge dot com> <alpine dot DEB dot 2 dot 10 dot 1502092223140 dot 32543 at digraph dot polyomino dot org dot uk> <54D95C03 dot 1090904 at naturalbridge dot com> <alpine dot DEB dot 2 dot 10 dot 1502102141060 dot 31125 at digraph dot polyomino dot org dot uk> <54DFAF8C dot 6000906 at gnu dot org> <54E0C471 dot 1090704 at naturalbridge dot com> <54E312E4 dot 707 at foss dot arm dot com> <alpine dot DEB dot 2 dot 10 dot 1502171155440 dot 26958 at digraph dot polyomino dot org dot uk> <54E3DA03 dot 5010901 at naturalbridge dot com> <alpine dot DEB dot 2 dot 10 dot 1502180029110 dot 26150 at digraph dot polyomino dot org dot uk> <C0E9884B-396F-4DC7-B7FE-973FFD72FF50 at naturalbridge dot com> <alpine dot DEB dot 2 dot 10 dot 1502180958160 dot 8477 at digraph dot polyomino dot org dot uk>
> On Feb 18, 2015, at 3:23 AM, Joseph Myers <firstname.lastname@example.org> wrote:
>> On Tue, 17 Feb 2015, Kenneth Zadeck wrote:
>> The fp exceptions raise some very tricky issues with respect to gcc and
>> optimization. On many machines, noisy does not mean to throw an
>> exception, it means that you set a bit and then check later. If you try
>> to model this kind of behavior in gcc, you end up pinning the code so
>> that nothing can be moved or reordered.
> When I say exception here, I'm always referring to that flag bit setting,
> not to processor-level exceptions. In IEEE 754 terms, an exception is
> *signaled*, and the default exception handling is to *raise* a flag and
> deliver a default result (except for exact underflow which doesn't raise
> the flag).
> To quote Annex F, "This specification does not require support for trap
> handlers that maintain information about the order or count of
> floating-point exceptions. Therefore, between function calls,
> floating-point exceptions need not be precise: the actual order and number
> of occurrences of floating-point exceptions (> 1) may vary from what the
> source code expresses.". So it is not necessary to be concerned about
> configurations where trap handlers may be called.
> There is as yet no public draft of TS 18661-5 (Supplementary attributes).
> That will provide C bindings for alternate exception handling as described
> in IEEE 754-2008 clause 8. I suspect such bindings will not readily be
> efficiently implementable using processor-level exception handlers; SIGFPE
> is an awkward interface for implementing such things at the C language
> level, some processors do not support such trap handlers at all (e.g. many
> ARM processors), and where traps are supported they may be asynchronous
> rather than occurring immediately on execution of the relevant
> instruction. In addition, at least x86 does not support raising exception
> flags without running trap handlers on the next floating-point instruction
> (raiseFlags operation, fesetexcept in TS 18661-1); that is, if trap
> handlers were used to implement standard functionality, it would need to
> be in a way such that this x86 peculiarity is not visible.
my point here is that what you want to be able to do is freely reorder the fp operations (within the rules of reordering fp operations) between places where those bits are explicitly read or cleared. we have no way to model that chain of modify operations in gcc.
>> to get this right gcc needs something like a monotonic dependency which
>> would allow reordering and gcc has nothing like this. essentially, you
>> need way to say that all of these insns modify the same variable, but
>> they all just move the value in the same direction so you do not care
>> what order the operations are performed in. that does not mean that
>> this could not be added but gcc has nothing like this.
> Indeed, this is one of the things about defining the default mode that I
> referred to; the present default is -ftrapping-math, but we may wish to
> distinguish between strict trapping-math (whenever exception flags might
> be tested / raised / lowered, exactly the computations specified by the
> abstract machine have occurred, which might mean rather more limits on
> code movement in the absence of monotonic dependencies) and loose trapping
> math (like the present default; maybe don't transform expressions locally
> in ways that add or remove exceptions, but don't treat an expression as
> having side effects or reading global state purely because of possible
> raising of floating-point exceptions).
>> going back to the rounding modes issue, there is a huge range in the
>> architectural implementation space. you have a few that are pure
>> dynamic, a few that are pure static and some in the middle that are just
>> a mess. a lot of machines would have liked to support fully static, but
>> could not fit the bits to specify the rounding modes into the
>> instruction. my point here is you do need to at least have a plan that
>> will support the full space even if you do this with a 1000 small
> I think the norm is dynamic, because that's what was in IEEE 754-1985,
> with static rounding added more recently on some processors, because of
> IEEE 754-2008. (There are other variants - IA64 having multiple dynamic
> rounding mode registers and allowing instructions to specify which one the
> rounding mode is taken from.)
the first ieee standard only allowed the dynamic model. the second allows the static model. while dynamic is more common, there are/were architectures that are fully static. i believe that the first SPARCs were fully static and this was why the standard changed. (i could be completely wrong on which arch was the first fully static.) the private port that i am working on is currently fully static, but i am trying to change that. code generation of a dynamic program on a fully static machine is gruesome.
my point here is that there are fully static machines so do not do anything that precludes this.
also remember that constant prop on the rounding mode can be a win. without knowing the rounding mode precisely, you cannot really do constant prop on the data. also the constant prop on the rounding mode can let you avoid a lot of code which sets that register. this can be important if the machine requires a cycle or two to settle that setting before the next fp operation.
> Joseph S. Myers