This is the mail archive of the mailing list for the GCC project.

Re: patch to fix rtl documentation for new floating point comparisons

> On Feb 17, 2015, at 6:01 PM, Joseph Myers wrote:
>> On Tue, 17 Feb 2015, Kenneth Zadeck wrote:
>>> On 02/17/2015 07:05 AM, Joseph Myers wrote:
>>>> On Tue, 17 Feb 2015, Richard Earnshaw wrote:
>>>> So the problem we have today is the compiler has no way to distinguish
>>>> between, say, < and __builtin_isless.  According to Annex F (c99) the
>>>> former should be signalling while the latter quiet.
>>> We do have a way: < is LT and __builtin_isless is !UNGE.
>>> __builtin_islessgreater is !UNEQ.  The question is whether it's also LTGT
>>> or whether LTGT means LT || GT.  And the existing documentation of
>>> LTGT_EXPR leaves this unspecified, which seems clearly unhelpful.  Either
>>> way, you have existing code in GCC that's incorrect (i.e. that does not
>>> correspond to the set of transformations that are actually valid for the
>>> chosen semantics).
>> Having spent the majority of my gcc life at the rtl level, i find it
>> astonishing that the designers of the tree level allowed this machine
>> dependency to get into what i was always told was a pristine machine
>> independent pass.    i would have thought that the at the tree level this
>> could/should be nailed down to being quiet and if someone wanted to support a
>> noisy version in their port, then they would look for lt || gt when converting
>> to the rtl level which has never been so pristine.
> As I said in <>, I 
> can't find any posting to gcc-patches of the r82467 commit that introduced 
> the "possible exception" wording.
>> it would be nice to know exactly how many ports (if any) actually have a noisy
>> version of this in hardware. ltgt is not a common primitive (it is not even
>> mentioned in the ieee fp standards).
> For example, on MIPS the C.cond.fmt instruction has a four-bit condition 
> field: "In the cond field of the instruction: cond 2..1 specify the nature 
> of the comparison (equals, less than, and so on); cond 0 specifies whether 
> the comparison is ordered or unordered, i.e. false or true if any operand 
> is a NaN; cond 3 indicates whether the instruction should signal an 
> exception on QNaN inputs, or not".  Together with possibly negating the 
> result you get all 32 possible comparisons (choice of whether the 
> comparison is true or false for each of = < > unordered, choice of whether 
> to raise invalid for quiet NaNs).
This is a pretty weak motivation. Computer architects love this kind of thing, assuming they have the opcode space. Just give them every possible combination and let the programmers decide what is useful; by doing that, the architect also saves a couple of muxes and gate delays. But that doesn't mean that the layers of software need to support all of this, especially in this case, where there is no motivation from either the fp standards or the language standards.
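To make the count concrete, here is a small Python model of the comparison space discussed above. It is only a sketch: the names `RELATIONS`, `all_comparisons`, `evaluate`, `LT`, `NOT_UNGE` and `LTGT_TRUTH` are made up for illustration and are not GCC or MIPS identifiers, and "signalling" here only models raising invalid on a quiet NaN operand.

```python
from itertools import product

# The four mutually exclusive outcomes of comparing two floating-point values.
RELATIONS = ("eq", "lt", "gt", "unordered")


def all_comparisons():
    """Enumerate every (true_set, signalling) comparison predicate."""
    preds = []
    for bits in product((False, True), repeat=len(RELATIONS)):
        true_set = frozenset(r for r, b in zip(RELATIONS, bits) if b)
        for signalling in (False, True):
            preds.append((true_set, signalling))
    return preds


# Hypothetical labels for the predicates named in the thread.
LT = (frozenset({"lt"}), True)          # C's `<`: raises invalid on a quiet NaN
NOT_UNGE = (frozenset({"lt"}), False)   # __builtin_isless: same truth table, quiet
LTGT_TRUTH = frozenset({"lt", "gt"})    # truth table shared by both readings of LTGT


def evaluate(pred, relation):
    """Return (result, raises_invalid) for one predicate and one outcome."""
    true_set, signalling = pred
    return relation in true_set, signalling and relation == "unordered"
```

In this model `len(all_comparisons())` is 32 (16 truth tables times the signalling choice), and exactly two of those entries have the `lt || gt` truth table. Those two are the quiet and the signalling candidate meanings for LTGT that the thread is arguing about.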
>>> I think the main difficulty in proper Annex F support would be making
>>> optimizers (on each IR) understand the side-effects operations have in
>>> terms of raising exceptions, and how operations may take the rounding mode
>>> or existing exceptions raised as inputs - with an associated issue of
>>> defining the existing default floating-point rules well enough to keep a
>>> default mode that doesn't unduly inhibit optimization.
>> i have been thinking a lot about the rounding modes.    the only way that
>> could think about properly supporting them was to actually add new rtl and
>> tree operations that take the rounding mode as an extra parameter.    i think
>> that this is going to be the only way to support both the static and dynamic
>> models.   This will be ugly, but having just finished the wide int patch it is
>> natural to say "how bad could it be?"   some ports will want to support this
>> and some will not.
> My starting point is to presume that any port with hardware floating-point 
> exceptions and rounding modes should support this.  But since ports would 
> probably need to change anyway to ensure operations do raise the right 
> exceptions, and to ensure that the machine-independent compiler can tell 
> which registers referred to in inline asm are part of the floating-point 
> exceptions / rounding modes state, maybe it wouldn't be so bad if they 
> also need to change their instruction descriptions to describe the 
> involvement of exceptions and rounding modes explicitly.  (You do need to 
> handle the case of exceptions and rounding modes with software floating 
> point, i.e. libcalls implicitly using them; this applies to powerpc-linux 
> soft float at least.)
> Of course many target architecture ports are for architectures without 
> hardware floating point, and without any exception or rounding mode 
> support, and these issues don't arise for them.
> (When I say above certain things are the main difficulty I mean they are 
> the only things with real design issues that I see.  Lots of other issues 
> arise such as:
> (a) writing thorough testcases for individual operations working properly 
> with exceptions and rounding modes in various contexts (I could probably 
> do that at least for basic arithmetic in a week or two; properly it should 
> be done for every IEEE operation for which GCC allows a built-in function 
> to be expanded inline);
> (b) going through transformations to review whether they are correct for 
> exceptions and rounding modes (a good starting point would be all 
> flag_trapping_math, many of which are likely not checking quite the right 
> condition, although transformations with no such checks are harder to 
> find);
> (c) avoiding spurious exceptions from libgcc functions e.g. converting 
> floating-point to DImode;
> (d) ensuring the right exceptions from converting floating-point to 
> bit-fields;
> (e) converting the standard pragmas to appropriate IR describing what 
> transformations are permitted on what code;
> (f) if you do Annex G as well, the ABI for _Imaginary argument passing.
> When you get into static rounding modes, as described in TS 18661-1, there 
> are a few more issues; e.g.:
> (g) supporting both architectures that can encode rounding modes in 
> instructions, and those where the rounding mode needs swapping at runtime;
> (h) allowing library headers to define macros for standard library 
> functions that cause them to use the constant rounding mode;
> (i) ensuring that <float.h> macros have the correct value in all constant 
> rounding modes, which is easy if they use hex floats, but if you allow the 
> combination of C90 mode with constant rounding modes then you need long 
> decimal expansions of those macros, so maybe that combination should be 
> disallowed.
> Those are examples of miscellaneous issues that I generally expect could 
> be addressed by lots of incremental patches without tricky overall design 
> issues.  More would probably come up in a full analysis / design process 
> of how to do a high-quality implementation of C99/C11 Annex F/G and TS 
> 18661-1.)
The fp exceptions raise some very tricky issues with respect to gcc and optimization. On many machines, noisy does not mean that an exception is thrown; it means that a bit is set and then checked later. If you try to model this kind of behavior in gcc, you end up pinning the code so that nothing can be moved or reordered.

to get this right gcc needs something like a monotonic dependency, which would allow reordering, and gcc has nothing like this. essentially, you need a way to say that all of these insns modify the same variable, but that they all just move the value in the same direction, so you do not care what order the operations are performed in. that does not mean this could not be added, only that gcc has nothing like it today.
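as a rough illustration of what "monotonic" means here, the sketch below models the sticky status flags as a bitmask that operations can only OR bits into. the flag values and the helper names `run` and `order_independent` are hypothetical, chosen only for this example; nothing here corresponds to an existing gcc facility.

```python
from functools import reduce
from itertools import permutations

# Hypothetical bit assignments for the sticky status flags.
INVALID, DIVBYZERO, OVERFLOW, UNDERFLOW, INEXACT = (1 << i for i in range(5))


def run(flag_effects, initial=0):
    """Apply each operation's flag effect in order; flags are only ever set."""
    return reduce(lambda status, raised: status | raised, flag_effects, initial)


def order_independent(flag_effects):
    """Check that every execution order leaves the same final status word."""
    finals = {run(order) for order in permutations(flag_effects)}
    return len(finals) == 1
```

for example, `order_independent([INEXACT, OVERFLOW | INEXACT, INVALID])` holds because OR is commutative and associative. the model deliberately leaves out reads and clears of the status word: those observe or reset intermediate state, so they would still have to act as ordering points.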

however, even on the machines where you throw an exception, most people would at least like an option where the exception can be raised out of order.

going back to the rounding modes issue, there is a huge range in the architectural implementation space. you have a few that are pure dynamic, a few that are pure static, and some in the middle that are just a mess. a lot of machines would have liked to support fully static rounding, but could not fit the bits to specify the rounding mode into the instruction. my point here is that you do need to at least have a plan that will support the full space, even if you do this with 1000 small patches.
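a toy model of the two ends of that space, assuming the "rounding mode as an extra parameter" representation mentioned earlier in the thread. it uses Python's `decimal` module as a stand-in for hardware arithmetic (decimal, three digits, not binary fp), and `Fpu`, `div_static` and `div_static_on_dynamic` are hypothetical names invented for the sketch.

```python
from decimal import Context, Decimal, ROUND_CEILING, ROUND_FLOOR, ROUND_HALF_EVEN


class Fpu:
    """Toy machine with a single dynamic rounding-mode register."""

    def __init__(self):
        self.mode = ROUND_HALF_EVEN
        self.mode_writes = 0  # counts how often the mode register is written

    def set_mode(self, mode):
        self.mode = mode
        self.mode_writes += 1

    def div(self, a, b):
        # Dynamic model: the instruction reads the mode register implicitly.
        return Context(prec=3, rounding=self.mode).divide(a, b)


def div_static(a, b, mode):
    """Static model: the rounding mode is an explicit operand of the operation."""
    return Context(prec=3, rounding=mode).divide(a, b)


def div_static_on_dynamic(fpu, a, b, mode):
    """Lower a static-mode divide onto dynamic-only hardware by swapping the mode."""
    saved = fpu.mode
    fpu.set_mode(mode)
    try:
        return fpu.div(a, b)
    finally:
        fpu.set_mode(saved)
```

with this, `div_static(Decimal(1), Decimal(3), ROUND_FLOOR)` gives `Decimal("0.333")` and the same call with `ROUND_CEILING` gives `Decimal("0.334")`. the lowering onto the dynamic machine produces the same value but costs two writes to the mode register per operation, which is the kind of overhead a real design would want to batch or hoist.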
> -- 
> Joseph S. Myers
