[PATCH 0/2] Initial support for AVX512FP16

Tue Jul 6 18:11:52 GMT 2021

On Tue, 6 Jul 2021, Hongtao Liu via Gcc-patches wrote:

> There may be inconsistent behavior between soft-fp and avx512fp16
> instructions if we emulate _Float16 w/ float .
>  i.e
>   1) for a + b - c where b and c are variables with the same big value
> and a + b is NAN at _Float16 and real value at float, avx512fp16
> instruction will raise an exception but soft-fp won't(unless it's
> rounded after every operation.)

There are at least two variants of emulation using float:

(a) Using the excess precision support, as on AArch64, which means the C 
front end converts the _Float16 operations to float ones, with explicit 
narrowing on assignment (and conversion as if by assignment - argument 
passing and return, casts, etc.).  Excess precision indeed involves 
different semantics compared to doing each operation directly in the range 
and precision of _Float16.

(b) Letting the expand/optabs code generate operations in a wider mode.  
My understanding is that the result should get converted back to the 
narrower mode after each operation (the expand/optabs code, or 
convert_move called by it, generates such a conversion), meaning (for 
basic arithmetic operations) that the semantics end up the same as if the 
operation had been done directly on _Float16 (but with more truncation 
operations occurring than would be the case if excess precision support 
were used).

>   2) a / b where b is denormal value and AVX512FP16 won't flush it to
> zero even w/ -Ofast, but when it's extended to float and using divss,
> it will be flushed to zero and raise an exception when compiling w/
> Ofast

I don't think that's a concern; flush to zero is well outside the scope of 
standards defining _Float16 semantics.

> So the key point is that the soft-fp and avx512fp16 instructions may
> do not behave the same on the exception, is this acceptable?

As far as I understand it, all cases within the standards will behave as 
expected for exceptions, whether pure software floating-point is used, 
pure hardware _Float16 arithmetic or one of the forms of emulation listed 
above.  (Where "as expected" itself depends on the value of 
FLT_EVAL_METHOD, i.e. whether excess precision is used for _Float16.)  
Flush to zero and trapping exceptions are outside the scope of the 
standards.  Since trapping exceptions is outside the scope of the 
standards, so is anything that distinguishes whether an arithmetic 
operation raises the same exception more than once or the order in which 
it raises different exceptions (e.g. the possibility of "inexact" being 
raised more than once, both by arithmetic on float and by narrowing from 
float to _Float16).

-- 
Joseph S. Myers
joseph@codesourcery.com