Re: [PATCH] -fftz-math: assume that denorms _must_ be flushed to zero optimizations

On Mon, Aug 14, 2017 at 12:45 PM, Pekka Jääskeläinen wrote:
> Hi Richard,
> The base idea of the patch is to optimize for the (common) situation
> where FTZ/DAZ is controlled by a CPU-wide flag and we then need to only
> avoid compile-time optimizations that assume semantics where denorm
> handling is on to support the ‘forced FTZ/DAZ semantics’.
>> This suggests only outputs are flushed to zero?  OTOH documentation
>> for X * 1 -> X suggests otherwise.  This simplification also suggests to
>> make FTZ operations explicit instead of adding a flag?  Thus the BRIG
>> FE would emit FTZ (X) * 1 which we can optimize to FTZ (X), and we
>> could eventually add a pass optimizing FTZ operations?
> Both the inputs and outputs must be flushed to zero in the HSAIL’s
> ‘ftz’ semantics.
> FTZ operations were previously always “explicit” in the BRIG FE output,
> like you propose here; there were builtin calls injected for all inputs
> and the output of ‘ftz’-marked float HSAIL instructions. This is still
> provided as a fallback for targets which do not support a CPU mode flag.

I see.  But how does making them implicit fix cases in the conformance
testsuite?  That is, isn't the error in the runtime implementation of
__hsail_ftz_*?  I'd have used a "simple"

  if (fpclassify (x) == FP_SUBNORMAL)
    return copysign (0, x);
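
Spelled out as a complete function, that check might look like the sketch
below.  The name ftz_f32 is only a placeholder for illustration, not the
actual __hsail_ftz_* signature from the runtime, and a float operand is
assumed.

```c
#include <math.h>

/* Hypothetical helper: flush a subnormal float to a zero of the same
   sign.  Normal values, zeros, infinities and NaNs pass through
   unchanged.  */
static float
ftz_f32 (float x)
{
  if (fpclassify (x) == FP_SUBNORMAL)
    return copysignf (0.0f, x);
  return x;
}
```

Applying such a helper to every input and to the result of an operation
would give the "flush both inputs and outputs" behaviour described above
without depending on a CPU mode flag.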

> The problem with a special FTZ ‘operation’ of some kind in the generic
> output is that the basic optimizations get confused by a new operation
> and we’d need to add knowledge of the ‘FTZ’ operation to a bunch of
> existing optimizer code, which seems unnecessary to support this case as
> the optimizations typically apply also for the ‘FTZ semantics’ when the
> FTZ/DAZ flag is on.

Apart from the exceptions you needed to guard ... do you have an example of
a transform that is confused by explicit FTZ and that would be valid if that FTZ
were implicit?  An explicit FTZ should be much safer.  I think the builtins
should also be CONST and not only PURE.
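
To illustrate the CONST vs. PURE point with user-level GNU C attributes
(a sketch only; ftz_demo and twice are made-up names, and this is not the
actual builtin declaration in the BRIG FE):

```c
#include <math.h>

/* Hypothetical declaration.  'const' promises that the result depends
   only on the argument value; 'pure' is weaker because a pure function
   may still read global memory.  */
static float ftz_demo (float x) __attribute__ ((const));

static float
ftz_demo (float x)
{
  return fpclassify (x) == FP_SUBNORMAL ? copysignf (0.0f, x) : x;
}

/* With 'const', the two calls below may be merged into one even though
   a store through p sits between them.  With only 'pure', the compiler
   would first have to prove the store cannot affect the call.  */
static float
twice (float x, float *p)
{
  float a = ftz_demo (x);
  *p = a;
  float b = ftz_demo (x);
  return a + b;
}
```

Since flushing a denormal depends on nothing but the operand value, the
stronger attribute looks appropriate here.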


> Thanks,
> Pekka

