This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] -fftz-math: assume that denorms _must_ be flushed to zero optimizations
- From: Pekka Jääskeläinen <pekka at parmance dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: "Joseph S. Myers" <joseph at codesourcery dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Henry Linjamäki <henry dot linjamaki at parmance dot com>, Martin Jambor <mjambor at suse dot cz>
- Date: Mon, 14 Aug 2017 12:45:08 +0200
- Subject: Re: [PATCH] -fftz-math: assume that denorms _must_ be flushed to zero optimizations
- Authentication-results: sourceware.org; auth=none
- References: <CAJk11WCfMLenz=hume0aTTpkpJ6GM0bB3xsFL+dSvnA9zVPs+Q@mail.gmail.com> <CAFiYyc07+kScuyPZSe6oeVA4PROiLg4bb=zz95BaV_oXeGjtVw@mail.gmail.com>
The base idea of the patch is to optimize for the (common) situation where
FTZ/DAZ is controlled by a CPU-wide flag. We then only need to avoid compile-time
optimizations that assume denorm handling is on, in order to support
the ‘forced FTZ/DAZ semantics’.
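As a concrete illustration of such an optimization, here is a minimal C sketch of why X * 1 -> X cannot be folded when FTZ/DAZ is forced. It models the flushing in software; flush_denorm and mul_ftz are hypothetical names used only for this example (they are not part of the patch), and preserving the sign of the flushed zero is an assumption of the sketch.

```c
#include <float.h>

/* Hypothetical helper: flush a subnormal float to a zero of the same sign.
   Zeros, normal values, infinities and NaNs pass through unchanged. */
static float flush_denorm(float x)
{
    /* Subnormal: nonzero with magnitude below the smallest normal float. */
    if (x != 0.0f && x > -FLT_MIN && x < FLT_MIN)
        return (x < 0.0f) ? -0.0f : 0.0f;
    return x;
}

/* Software model of a multiply executed with FTZ/DAZ on:
   both inputs (DAZ) and the output (FTZ) are flushed. */
static float mul_ftz(float a, float b)
{
    return flush_denorm(flush_denorm(a) * flush_denorm(b));
}
```

With this model, mul_ftz(1e-40f, 1.0f) yields 0.0f because 1e-40f is a subnormal float, whereas folding the multiply away would leave the subnormal 1e-40f in place. For normal X the two agree, which is why most other simplifications remain valid.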
> This suggests only outputs are flushed to zero? OTOH documentation
> for X * 1 -> X suggests otherwise. This simplification also suggests to
> make FTZ operations explicit instead of adding a flag? Thus the BRIG
> FE would emit FTZ (X) * 1 which we can optimize to FTZ (X), and we
> could eventually add a pass optimizing FTZ operations?
Both the inputs and outputs must be flushed to zero in HSAIL’s ‘ftz’-marked
float instructions.
FTZ operations were previously always “explicit” in the BRIG FE output, like you
propose here; there were builtin calls injected for all inputs and the output of
‘ftz’-marked float HSAIL instructions. This is still provided as a fallback for
targets which do not support a CPU mode flag.
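Roughly, the explicit fallback corresponds to the following C sketch for an ‘ftz’-marked float add. The names ftz_f32 and add_ftz_f32_fallback are hypothetical stand-ins for illustration, not the actual builtins emitted by the BRIG FE, and the code assumes float is IEEE 754 binary32 and that the flushed zero keeps the sign of the subnormal.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the injected FTZ builtin (binary32 assumed). */
static float ftz_f32(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    /* Exponent field all zero and mantissa nonzero => subnormal. */
    if ((bits & 0x7F800000u) == 0 && (bits & 0x007FFFFFu) != 0)
        bits &= 0x80000000u;  /* keep only the sign bit: +0.0f or -0.0f */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Explicit lowering: flush every input and the output. */
static float add_ftz_f32_fallback(float a, float b)
{
    return ftz_f32(ftz_f32(a) + ftz_f32(b));
}
```

The output flush matters even when both inputs are normal: 0x1.8p-126f + (-0x1p-126f) is the subnormal 0x1p-127f under IEEE semantics, and the wrapper turns it into zero. On a target with a CPU mode flag the same instruction could be emitted as a plain a + b, which is what the existing optimizations already understand.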
The problem with a special FTZ ‘operation’ of some kind in the generic output is
that the basic optimizations get confused by a new operation, and we’d need to
add knowledge of the ‘FTZ’ operation to a bunch of existing optimizer passes. That
seems unnecessary to support this case, as the optimizations typically also apply
under the ‘FTZ semantics’ when the FTZ/DAZ flag is on.