[PATCH 02/10] [i386] Enable _Float16 type for TARGET_SSE2 and above.

Hongtao Liu crazylht@gmail.com
Mon Aug 2 05:23:35 GMT 2021


On Fri, Jul 30, 2021 at 5:30 AM Joseph Myers <joseph@codesourcery.com> wrote:
>
> On Thu, 29 Jul 2021, Hongtao Liu via Gcc-patches wrote:
>
> > > Rather than using FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 whenever TARGET_SSE2
> > > (i.e. whenever the type is available), it might make more sense to follow
> > > AArch64 and use it only when the hardware instructions are available.  In
> > > any case, it seems peculiar to use a different threshold in the "fast"
> >   We want to provide some debuggability to the software emulation.
> > When there's inconsistency between software emulation and hardware
> > instructions, users can still debug on non-avx512fp16 processor w/
> > software emulation and extra option -fexcess-precision=standard,
>
> But that's not the purpose of -fexcess-precision=standard.  The purpose is
> only: when the default case is non-conforming, make it conforming instead.
> The default case is non-conforming only when the back end has insn
> patterns pretending to be able to do arithmetic on formats it can't
> actually do arithmetic on - that is, x87 arithmetic where the insn
> patterns pretend to support SFmode and DFmode arithmetic but actually use
> XFmode (and the similar issue for older m68k, but that back end doesn't
> actually have the required support for -fexcess-precision=standard).
>
> So -fexcess-precision=standard should not do anything different from
> -fexcess-precision=fast regarding _Float16.
>
That makes perfect sense.
> If you want to be able to enable or disable excess precision for _Float16
> separately from the underlying hardware support, that might provide a case
> for supporting extra options, say -fexcess-precision=16 that means follow
> the semantics of FLT_EVAL_METHOD == 16 (and with an error for that option
> on architectures where the given FLT_EVAL_METHOD value isn't supported).
> But that shouldn't be done by making -fexcess-precision=standard do
> something outside its scope.
>
> > Also since TARGET_C_EXCESS_PRECISION is not related to type, for
> > testcases w/o _Float16 that are supposed to be run on the x86 FPU, if gcc
> > is built w/ --with-arch=sapphirerapids, it will regress those
> > testcases, i.e. gcc.target/i386/excess-precision-*.c, that's why we
> > can't follow AArch64.
>
> Those tests use -mfpmath=387.
>
> In the -mfpmath=387 case, it seems reasonable to keep the rule of
> promoting to long double, regardless of hardware _Float16 support (-msse2
> must also be in effect for the type to be supported at all by the back
> end).  It's the -mfpmath=sse case for which I think following AArch64 is
> appropriate.
This makes sense as well.
>
> --
> Joseph S. Myers
> joseph@codesourcery.com

I'll add an extra option, -fexcess-precision=16, that selects
FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when the back end supports _Float16,
and refine ix86_get_excess_precision as follows:

@@ -23327,14 +23382,18 @@ ix86_get_excess_precision (enum excess_precision_type type)
  /* The fastest type to promote to will always be the native type,
     whether that occurs with implicit excess precision or
     otherwise.  */
- return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
+ return TARGET_AVX512FP16
+        ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
+        : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
       case EXCESS_PRECISION_TYPE_STANDARD:
       case EXCESS_PRECISION_TYPE_IMPLICIT:
  /* Otherwise, the excess precision we want when we are
     in a standards compliant mode, and the implicit precision we
     provide would be identical were it not for the unpredictable
     cases.  */
- if (!TARGET_80387)
+ if (TARGET_AVX512FP16 && TARGET_SSE_MATH)
+   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
+ else if (!TARGET_80387)
    return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
  else if (!TARGET_MIX_SSE_I387)
    {

Will update in my next version.
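
For readers following along without the GCC tree at hand, the selection
logic in the hunk above can be modelled roughly as below. This is only a
sketch outside of GCC: the enum values, the target_flags struct and the
function name sketch_excess_precision are hypothetical stand-ins for
FLT_EVAL_METHOD_*, the TARGET_* macros and ix86_get_excess_precision, and
the x87 cases that the hunk cuts off are collapsed into one placeholder.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's FLT_EVAL_METHOD_* values.  */
enum eval_method
{
  PROMOTE_TO_FLOAT16,
  PROMOTE_TO_FLOAT,
  OTHER_X87_RULES   /* Placeholder: cases not shown in the hunk.  */
};

/* Hypothetical stand-ins for EXCESS_PRECISION_TYPE_*.  */
enum precision_type { TYPE_FAST, TYPE_STANDARD, TYPE_IMPLICIT };

/* Hypothetical stand-ins for TARGET_AVX512FP16, TARGET_SSE_MATH
   and TARGET_80387.  */
struct target_flags
{
  bool avx512fp16;
  bool sse_math;
  bool x87;
};

static enum eval_method
sketch_excess_precision (enum precision_type type, struct target_flags t)
{
  /* Fast: promote to the native type, which is _Float16 only when
     the hardware instructions exist.  */
  if (type == TYPE_FAST)
    return t.avx512fp16 ? PROMOTE_TO_FLOAT16 : PROMOTE_TO_FLOAT;

  /* Standard and implicit: _Float16 arithmetic stays in _Float16 only
     with hardware support and SSE math.  */
  if (t.avx512fp16 && t.sse_math)
    return PROMOTE_TO_FLOAT16;
  if (!t.x87)
    return PROMOTE_TO_FLOAT;

  /* The remaining x87 handling is outside the quoted hunk.  */
  return OTHER_X87_RULES;
}
```

With this model, an AVX512FP16 target using SSE math gets
PROMOTE_TO_FLOAT16 in all three modes, while an SSE2-only target without
x87 gets PROMOTE_TO_FLOAT in all three modes.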

-- 
BR,
Hongtao

