[PATCH 2/6] [i386] Enable _Float16 type for TARGET_SSE2 and above.

Richard Biener richard.guenther@gmail.com
Wed Aug 4 11:28:20 GMT 2021


On Wed, Aug 4, 2021 at 4:39 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Mon, Aug 2, 2021 at 2:31 PM liuhongt <hongtao.liu@intel.com> wrote:
> >
> > gcc/ChangeLog:
> >
> >         * config/i386/i386-modes.def (FLOAT_MODE): Define ieee HFmode.
> >         * config/i386/i386.c (enum x86_64_reg_class): Add
> >         X86_64_SSEHF_CLASS.
> >         (merge_classes): Handle X86_64_SSEHF_CLASS.
> >         (examine_argument): Ditto.
> >         (construct_container): Ditto.
> >         (classify_argument): Ditto, and set HFmode/HCmode to
> >         X86_64_SSEHF_CLASS.
> >         (function_value_32): Return _Float16/Complex Float16 by
> >         %xmm0.
> >         (function_value_64): Return _Float16/Complex Float16 by SSE
> >         register.
> >         (ix86_print_operand): Handle CONST_DOUBLE HFmode.
> >         (ix86_secondary_reload): Require gpr as intermediate register
> >         to store _Float16 from sse register when sse4 is not
> >         available.
> >         (ix86_libgcc_floating_mode_supported_p): Enable _Float16 under
> >         sse2.
> >         (ix86_scalar_mode_supported_p): Ditto.
> >         (TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): Defined.
> >         * config/i386/i386.h (VALID_SSE2_REG_MODE): Add HFmode.
> >         (VALID_INT_MODE_P): Add HFmode and HCmode.
> >         * config/i386/i386.md (*pushhf_rex64): New define_insn.
> >         (*pushhf): Ditto.
> >         (*movhf_internal): Ditto.
> >         * doc/extend.texi (Half-Precision Floating Point): Document
> >         _Float16 for x86.
> >         * emit-rtl.c (validate_subreg): Allow (subreg:SI (reg:HF) 0)
> >         which is used by extract_bit_field but not backends.
> >
[...]
>
> Ping, I'd like to ask for approval of the code below, which touches
> the generic part.
>
> start from ..
> > diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> > index ff3b4449b37..775ee397836 100644
> > --- a/gcc/emit-rtl.c
> > +++ b/gcc/emit-rtl.c
> > @@ -928,6 +928,11 @@ validate_subreg (machine_mode omode, machine_mode imode,
> >       fix them all.  */
> >    if (omode == word_mode)
> >      ;
> > +  /* ???Similarly like (subreg:DI (reg:SF), also allow (subreg:SI (reg:HF))
> > +     here. Though extract_bit_field is the culprit here, not the backends.  */
> > +  else if (known_gt (regsize, osize) && known_gt (osize, isize)
> > +          && FLOAT_MODE_P (imode) && INTEGRAL_MODE_P (omode))
> > +    ;
> >    /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> >       is the culprit here, and not the backends.  */
> >    else if (known_ge (osize, regsize) && known_ge (isize, osize))
>
> and end here.

So the main restriction otherwise in place is

  /* Subregs involving floating point modes are not allowed to
     change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
     (subreg:SI (reg:DF) 0) isn't.  */
  else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
    {
      if (! (known_eq (isize, osize)
             /* LRA can use subreg to store a floating point value in
                an integer mode.  Although the floating point and the
                integer modes need the same number of hard registers,
                the size of floating point mode can be less than the
                integer mode.  LRA also uses subregs for a register
                should be used in different mode in on insn.  */
             || lra_in_progress))
        return false;
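
As a rough illustration of how the quoted restriction interacts with the
clause the patch adds, here is a minimal standalone sketch.  It is not GCC
code: the struct, its fields, and model_subreg_size_ok are hypothetical
names, sizes are plain byte counts rather than poly_uint64, lra_in_progress
is assumed false, and only the branches quoted in this thread are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one subreg query; not a GCC type.  */
struct subreg_query {
    unsigned isize;       /* size of the inner mode, in bytes */
    unsigned osize;       /* size of the outer mode, in bytes */
    unsigned regsize;     /* natural register size for the inner mode */
    bool inner_float;     /* stands in for FLOAT_MODE_P (imode) */
    bool outer_float;     /* stands in for FLOAT_MODE_P (omode) */
    bool outer_integral;  /* stands in for INTEGRAL_MODE_P (omode) */
    bool outer_is_word;   /* stands in for omode == word_mode */
};

/* Returns true if the query gets past the size-change checks quoted in
   this thread.  with_patch selects whether the proposed clause is active.  */
static bool model_subreg_size_ok(const struct subreg_query *q, bool with_patch)
{
    assert(q != NULL);

    /* Existing exception: the outer mode is word_mode.  */
    if (q->outer_is_word)
        return true;

    /* Proposed clause: widening float -> integer below register size,
       e.g. (subreg:SI (reg:HF) 0).  */
    if (with_patch && q->regsize > q->osize && q->osize > q->isize
        && q->inner_float && q->outer_integral)
        return true;

    /* Existing exception, e.g. (subreg:DF (reg:TI)).  */
    if (q->osize >= q->regsize && q->isize >= q->osize)
        return true;

    /* Main restriction: float-mode subregs may not change size.  */
    if (q->inner_float || q->outer_float)
        return q->isize == q->osize;

    return true;
}
```

With an assumed 8-byte natural register size, a query for
(subreg:SI (reg:HF) 0) (isize 2, osize 4) is rejected by this model without
the patch and accepted with it, while (subreg:SI (reg:DF) 0) (isize 8,
osize 4) is rejected either way.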

I'm not sure whether it would be possible to use
(subreg:SI (subreg:HI (reg:HF))) to "work around" this restriction.
Alternatively, one could finally do away with all the exceptions and simply
allow all such subregs, defining their semantics in terms of an intermediate
same-size subreg to an integer mode, if this definition issue is why we
disallow them?

That is, any subreg with a float-mode source or destination is interpreted
as first wrapping the source operand (if float-mode) in a same-size integer
subreg, then performing the size-changing subreg in integer modes, and
finally, if the destination mode is a float mode, applying a same-size
subreg to that mode?
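
The interpretation above can be sketched in plain C as two steps: a
same-size bit reinterpretation followed by an integer-only size change.
This is only a model under stated assumptions, not GCC code: the function
names are hypothetical, float and double are assumed to be IEEE 754
binary32 and binary64, the target is assumed little-endian so that byte
offset 0 selects the low part, and the upper bits of the widened value
(which RTL generally leaves undefined for a paradoxical subreg) are zeroed
here just to keep the result deterministic.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same-size step, modelling (subreg:SI (reg:SF) 0): reinterpret the
   bits of a float as a 32-bit integer without converting the value.  */
static uint32_t sf_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Models (subreg:DI (reg:SF) 0) read as
   (subreg:DI (subreg:SI (reg:SF) 0) 0): reinterpret first, then widen
   purely in integer modes.  Upper bits are zeroed by assumption.  */
static uint64_t sf_as_di(float f)
{
    return (uint64_t) sf_bits(f);
}

/* Models (subreg:SI (reg:DF) 0) read as
   (subreg:SI (subreg:DI (reg:DF) 0) 0): reinterpret first, then take
   the low 32 bits (byte offset 0 on a little-endian target).  */
static uint32_t df_low_si(double d)
{
    uint64_t bits;
    assert(sizeof bits == sizeof d);
    memcpy(&bits, &d, sizeof bits);
    return (uint32_t) bits;
}
```

Under these assumptions sf_as_di (1.0f) yields 0x3f800000, the binary32
encoding of 1.0 in the low half of the 64-bit result, and the float-mode
operand is never touched by a size-changing operation directly.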

Also, I detest that validate_subreg lists things that are not allowed as
opposed to things that are allowed.  Why are FLOAT_MODEs special, but
fractional and accumulator modes not?  The subreg documentation
also doesn't talk about cases that are not allowed.

Richard.

