[PATCH 1/2] IBM Z: Store long doubles in vector registers when possible
Andreas Krebbel
krebbel@linux.ibm.com
Tue Nov 10 08:33:53 GMT 2020
On 09.11.20 20:54, Ilya Leoshkevich wrote:
> On z14+, there are instructions for working with 128-bit floats (long
> doubles) in vector registers. It's beneficial to use them instead of
> instructions that operate on floating point register pairs, because it
> allows storing 4 times more data in registers at a time, relieving
> register pressure. The raw performance of the new instructions is
> almost the same as that of the old ones.
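
For reference, the "4 times" figure can be checked with simple arithmetic. The sketch below assumes the architectural register counts on IBM Z (16 64-bit FPRs, 32 128-bit VRs); the function names are made up for illustration and are not part of the patch.

```c
/* Architectural register counts on IBM Z (assumed, not taken from the patch). */
enum {
    NUM_FPRS = 16,            /* 64-bit floating-point registers */
    NUM_VRS = 32,             /* 128-bit vector registers */
    LONG_DOUBLE_BITS = 128
};

/* Number of 128-bit values that fit when each one occupies an FPR pair. */
int long_doubles_in_fprs(void)
{
    return NUM_FPRS / (LONG_DOUBLE_BITS / 64);
}

/* Number of 128-bit values that fit when each one occupies a single VR. */
int long_doubles_in_vrs(void)
{
    return NUM_VRS / (LONG_DOUBLE_BITS / 128);
}
```

With FPR pairs, 8 long doubles fit in registers at once; with VRs, 32 do.
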
>
> Implement by storing TFmode values in vector registers on z14+. Since
> not all operations are available with the new instructions, keep the
> old ones available using the new FPRX2 mode, and convert between it and
> TFmode when necessary (these are called "forwarder" expanders below).
> Change the existing TFmode expanders to call either new- or old-style
> ones depending on whether we are on z14+ or older machines
> ("dispatcher" expanders).
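
The dispatcher/forwarder split described above can be modelled in plain C. This is only a conceptual sketch, not the machine description code from the patch: the types, the `conversions` counter and all function names are hypothetical, and the real expanders emit RTL rather than compute values.

```c
#include <float.h>

/* Hypothetical stand-ins for the two representations of a long double. */
typedef struct { long double v; } tf_vr;   /* TFmode value in a vector register */
typedef struct { long double v; } fprx2;   /* FPRX2 value in an FPR pair */

enum machine { Z13_OR_OLDER, Z14_OR_NEWER };

/* Counts TFmode -> FPRX2 conversions, so the chosen route is observable. */
int conversions = 0;

/* Old-style operation: only knows how to work on FPR pairs. */
static int isinf_fpr(fprx2 x)
{
    return x.v > LDBL_MAX || x.v < -LDBL_MAX;
}

/* "Forwarder": no new-style instruction is assumed to exist for this
   operation, so convert to FPRX2 and reuse the old-style operation. */
static int isinf_vr(tf_vr x)
{
    fprx2 tmp = { x.v };
    conversions++;
    return isinf_fpr(tmp);
}

/* "Dispatcher": pick the new- or old-style route based on the machine. */
int tf_isinf(enum machine m, long double x)
{
    if (m == Z14_OR_NEWER) {
        tf_vr in_vr = { x };
        return isinf_vr(in_vr);
    }
    fprx2 in_fpr = { x };
    return isinf_fpr(in_fpr);
}
```

On the z14+ route the value passes through one conversion; on older machines it is in an FPR pair from the start and no conversion takes place.
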
>
> gcc/ChangeLog:
>
> 2020-11-03 Ilya Leoshkevich <iii@linux.ibm.com>
>
> * config/s390/s390-modes.def (FPRX2): New mode.
> * config/s390/s390-protos.h (s390_fma_allowed_p): New function.
> * config/s390/s390.c (s390_fma_allowed_p): Likewise.
> (s390_build_signbit_mask): Support 128-bit masks.
> (print_operand): Support printing the second word of a TFmode
> operand as vector register.
> (constant_modes): Add FPRX2mode.
> (s390_class_max_nregs): Return 1 for TFmode on z14+.
> (s390_is_fpr128): New function.
> (s390_is_vr128): Likewise.
> (s390_can_change_mode_class): Use s390_is_fpr128 and
> s390_is_vr128 in order to determine whether mode refers to a FPR
> pair or to a VR.
> (s390_emit_compare): Force TFmode operands into registers on
> z14+.
> * config/s390/s390.h (HAVE_TF): New macro.
> (EXPAND_MOVTF): New macro.
> (EXPAND_TF): Likewise.
> * config/s390/s390.md (PFPO_OP_TYPE_FPRX2): PFPO_OP_TYPE_TF
> alias.
> (ALL): Add FPRX2.
> (FP_ALL): Add FPRX2 for z14+, restrict TFmode to z13-.
> (FP): Likewise.
> (FP_ANYTF): New mode iterator.
> (BFP): Add FPRX2 for z14+, restrict TFmode to z13-.
> (TD_TF): Likewise.
> (xde): Add FPRX2.
> (nBFP): Likewise.
> (nDFP): Likewise.
> (DSF): Likewise.
> (DFDI): Likewise.
> (SFSI): Likewise.
> (DF): Likewise.
> (SF): Likewise.
> (fT0): Likewise.
> (bt): Likewise.
> (_d): Likewise.
> (HALF_TMODE): Likewise.
> (tf_fpr): New mode_attr.
> (type): New mode_attr.
> (*cmp<mode>_ccz_0): Use type instead of mode with fsimp.
> (*cmp<mode>_ccs_0_fastmath): Likewise.
> (*cmptf_ccs): New pattern for wfcxb.
> (*cmptf_ccsfps): New pattern for wfkxb.
> (mov<mode>): Rename to mov<mode><tf_fpr>.
> (signbit<mode>2): Rename to signbit<mode>2<tf_fpr>.
> (isinf<mode>2): Rename to isinf<mode>2<tf_fpr>.
> (*TDC_insn_<mode>): Use type instead of mode with fsimp.
> (fixuns_trunc<FP:mode><GPR:mode>2): Rename to
> fixuns_trunc<FP:mode><GPR:mode>2<FP:tf_fpr>.
> (fix_trunctf<mode>2): Rename to fix_trunctf<mode>2_fpr.
> (floatdi<mode>2): Rename to floatdi<mode>2<tf_fpr>, use type
> instead of mode with itof.
> (floatsi<mode>2): Rename to floatsi<mode>2<tf_fpr>, use type
> instead of mode with itof.
> (*floatuns<GPR:mode><FP:mode>2): Use type instead of mode for
> itof.
> (floatuns<GPR:mode><FP:mode>2): Rename to
> floatuns<GPR:mode><FP:mode>2<tf_fpr>.
> (trunctf<mode>2): Rename to trunctf<mode>2_fpr, use type instead
> of mode with fsimp.
> (extend<DSF:mode><BFP:mode>2): Rename to
> extend<DSF:mode><BFP:mode>2<BFP:tf_fpr>.
> (<FPINT:fpint_name><BFP:mode>2): Rename to
> <FPINT:fpint_name><BFP:mode>2<BFP:tf_fpr>, use type instead of
> mode with fsimp.
> (rint<BFP:mode>2): Rename to rint<BFP:mode>2<BFP:tf_fpr>, use
> type instead of mode with fsimp.
> (<FPINT:fpint_name><DFP:mode>2): Use type instead of mode for
> fsimp.
> (rint<DFP:mode>2): Likewise.
> (trunc<BFP:mode><DFP_ALL:mode>2): Rename to
> trunc<BFP:mode><DFP_ALL:mode>2<BFP:tf_fpr>.
> (trunc<DFP_ALL:mode><BFP:mode>2): Rename to
> trunc<DFP_ALL:mode><BFP:mode>2<BFP:tf_fpr>.
> (extend<BFP:mode><DFP_ALL:mode>2): Rename to
> extend<BFP:mode><DFP_ALL:mode>2<BFP:tf_fpr>.
> (extend<DFP_ALL:mode><BFP:mode>2): Rename to
> extend<DFP_ALL:mode><BFP:mode>2<BFP:tf_fpr>.
> (add<mode>3): Rename to add<mode>3<tf_fpr>, use type instead of
> mode with fsimp.
> (*add<mode>3_cc): Use type instead of mode with fsimp.
> (*add<mode>3_cconly): Likewise.
> (sub<mode>3): Rename to sub<mode>3<tf_fpr>, use type instead of
> mode with fsimp.
> (*sub<mode>3_cc): Use type instead of mode with fsimp.
> (*sub<mode>3_cconly): Likewise.
> (mul<mode>3): Rename to mul<mode>3<tf_fpr>, use type instead of
> mode with fsimp.
> (fma<mode>4): Restrict using s390_fma_allowed_p.
> (fms<mode>4): Restrict using s390_fma_allowed_p.
> (div<mode>3): Rename to div<mode>3<tf_fpr>, use type instead of
> mode with fdiv.
> (neg<mode>2): Rename to neg<mode>2<tf_fpr>.
> (*neg<mode>2_cc): Use type instead of mode with fsimp.
> (*neg<mode>2_cconly): Likewise.
> (*neg<mode>2_nocc): Likewise.
> (*neg<mode>2): Likewise.
> (abs<mode>2): Rename to abs<mode>2<tf_fpr>, use type instead of
> mode with fdiv.
> (*abs<mode>2_cc): Use type instead of mode with fsimp.
> (*abs<mode>2_cconly): Likewise.
> (*abs<mode>2_nocc): Likewise.
> (*abs<mode>2): Likewise.
> (*negabs<mode>2_cc): Likewise.
> (*negabs<mode>2_cconly): Likewise.
> (*negabs<mode>2_nocc): Likewise.
> (*negabs<mode>2): Likewise.
> (sqrt<mode>2): Rename to sqrt<mode>2<tf_fpr>, use type instead
> of mode with fsqrt.
> (cbranch<mode>4): Use FP_ANYTF instead of FP.
> (copysign<mode>3): Rename to copysign<mode>3<tf_fpr>, use type
> instead of mode with fsimp.
> * config/s390/s390.opt (flag_vx_long_double_fma): New
> undocumented option.
> * config/s390/vector.md (V_HW): Add TF for z14+.
> (V_HW2): Likewise.
> (VFT): Likewise.
> (VF_HW): Likewise.
> (V_128): Likewise.
> (tf_vr): New mode_attr.
> (tointvec): Add TF.
> (mov<mode>): Rename to mov<mode><tf_vr>.
> (movtf): New dispatcher.
> (*vec_tf_to_v1tf): Rename to *vec_tf_to_v1tf_fpr, restrict to
> z13-.
> (*vec_tf_to_v1tf_vr): New pattern for z14+.
> (*fprx2_to_tf): Likewise.
> (*mov_tf_to_fprx2_0): Likewise.
> (*mov_tf_to_fprx2_1): Likewise.
> (add<mode>3): Rename to add<mode>3<tf_vr>.
> (addtf3): New dispatcher.
> (sub<mode>3): Rename to sub<mode>3<tf_vr>.
> (subtf3): New dispatcher.
> (mul<mode>3): Rename to mul<mode>3<tf_vr>.
> (multf3): New dispatcher.
> (div<mode>3): Rename to div<mode>3<tf_vr>.
> (divtf3): New dispatcher.
> (sqrt<mode>2): Rename to sqrt<mode>2<tf_vr>.
> (sqrttf2): New dispatcher.
> (fma<mode>4): Restrict using s390_fma_allowed_p.
> (fms<mode>4): Likewise.
> (neg_fma<mode>4): Likewise.
> (neg_fms<mode>4): Likewise.
> (neg<mode>2): Rename to neg<mode>2<tf_vr>.
> (negtf2): New dispatcher.
> (abs<mode>2): Rename to abs<mode>2<tf_vr>.
> (abstf2): New dispatcher.
> (float<mode>tf2_vr): New forwarder.
> (float<mode>tf2): New dispatcher.
> (floatuns<mode>tf2_vr): New forwarder.
> (floatuns<mode>tf2): New dispatcher.
> (fix_trunctf<mode>2_vr): New forwarder.
> (fix_trunctf<mode>2): New dispatcher.
> (fixuns_trunctf<mode>2_vr): New forwarder.
> (fixuns_trunctf<mode>2): New dispatcher.
> (<FPINT:fpint_name><VF_HW:mode>2<VF_HW:tf_vr>): New pattern.
> (<FPINT:fpint_name>tf2): New forwarder.
> (rint<mode>2<tf_vr>): New pattern.
> (rinttf2): New forwarder.
> (*trunctfdf2_vr): New pattern.
> (trunctfdf2_vr): New forwarder.
> (trunctfdf2): New dispatcher.
> (trunctfsf2_vr): New forwarder.
> (trunctfsf2): New dispatcher.
> (extenddftf2_vr): New pattern.
> (extenddftf2): New dispatcher.
> (extendsftf2_vr): New forwarder.
> (extendsftf2): New dispatcher.
> (signbittf2_vr): New forwarder.
> (signbittf2): New dispatcher.
> (isinftf2_vr): New forwarder.
> (isinftf2): New dispatcher.
> * config/s390/vx-builtins.md (*vftci<mode>_cconly): Use VF_HW
> instead of VECF_HW, add missing constraint, add vw support.
> (vftci<mode>_intcconly): Use VF_HW instead of VECF_HW.
> (*vftci<mode>): Rename to vftci<mode>, use VF_HW instead of
> VECF_HW, add vw support.
> (vftci<mode>_intcc): Use VF_HW instead of VECF_HW.
Ok. Thanks!
Andreas