[Patch 2/5] rs6000, 128-bit multiply, divide, modulo, shift, compare
will schmidt
will_schmidt@vnet.ibm.com
Thu Aug 13 23:46:05 GMT 2020
On Tue, 2020-08-11 at 12:22 -0700, Carl Love wrote:
> Segher, Will:
>
> Patch 2 adds support for divide, modulo, shift, and compare of 128-bit
> integers, covering both the instruction patterns and the builtins.
>
> Carl Love
>
>
> -------------------------------------------------------
> rs6000, 128-bit multiply, divide, shift, compare
>
> gcc/ChangeLog
>
> 2020-08-10 Carl Love <cel@us.ibm.com>
> * config/rs6000/altivec.h (vec_signextq, vec_dive, vec_mod): Add defines
> for new builtins.
Looks like there is also a change to the argument order of vec_rlnm(a,b,c)
here that is not called out.
> * config/rs6000/altivec.md (UNSPEC_VMULEUD, UNSPEC_VMULESD,
> UNSPEC_VMULOUD, UNSPEC_VMULOSD): New unspecs.
ok
> (altivec_eqv1ti, altivec_gtv1ti, altivec_gtuv1ti, altivec_vmuleud,
> altivec_vmuloud, altivec_vmulesd, altivec_vmulosd, altivec_vrlq,
> altivec_vrlqmi, altivec_vrlqmi_inst, altivec_vrlqnm,
> altivec_vrlqnm_inst, altivec_vslq, altivec_vsrq, altivec_vsraq,
> altivec_vcmpequt_p, altivec_vcmpgtst_p, altivec_vcmpgtut_p): New
> define_insn.
> (vec_widen_umult_even_v2di, vec_widen_smult_even_v2di,
> vec_widen_umult_odd_v2di, vec_widen_smult_odd_v2di, altivec_vrlqmi,
> altivec_vrlqnm): New define_expands.
Also a whitespace fix in there.
ok.
> * config/rs6000/rs6000-builtin.def (BU_P10_P, BU_P10_128BIT_1,
> BU_P10_128BIT_2, BU_P10_128BIT_3): New macro definitions.
Is this consistent with the other recent changes that reworked some of
those macro definition names?
> (VCMPEQUT_P, VCMPGTST_P, VCMPGTUT_P): Add macro expansions.
> (VCMPGTUT, VCMPGTST, VCMPEQUT, CMPNET, CMPGE_1TI,
> CMPGE_U1TI, CMPLE_1TI, CMPLE_U1TI, VNOR_V1TI_UNS, VNOR_V1TI, VCMPNET_P,
> VCMPAET_P): New macro expansions.
> (VSIGNEXTSD2Q,VMULEUD, VMULESD, VMULOUD, VMULOSD, VRLQ, VSLQ,
Missing space after the comma (VSIGNEXTSD2Q,VMULEUD).
> VSRQ, VSRAQ, VRLQNM, DIV_V1TI, UDIV_V1TI, DIVES_V1TI, DIVEU_V1TI,
> MODS_V1TI, MODU_V1TI, VRLQMI): New macro expansions.
> (VRLQ, VSLQ, VSRQ, VSRAQ, SIGNEXT): New overload expansions.
The DIVE and MOD overload expansions are missing from this list.
> * config/rs6000/rs6000-call.c (P10_BUILTIN_VCMPEQUT,
> P10_BUILTIN_VCMPEQUT, P10_BUILTIN_CMPGE_1TI,
P10_BUILTIN_VCMPEQUT is listed twice.
> P10_BUILTIN_CMPGE_U1TI, P10_BUILTIN_VCMPGTUT,
> P10_BUILTIN_VCMPGTST, P10_BUILTIN_CMPLE_1TI,
Missing P10_BUILTIN_CMPLE_U1TI.
> P10_BUILTIN_128BIT_DIV_V1TI, P10_BUILTIN_128BIT_UDIV_V1TI,
> P10_BUILTIN_128BIT_VMULESD, P10_BUILTIN_128BIT_VMULEUD,
> P10_BUILTIN_128BIT_VMULOSD, P10_BUILTIN_128BIT_VMULOUD,
> P10_BUILTIN_VNOR_V1TI, P10_BUILTIN_VNOR_V1TI_UNS,
> P10_BUILTIN_128BIT_VRLQ, P10_BUILTIN_128BIT_VRLQMI,
> P10_BUILTIN_128BIT_VRLQNM, P10_BUILTIN_128BIT_VSLQ,
> P10_BUILTIN_128BIT_VSRQ, P10_BUILTIN_128BIT_VSRAQ,
> P10_BUILTIN_VCMPGTUT_P, P10_BUILTIN_VCMPGTST_P,
> P10_BUILTIN_VCMPEQUT_P, P10_BUILTIN_VCMPGTUT_P,
> P10_BUILTIN_VCMPGTST_P, P10_BUILTIN_CMPNET,
> P10_BUILTIN_VCMPNET_P, P10_BUILTIN_VCMPAET_P,
> P10_BUILTIN_128BIT_VSIGNEXTSD2Q, P10_BUILTIN_128BIT_DIVES_V1TI,
> P10_BUILTIN_128BIT_MODS_V1TI, P10_BUILTIN_128BIT_MODU_V1TI):
> New overloaded definitions.
> (int_ftype_int_v1ti_v1ti) [P10_BUILTIN_VCMPEQUT,
> P10_BUILTIN_CMPNET, P10_BUILTIN_CMPGE_1TI,
> P10_BUILTIN_CMPGE_U1TI, P10_BUILTIN_VCMPGTUT,
> P10_BUILTIN_VCMPGTST, P10_BUILTIN_CMPLE_1TI,
> P10_BUILTIN_CMPLE_U1TI, E_V1TImode]: New case statements.
These are part of (rs6000_gimple_fold_builtin).
It may also be worth a sniff check of the generated code to ensure the
folding behaves properly.
> (int_ftype_int_v1ti_v1ti) [bool_V1TI_type_node, int_ftype_int_v1ti_v1ti]:
> New assignments.
ok.
Missing: (altivec_init_builtins): Add E_V1TImode case.
> (int_ftype_int_v1ti_v1ti)[P10_BUILTIN_128BIT_VMULEUD,
> P10_BUILTIN_128BIT_VMULOUD, P10_BUILTIN_128BIT_DIVEU_V1TI,
> P10_BUILTIN_128BIT_MODU_V1TI, P10_BUILTIN_CMPGE_U1TI,
> P10_BUILTIN_VCMPGTUT, P10_BUILTIN_VCMPEQUT]: New case statements.
These are part of (builtin_function_type).
> * config/rs6000/r6000.c (rs6000_builtin_mask_calculate): New
> TARGET_TI_VECTOR_OPS definition.
Typo: "r6000.c" should be "rs6000.c".
> (rs6000_option_override_internal): Add if TARGET_POWER10 statement.
comment below.
> (rs6000_handle_altivec_attribute)[ E_TImode, E_V1TImode]: New case
> statements.
> (rs6000_opt_masks): Add ti-vector-ops entry.
ok.
> * config/rs6000/r6000.h (MASK_TI_VECTOR_OPS, RS6000_BTM_P10_128BIT,
> RS6000_BTM_TI_VECTOR_OPS, bool_V1TI_type_node): New defines.
> (rs6000_builtin_type_index): New enum value RS6000_BTI_bool_V1TI.
Typo: "r6000.h" should be "rs6000.h".
> * config/rs6000/rs6000.opt: New mti-vector-ops entry.
comment below.
> * config/rs6000/vector.md (vector_eqv1ti, vector_gtv1ti,
> vector_nltv1ti, vector_gtuv1ti, vector_nltuv1ti, vector_ngtv1ti,
> vector_ngtuv1ti, vector_eq_v1ti_p, vector_ne_v1ti_p, vector_ae_v1ti_p,
> vector_gt_v1ti_p, vector_gtu_v1ti_p, vrotlv1ti3, vashlv1ti3,
> vlshrv1ti3, vashrv1ti3): New define_expands.
ok
> * config/rs6000/vsx.md (UNSPEC_VSX_DIVSQ, UNSPEC_VSX_DIVUQ,
> UNSPEC_VSX_DIVESQ, UNSPEC_VSX_DIVEUQ, UNSPEC_VSX_MODSQ,
> UNSPEC_VSX_MODUQ, UNSPEC_XXSWAPD_V1TI): New unspecs.
comment below.
> (vsx_div_v1ti, vsx_udiv_v1ti, vsx_dives_v1ti, vsx_diveu_v1ti,
> vsx_mods_v1ti, vsx_modu_v1ti, xxswapd_v1ti, vsx_sign_extend_v2di_v1ti):
> New define_insns.
> (vcmpnet): New define_expand.
> * gcc/doc/extend.texi: Add documentation for the new builtins vec_rl,
> vec_rlmi, vec_rlnm, vec_sl, vec_sr, vec_sra, vec_mule, vec_mulo,
> vec_div, vec_dive, vec_mod, vec_cmpeq, vec_cmpne, vec_cmpgt, vec_cmplt,
> vec_cmpge, vec_cmple, vec_all_eq, vec_all_ne, vec_all_gt, vec_all_lt,
> vec_all_ge, vec_all_le, vec_any_eq, vec_any_ne, vec_any_gt, vec_any_lt,
> vec_any_ge, vec_any_le.
comment below.
>
> gcc/testsuite/ChangeLog
>
> 2020-08-10 Carl Love <cel@us.ibm.com>
> * gcc.target/powerpc/int_128bit-runnable.c: New test file.
> ---
> gcc/config/rs6000/altivec.h | 6 +-
> gcc/config/rs6000/altivec.md | 242 +-
> gcc/config/rs6000/rs6000-builtin.def | 77 +
> gcc/config/rs6000/rs6000-call.c | 150 +-
> gcc/config/rs6000/rs6000.c | 17 +-
> gcc/config/rs6000/rs6000.h | 6 +-
> gcc/config/rs6000/rs6000.opt | 4 +
> gcc/config/rs6000/vector.md | 199 ++
> gcc/config/rs6000/vsx.md | 99 +-
> gcc/doc/extend.texi | 174 ++
> .../gcc.target/powerpc/int_128bit-runnable.c | 2254 +++++++++++++++++
The truncated path into the testsuite subdir looks strange there (though
that may just be git's diffstat path abbreviation).
> 11 files changed, 3217 insertions(+), 11 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
>
> diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
> index 09320df14ca..a121004b3af 100644
> --- a/gcc/config/rs6000/altivec.h
> +++ b/gcc/config/rs6000/altivec.h
> @@ -183,7 +183,7 @@
> #define vec_recipdiv __builtin_vec_recipdiv
> #define vec_rlmi __builtin_vec_rlmi
> #define vec_vrlnm __builtin_vec_rlnm
> -#define vec_rlnm(a,b,c) (__builtin_vec_rlnm((a),((c)<<8)|(b)))
> +#define vec_rlnm(a,b,c) (__builtin_vec_rlnm((a),((b)<<8)|(c)))
Per my comment above: I don't see this change called out in the ChangeLog.
> #define vec_rsqrt __builtin_vec_rsqrt
> #define vec_rsqrte __builtin_vec_rsqrte
> #define vec_signed __builtin_vec_vsigned
> @@ -694,6 +694,10 @@ __altivec_scalar_pred(vec_any_nle,
> #define vec_step(x) __builtin_vec_step (* (__typeof__ (x) *) 0)
>
> #ifdef _ARCH_PWR10
> +#define vec_signextq __builtin_vec_vsignextq
> +#define vec_dive __builtin_vec_dive
> +#define vec_mod __builtin_vec_mod
> +
> /* May modify these macro definitions if future capabilities overload
> with support for different vector argument and result types. */
> #define vec_cntlzm(a, b) __builtin_altivec_vclzdm (a, b)
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 0a2e634d6b0..2763d920828 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -39,12 +39,16 @@
> UNSPEC_VMULESH
> UNSPEC_VMULEUW
> UNSPEC_VMULESW
> + UNSPEC_VMULEUD
> + UNSPEC_VMULESD
> UNSPEC_VMULOUB
> UNSPEC_VMULOSB
> UNSPEC_VMULOUH
> UNSPEC_VMULOSH
> UNSPEC_VMULOUW
> UNSPEC_VMULOSW
> + UNSPEC_VMULOUD
> + UNSPEC_VMULOSD
> UNSPEC_VPKPX
> UNSPEC_VPACK_SIGN_SIGN_SAT
> UNSPEC_VPACK_SIGN_UNS_SAT
> @@ -628,6 +632,14 @@
> "vcmpequ<VI_char> %0,%1,%2"
> [(set_attr "type" "veccmpfx")])
>
> +(define_insn "altivec_eqv1ti"
> + [(set (match_operand:V1TI 0 "altivec_register_operand" "=v")
> + (eq:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v")
> + (match_operand:V1TI 2 "altivec_register_operand" "v")))]
> + "TARGET_TI_VECTOR_OPS"
> + "vcmpequq %0,%1,%2"
> + [(set_attr "type" "veccmpfx")])
> +
> (define_insn "*altivec_gt<mode>"
> [(set (match_operand:VI2 0 "altivec_register_operand" "=v")
> (gt:VI2 (match_operand:VI2 1 "altivec_register_operand" "v")
> @@ -636,6 +648,14 @@
> "vcmpgts<VI_char> %0,%1,%2"
> [(set_attr "type" "veccmpfx")])
>
> +(define_insn "*altivec_gtv1ti"
> + [(set (match_operand:V1TI 0 "altivec_register_operand" "=v")
> + (gt:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v")
> + (match_operand:V1TI 2 "altivec_register_operand" "v")))]
> + "TARGET_TI_VECTOR_OPS"
> + "vcmpgtsq %0,%1,%2"
> + [(set_attr "type" "veccmpfx")])
> +
> (define_insn "*altivec_gtu<mode>"
> [(set (match_operand:VI2 0 "altivec_register_operand" "=v")
> (gtu:VI2 (match_operand:VI2 1 "altivec_register_operand" "v")
> @@ -644,6 +664,14 @@
> "vcmpgtu<VI_char> %0,%1,%2"
> [(set_attr "type" "veccmpfx")])
>
> +(define_insn "*altivec_gtuv1ti"
> + [(set (match_operand:V1TI 0 "altivec_register_operand" "=v")
> + (gtu:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v")
> + (match_operand:V1TI 2 "altivec_register_operand" "v")))]
> + "TARGET_TI_VECTOR_OPS"
> + "vcmpgtuq %0,%1,%2"
> + [(set_attr "type" "veccmpfx")])
> +
> (define_insn "*altivec_eqv4sf"
> [(set (match_operand:V4SF 0 "altivec_register_operand" "=v")
> (eq:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v")
> @@ -1687,6 +1715,19 @@
> DONE;
> })
>
> +(define_expand "vec_widen_umult_even_v2di"
> + [(use (match_operand:V1TI 0 "register_operand"))
> + (use (match_operand:V2DI 1 "register_operand"))
> + (use (match_operand:V2DI 2 "register_operand"))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + if (BYTES_BIG_ENDIAN)
> + emit_insn (gen_altivec_vmuleud (operands[0], operands[1], operands[2]));
> + else
> + emit_insn (gen_altivec_vmuloud (operands[0], operands[1], operands[2]));
> + DONE;
> +})
> +
> (define_expand "vec_widen_smult_even_v4si"
> [(use (match_operand:V2DI 0 "register_operand"))
> (use (match_operand:V4SI 1 "register_operand"))
> @@ -1695,11 +1736,24 @@
> {
> if (BYTES_BIG_ENDIAN)
> emit_insn (gen_altivec_vmulesw (operands[0], operands[1], operands[2]));
> - else
> + else
> emit_insn (gen_altivec_vmulosw (operands[0], operands[1], operands[2]));
> DONE;
> })
>
> +(define_expand "vec_widen_smult_even_v2di"
> + [(use (match_operand:V1TI 0 "register_operand"))
> + (use (match_operand:V2DI 1 "register_operand"))
> + (use (match_operand:V2DI 2 "register_operand"))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + if (BYTES_BIG_ENDIAN)
> + emit_insn (gen_altivec_vmulesd (operands[0], operands[1], operands[2]));
> + else
> + emit_insn (gen_altivec_vmulosd (operands[0], operands[1], operands[2]));
> + DONE;
> +})
> +
> (define_expand "vec_widen_umult_odd_v16qi"
> [(use (match_operand:V8HI 0 "register_operand"))
> (use (match_operand:V16QI 1 "register_operand"))
> @@ -1765,6 +1819,19 @@
> DONE;
> })
>
> +(define_expand "vec_widen_umult_odd_v2di"
> + [(use (match_operand:V1TI 0 "register_operand"))
> + (use (match_operand:V2DI 1 "register_operand"))
> + (use (match_operand:V2DI 2 "register_operand"))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + if (BYTES_BIG_ENDIAN)
> + emit_insn (gen_altivec_vmuloud (operands[0], operands[1], operands[2]));
> + else
> + emit_insn (gen_altivec_vmuleud (operands[0], operands[1], operands[2]));
> + DONE;
> +})
> +
> (define_expand "vec_widen_smult_odd_v4si"
> [(use (match_operand:V2DI 0 "register_operand"))
> (use (match_operand:V4SI 1 "register_operand"))
> @@ -1778,6 +1845,19 @@
> DONE;
> })
>
> +(define_expand "vec_widen_smult_odd_v2di"
> + [(use (match_operand:V1TI 0 "register_operand"))
> + (use (match_operand:V2DI 1 "register_operand"))
> + (use (match_operand:V2DI 2 "register_operand"))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + if (BYTES_BIG_ENDIAN)
> + emit_insn (gen_altivec_vmulosd (operands[0], operands[1], operands[2]));
> + else
> + emit_insn (gen_altivec_vmulesd (operands[0], operands[1], operands[2]));
> + DONE;
> +})
> +
> (define_insn "altivec_vmuleub"
> [(set (match_operand:V8HI 0 "register_operand" "=v")
> (unspec:V8HI [(match_operand:V16QI 1 "register_operand" "v")
> @@ -1859,6 +1939,15 @@
> "vmuleuw %0,%1,%2"
> [(set_attr "type" "veccomplex")])
>
> +(define_insn "altivec_vmuleud"
> + [(set (match_operand:V1TI 0 "register_operand" "=v")
> + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
> + (match_operand:V2DI 2 "register_operand" "v")]
> + UNSPEC_VMULEUD))]
> + "TARGET_TI_VECTOR_OPS"
> + "vmuleud %0,%1,%2"
> + [(set_attr "type" "veccomplex")])
> +
> (define_insn "altivec_vmulouw"
> [(set (match_operand:V2DI 0 "register_operand" "=v")
> (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v")
> @@ -1868,6 +1957,15 @@
> "vmulouw %0,%1,%2"
> [(set_attr "type" "veccomplex")])
>
> +(define_insn "altivec_vmuloud"
> + [(set (match_operand:V1TI 0 "register_operand" "=v")
> + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
> + (match_operand:V2DI 2 "register_operand" "v")]
> + UNSPEC_VMULOUD))]
> + "TARGET_TI_VECTOR_OPS"
> + "vmuloud %0,%1,%2"
> + [(set_attr "type" "veccomplex")])
> +
> (define_insn "altivec_vmulesw"
> [(set (match_operand:V2DI 0 "register_operand" "=v")
> (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v")
> @@ -1877,6 +1975,15 @@
> "vmulesw %0,%1,%2"
> [(set_attr "type" "veccomplex")])
>
> +(define_insn "altivec_vmulesd"
> + [(set (match_operand:V1TI 0 "register_operand" "=v")
> + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
> + (match_operand:V2DI 2 "register_operand" "v")]
> + UNSPEC_VMULESD))]
> + "TARGET_TI_VECTOR_OPS"
> + "vmulesd %0,%1,%2"
> + [(set_attr "type" "veccomplex")])
> +
> (define_insn "altivec_vmulosw"
> [(set (match_operand:V2DI 0 "register_operand" "=v")
> (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v")
> @@ -1886,6 +1993,15 @@
> "vmulosw %0,%1,%2"
> [(set_attr "type" "veccomplex")])
>
> +(define_insn "altivec_vmulosd"
> + [(set (match_operand:V1TI 0 "register_operand" "=v")
> + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
> + (match_operand:V2DI 2 "register_operand" "v")]
> + UNSPEC_VMULOSD))]
> + "TARGET_TI_VECTOR_OPS"
> + "vmulosd %0,%1,%2"
> + [(set_attr "type" "veccomplex")])
> +
> ;; Vector pack/unpack
> (define_insn "altivec_vpkpx"
> [(set (match_operand:V8HI 0 "register_operand" "=v")
> @@ -1979,6 +2095,15 @@
> "vrl<VI_char> %0,%1,%2"
> [(set_attr "type" "vecsimple")])
>
> +(define_insn "altivec_vrlq"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (rotate:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")))]
> + "TARGET_TI_VECTOR_OPS"
> +;; rotate amount in needs to be in bits[57:63] of operand2.
> + "vrlq %0,%1,%2"
> + [(set_attr "type" "vecsimple")])
> +
> (define_insn "altivec_vrl<VI_char>mi"
> [(set (match_operand:VIlong 0 "register_operand" "=v")
> (unspec:VIlong [(match_operand:VIlong 1 "register_operand" "0")
> @@ -1989,6 +2114,33 @@
> "vrl<VI_char>mi %0,%2,%3"
> [(set_attr "type" "veclogical")])
>
> +(define_expand "altivec_vrlqmi"
> + [(set (match_operand:V1TI 0 "vsx_register_operand")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand")
> + (match_operand:V1TI 2 "vsx_register_operand")
> + (match_operand:V1TI 3 "vsx_register_operand")]
> + UNSPEC_VRLMI))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + /* Mask bit begin, end fields need to be in bits [41:55] of 128-bit operand2. */
> + /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
> + rtx tmp = gen_reg_rtx (V1TImode);
> +
> + emit_insn(gen_xxswapd_v1ti (tmp, operands[3]));
> + emit_insn(gen_altivec_vrlqmi_inst (operands[0], operands[1], operands[2], tmp));
> + DONE;
> +})
> +
> +(define_insn "altivec_vrlqmi_inst"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "0")
> + (match_operand:V1TI 3 "vsx_register_operand" "v")]
> + UNSPEC_VRLMI))]
> + "TARGET_TI_VECTOR_OPS"
> + "vrlqmi %0,%1,%3"
> + [(set_attr "type" "veclogical")])
> +
> (define_insn "altivec_vrl<VI_char>nm"
> [(set (match_operand:VIlong 0 "register_operand" "=v")
> (unspec:VIlong [(match_operand:VIlong 1 "register_operand" "v")
> @@ -1998,6 +2150,31 @@
> "vrl<VI_char>nm %0,%1,%2"
> [(set_attr "type" "veclogical")])
>
> +(define_expand "altivec_vrlqnm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand")
> + (match_operand:V1TI 2 "vsx_register_operand")]
> + UNSPEC_VRLNM))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
> + rtx tmp = gen_reg_rtx (V1TImode);
> +
> + emit_insn(gen_xxswapd_v1ti (tmp, operands[2]));
> + emit_insn(gen_altivec_vrlqnm_inst (operands[0], operands[1], tmp));
> + DONE;
> +})
> +
> +(define_insn "altivec_vrlqnm_inst"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")]
> + UNSPEC_VRLNM))]
> + "TARGET_TI_VECTOR_OPS"
> + ;; rotate and mask bits need to be in upper 64-bits of operand2.
> + "vrlqnm %0,%1,%2"
> + [(set_attr "type" "veclogical")])
> +
> (define_insn "altivec_vsl"
> [(set (match_operand:V4SI 0 "register_operand" "=v")
> (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> @@ -2042,6 +2219,15 @@
> "vsl<VI_char> %0,%1,%2"
> [(set_attr "type" "vecsimple")])
>
> +(define_insn "altivec_vslq"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")))]
> + "TARGET_TI_VECTOR_OPS"
> + /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
> + "vslq %0,%1,%2"
> + [(set_attr "type" "vecsimple")])
> +
> (define_insn "*altivec_vsr<VI_char>"
> [(set (match_operand:VI2 0 "register_operand" "=v")
> (lshiftrt:VI2 (match_operand:VI2 1 "register_operand" "v")
> @@ -2050,6 +2236,15 @@
> "vsr<VI_char> %0,%1,%2"
> [(set_attr "type" "vecsimple")])
>
> +(define_insn "altivec_vsrq"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")))]
> + "TARGET_TI_VECTOR_OPS"
> + /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
> + "vsrq %0,%1,%2"
> + [(set_attr "type" "vecsimple")])
> +
> (define_insn "*altivec_vsra<VI_char>"
> [(set (match_operand:VI2 0 "register_operand" "=v")
> (ashiftrt:VI2 (match_operand:VI2 1 "register_operand" "v")
> @@ -2058,6 +2253,15 @@
> "vsra<VI_char> %0,%1,%2"
> [(set_attr "type" "vecsimple")])
>
> +(define_insn "altivec_vsraq"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (ashiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")))]
> + "TARGET_TI_VECTOR_OPS"
> + /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
> + "vsraq %0,%1,%2"
> + [(set_attr "type" "vecsimple")])
> +
> (define_insn "altivec_vsr"
> [(set (match_operand:V4SI 0 "register_operand" "=v")
> (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> @@ -2618,6 +2822,18 @@
> "vcmpequ<VI_char>. %0,%1,%2"
> [(set_attr "type" "veccmpfx")])
>
> +(define_insn "altivec_vcmpequt_p"
> + [(set (reg:CC CR6_REGNO)
> + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand" "v")
> + (match_operand:V1TI 2 "altivec_register_operand" "v"))]
> + UNSPEC_PREDICATE))
> + (set (match_operand:V1TI 0 "altivec_register_operand" "=v")
> + (eq:V1TI (match_dup 1)
> + (match_dup 2)))]
> + "TARGET_TI_VECTOR_OPS"
> + "vcmpequq. %0,%1,%2"
> + [(set_attr "type" "veccmpfx")])
> +
> (define_insn "*altivec_vcmpgts<VI_char>_p"
> [(set (reg:CC CR6_REGNO)
> (unspec:CC [(gt:CC (match_operand:VI2 1 "register_operand" "v")
> @@ -2630,6 +2846,18 @@
> "vcmpgts<VI_char>. %0,%1,%2"
> [(set_attr "type" "veccmpfx")])
>
> +(define_insn "*altivec_vcmpgtst_p"
> + [(set (reg:CC CR6_REGNO)
> + (unspec:CC [(gt:CC (match_operand:V1TI 1 "register_operand" "v")
> + (match_operand:V1TI 2 "register_operand" "v"))]
> + UNSPEC_PREDICATE))
> + (set (match_operand:V1TI 0 "register_operand" "=v")
> + (gt:V1TI (match_dup 1)
> + (match_dup 2)))]
> + "TARGET_TI_VECTOR_OPS"
> + "vcmpgtsq. %0,%1,%2"
> + [(set_attr "type" "veccmpfx")])
> +
> (define_insn "*altivec_vcmpgtu<VI_char>_p"
> [(set (reg:CC CR6_REGNO)
> (unspec:CC [(gtu:CC (match_operand:VI2 1 "register_operand" "v")
> @@ -2642,6 +2870,18 @@
> "vcmpgtu<VI_char>. %0,%1,%2"
> [(set_attr "type" "veccmpfx")])
>
> +(define_insn "*altivec_vcmpgtut_p"
> + [(set (reg:CC CR6_REGNO)
> + (unspec:CC [(gtu:CC (match_operand:V1TI 1 "register_operand" "v")
> + (match_operand:V1TI 2 "register_operand" "v"))]
> + UNSPEC_PREDICATE))
> + (set (match_operand:V1TI 0 "register_operand" "=v")
> + (gtu:V1TI (match_dup 1)
> + (match_dup 2)))]
> + "TARGET_TI_VECTOR_OPS"
> + "vcmpgtuq. %0,%1,%2"
> + [(set_attr "type" "veccmpfx")])
> +
> (define_insn "*altivec_vcmpeqfp_p"
> [(set (reg:CC CR6_REGNO)
> (unspec:CC [(eq:CC (match_operand:V4SF 1 "register_operand" "v")
> diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
> index 667c2450d41..871da6c4cf7 100644
> --- a/gcc/config/rs6000/rs6000-builtin.def
> +++ b/gcc/config/rs6000/rs6000-builtin.def
> @@ -1070,6 +1070,15 @@
> | RS6000_BTC_UNARY), \
> CODE_FOR_ ## ICODE) /* ICODE */
>
> +
> +#define BU_P10_P(ENUM, NAME, ATTR, ICODE) \
> + RS6000_BUILTIN_P (P10_BUILTIN_ ## ENUM, /* ENUM */ \
> + "__builtin_altivec_" NAME, /* NAME */ \
> + RS6000_BTM_P10_128BIT, /* MASK */ \
> + (RS6000_BTC_ ## ATTR /* ATTR */ \
> + | RS6000_BTC_PREDICATE), \
> + CODE_FOR_ ## ICODE) /* ICODE */
> +
> #define BU_P10_OVERLOAD_1(ENUM, NAME) \
> RS6000_BUILTIN_1 (P10_BUILTIN_VEC_ ## ENUM, /* ENUM */ \
> "__builtin_vec_" NAME, /* NAME */ \
> @@ -1152,6 +1161,30 @@
> (RS6000_BTC_ ## ATTR /* ATTR */ \
> | RS6000_BTC_BINARY), \
> CODE_FOR_ ## ICODE) /* ICODE */
> +
> +#define BU_P10_128BIT_1(ENUM, NAME, ATTR, ICODE) \
> + RS6000_BUILTIN_1 (P10_BUILTIN_128BIT_ ## ENUM, /* ENUM */ \
> + "__builtin_altivec_" NAME, /* NAME */ \
> + RS6000_BTM_P10_128BIT, /* MASK */ \
> + (RS6000_BTC_ ## ATTR /* ATTR */ \
> + | RS6000_BTC_UNARY), \
> + CODE_FOR_ ## ICODE) /* ICODE */
> +
> +#define BU_P10_128BIT_2(ENUM, NAME, ATTR, ICODE) \
> + RS6000_BUILTIN_2 (P10_BUILTIN_128BIT_ ## ENUM, /* ENUM */ \
> + "__builtin_altivec_" NAME, /* NAME */ \
> + RS6000_BTM_P10_128BIT, /* MASK */ \
> + (RS6000_BTC_ ## ATTR /* ATTR */ \
> + | RS6000_BTC_BINARY), \
> + CODE_FOR_ ## ICODE) /* ICODE */
> +
> +#define BU_P10_128BIT_3(ENUM, NAME, ATTR, ICODE) \
> + RS6000_BUILTIN_3 (P10_BUILTIN_128BIT_ ## ENUM, /* ENUM */ \
> + "__builtin_altivec_" NAME, /* NAME */ \
> + RS6000_BTM_P10_128BIT, /* MASK */ \
> + (RS6000_BTC_ ## ATTR /* ATTR */ \
> + | RS6000_BTC_TERNARY), \
> + CODE_FOR_ ## ICODE) /* ICODE */
> #endif
>
>
> @@ -2712,6 +2745,10 @@ BU_P9V_AV_1 (VSIGNEXTSH2D, "vsignextsh2d", CONST, vsx_sign_extend_hi_v2di)
> BU_P9V_AV_1 (VSIGNEXTSW2D, "vsignextsw2d", CONST, vsx_sign_extend_si_v2di)
>
> /* Builtins for scalar instructions added in ISA 3.1 (power10). */
> +BU_P10_P (VCMPEQUT_P, "vcmpequt_p", CONST, vector_eq_v1ti_p)
> +BU_P10_P (VCMPGTST_P, "vcmpgtst_p", CONST, vector_gt_v1ti_p)
> +BU_P10_P (VCMPGTUT_P, "vcmpgtut_p", CONST, vector_gtu_v1ti_p)
> +
> BU_P10_MISC_2 (CFUGED, "cfuged", CONST, cfuged)
> BU_P10_MISC_2 (CNTLZDM, "cntlzdm", CONST, cntlzdm)
> BU_P10_MISC_2 (CNTTZDM, "cnttzdm", CONST, cnttzdm)
> @@ -2733,6 +2770,39 @@ BU_P10V_2 (XXGENPCVM_V8HI, "xxgenpcvm_v8hi", CONST, xxgenpcvm_v8hi)
> BU_P10V_2 (XXGENPCVM_V4SI, "xxgenpcvm_v4si", CONST, xxgenpcvm_v4si)
> BU_P10V_2 (XXGENPCVM_V2DI, "xxgenpcvm_v2di", CONST, xxgenpcvm_v2di)
>
> +BU_P10V_2 (VCMPGTUT, "vcmpgtut", CONST, vector_gtuv1ti)
> +BU_P10V_2 (VCMPGTST, "vcmpgtst", CONST, vector_gtv1ti)
> +BU_P10V_2 (VCMPEQUT, "vcmpequt", CONST, vector_eqv1ti)
> +BU_P10V_2 (CMPNET, "vcmpnet", CONST, vcmpnet)
> +BU_P10V_2 (CMPGE_1TI, "cmpge_1ti", CONST, vector_nltv1ti)
> +BU_P10V_2 (CMPGE_U1TI, "cmpge_u1ti", CONST, vector_nltuv1ti)
> +BU_P10V_2 (CMPLE_1TI, "cmple_1ti", CONST, vector_ngtv1ti)
> +BU_P10V_2 (CMPLE_U1TI, "cmple_u1ti", CONST, vector_ngtuv1ti)
> +BU_P10V_2 (VNOR_V1TI_UNS, "vnor_v1ti_uns",CONST, norv1ti3)
> +BU_P10V_2 (VNOR_V1TI, "vnor_v1ti", CONST, norv1ti3)
> +BU_P10V_2 (VCMPNET_P, "vcmpnet_p", CONST, vector_ne_v1ti_p)
> +BU_P10V_2 (VCMPAET_P, "vcmpaet_p", CONST, vector_ae_v1ti_p)
> +
> +BU_P10_128BIT_1 (VSIGNEXTSD2Q, "vsignext", CONST, vsx_sign_extend_v2di_v1ti)
> +
> +BU_P10_128BIT_2 (VMULEUD, "vmuleud", CONST, vec_widen_umult_even_v2di)
> +BU_P10_128BIT_2 (VMULESD, "vmulesd", CONST, vec_widen_smult_even_v2di)
> +BU_P10_128BIT_2 (VMULOUD, "vmuloud", CONST, vec_widen_umult_odd_v2di)
> +BU_P10_128BIT_2 (VMULOSD, "vmulosd", CONST, vec_widen_smult_odd_v2di)
> +BU_P10_128BIT_2 (VRLQ, "vrlq", CONST, vrotlv1ti3)
> +BU_P10_128BIT_2 (VSLQ, "vslq", CONST, vashlv1ti3)
> +BU_P10_128BIT_2 (VSRQ, "vsrq", CONST, vlshrv1ti3)
> +BU_P10_128BIT_2 (VSRAQ, "vsraq", CONST, vashrv1ti3)
> +BU_P10_128BIT_2 (VRLQNM, "vrlqnm", CONST, altivec_vrlqnm)
> +BU_P10_128BIT_2 (DIV_V1TI, "div_1ti", CONST, vsx_div_v1ti)
> +BU_P10_128BIT_2 (UDIV_V1TI, "udiv_1ti", CONST, vsx_udiv_v1ti)
> +BU_P10_128BIT_2 (DIVES_V1TI, "dives", CONST, vsx_dives_v1ti)
> +BU_P10_128BIT_2 (DIVEU_V1TI, "diveu", CONST, vsx_diveu_v1ti)
> +BU_P10_128BIT_2 (MODS_V1TI, "mods", CONST, vsx_mods_v1ti)
> +BU_P10_128BIT_2 (MODU_V1TI, "modu", CONST, vsx_modu_v1ti)
> +
> +BU_P10_128BIT_3 (VRLQMI, "vrlqmi", CONST, altivec_vrlqmi)
> +
> BU_P10V_3 (VEXTRACTBL, "vextdubvlx", CONST, vextractlv16qi)
> BU_P10V_3 (VEXTRACTHL, "vextduhvlx", CONST, vextractlv8hi)
> BU_P10V_3 (VEXTRACTWL, "vextduwvlx", CONST, vextractlv4si)
> @@ -2839,6 +2909,12 @@ BU_P10_OVERLOAD_2 (CLRR, "clrr")
> BU_P10_OVERLOAD_2 (GNB, "gnb")
> BU_P10_OVERLOAD_4 (XXEVAL, "xxeval")
> BU_P10_OVERLOAD_2 (XXGENPCVM, "xxgenpcvm")
> +BU_P10_OVERLOAD_2 (VRLQ, "vrlq")
> +BU_P10_OVERLOAD_2 (VSLQ, "vslq")
> +BU_P10_OVERLOAD_2 (VSRQ, "vsrq")
> +BU_P10_OVERLOAD_2 (VSRAQ, "vsraq")
> +BU_P10_OVERLOAD_2 (DIVE, "dive")
> +BU_P10_OVERLOAD_2 (MOD, "mod")
>
> BU_P10_OVERLOAD_3 (EXTRACTL, "extractl")
> BU_P10_OVERLOAD_3 (EXTRACTH, "extracth")
> @@ -2854,6 +2930,7 @@ BU_P10_OVERLOAD_1 (VSTRIL, "stril")
>
> BU_P10_OVERLOAD_1 (VSTRIR_P, "strir_p")
> BU_P10_OVERLOAD_1 (VSTRIL_P, "stril_p")
> +BU_P10_OVERLOAD_1 (SIGNEXT, "vsignextq")
>
> BU_P10_OVERLOAD_1 (XVTLSBB_ZEROS, "xvtlsbb_all_zeros")
> BU_P10_OVERLOAD_1 (XVTLSBB_ONES, "xvtlsbb_all_ones")
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 87699be8a07..2bd6412a502 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -839,6 +839,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
> { ALTIVEC_BUILTIN_VEC_CMPEQ, P8V_BUILTIN_VCMPEQUD,
> RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> + { ALTIVEC_BUILTIN_VEC_CMPEQ, P10_BUILTIN_VCMPEQUT,
> + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
> + { ALTIVEC_BUILTIN_VEC_CMPEQ, P10_BUILTIN_VCMPEQUT,
> + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
> { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQFP,
> RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
> { ALTIVEC_BUILTIN_VEC_CMPEQ, VSX_BUILTIN_XVCMPEQDP,
> @@ -885,6 +889,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> { ALTIVEC_BUILTIN_VEC_CMPGE, VSX_BUILTIN_CMPGE_U2DI,
> RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI,
> RS6000_BTI_unsigned_V2DI, 0},
> + { ALTIVEC_BUILTIN_VEC_CMPGE, P10_BUILTIN_CMPGE_1TI,
> + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0},
> + { ALTIVEC_BUILTIN_VEC_CMPGE, P10_BUILTIN_CMPGE_U1TI,
> + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0},
> { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTUB,
> RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
> { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTSB,
> @@ -899,8 +908,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
> { ALTIVEC_BUILTIN_VEC_CMPGT, P8V_BUILTIN_VCMPGTUD,
> RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> + { ALTIVEC_BUILTIN_VEC_CMPGT, P10_BUILTIN_VCMPGTUT,
> + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
> { ALTIVEC_BUILTIN_VEC_CMPGT, P8V_BUILTIN_VCMPGTSD,
> RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
> + { ALTIVEC_BUILTIN_VEC_CMPGT, P10_BUILTIN_VCMPGTST,
> + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
> { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTFP,
> RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
> { ALTIVEC_BUILTIN_VEC_CMPGT, VSX_BUILTIN_XVCMPGTDP,
> @@ -943,6 +956,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> { ALTIVEC_BUILTIN_VEC_CMPLE, VSX_BUILTIN_CMPLE_U2DI,
> RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI,
> RS6000_BTI_unsigned_V2DI, 0},
> + { ALTIVEC_BUILTIN_VEC_CMPLE, P10_BUILTIN_CMPLE_1TI,
> + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0},
> + { ALTIVEC_BUILTIN_VEC_CMPLE, P10_BUILTIN_CMPLE_U1TI,
> + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0},
> { ALTIVEC_BUILTIN_VEC_CMPLT, ALTIVEC_BUILTIN_VCMPGTUB,
> RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
> { ALTIVEC_BUILTIN_VEC_CMPLT, ALTIVEC_BUILTIN_VCMPGTSB,
> @@ -995,6 +1013,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
> { VSX_BUILTIN_VEC_DIV, VSX_BUILTIN_UDIV_V2DI,
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> + { VSX_BUILTIN_VEC_DIV, P10_BUILTIN_128BIT_DIV_V1TI,
> + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
> + { VSX_BUILTIN_VEC_DIV, P10_BUILTIN_128BIT_UDIV_V1TI,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0 },
> +
> { VSX_BUILTIN_VEC_DOUBLE, VSX_BUILTIN_XVCVSXDDP,
> RS6000_BTI_V2DF, RS6000_BTI_V2DI, 0, 0 },
> { VSX_BUILTIN_VEC_DOUBLE, VSX_BUILTIN_XVCVUXDDP,
> @@ -1789,6 +1813,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> { ALTIVEC_BUILTIN_VEC_MULE, P8V_BUILTIN_VMULEUW,
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI,
> RS6000_BTI_unsigned_V4SI, 0 },
> + { ALTIVEC_BUILTIN_VEC_MULE, P10_BUILTIN_128BIT_VMULESD,
> + RS6000_BTI_V1TI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
> + { ALTIVEC_BUILTIN_VEC_MULE, P10_BUILTIN_128BIT_VMULEUD,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V2DI,
> + RS6000_BTI_unsigned_V2DI, 0 },
> +
> { ALTIVEC_BUILTIN_VEC_VMULEUB, ALTIVEC_BUILTIN_VMULEUB,
> RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
> { ALTIVEC_BUILTIN_VEC_VMULESB, ALTIVEC_BUILTIN_VMULESB,
> @@ -1812,6 +1842,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> { ALTIVEC_BUILTIN_VEC_MULO, P8V_BUILTIN_VMULOUW,
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI,
> RS6000_BTI_unsigned_V4SI, 0 },
> + { ALTIVEC_BUILTIN_VEC_MULO, P10_BUILTIN_128BIT_VMULOSD,
> + RS6000_BTI_V1TI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
> + { ALTIVEC_BUILTIN_VEC_MULO, P10_BUILTIN_128BIT_VMULOUD,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V2DI,
> + RS6000_BTI_unsigned_V2DI, 0 },
> { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSH,
> RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 },
> { ALTIVEC_BUILTIN_VEC_VMULOSH, ALTIVEC_BUILTIN_VMULOSH,
> @@ -1860,6 +1895,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI, 0 },
> { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V2DI_UNS,
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> + { ALTIVEC_BUILTIN_VEC_NOR, P10_BUILTIN_VNOR_V1TI,
> + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_bool_V1TI, 0 },
> + { ALTIVEC_BUILTIN_VEC_NOR, P10_BUILTIN_VNOR_V1TI,
> + RS6000_BTI_V1TI, RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, 0 },
> + { ALTIVEC_BUILTIN_VEC_NOR, P10_BUILTIN_VNOR_V1TI_UNS,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
> + { ALTIVEC_BUILTIN_VEC_NOR, P10_BUILTIN_VNOR_V1TI_UNS,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_bool_V1TI, 0 },
> + { ALTIVEC_BUILTIN_VEC_NOR, P10_BUILTIN_VNOR_V1TI_UNS,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
> { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V2DI_UNS,
> RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 0 },
> { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V4SI,
> @@ -2115,6 +2160,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> { ALTIVEC_BUILTIN_VEC_RL, P8V_BUILTIN_VRLD,
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> + { ALTIVEC_BUILTIN_VEC_RL, P10_BUILTIN_128BIT_VRLQ,
> + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
> + { ALTIVEC_BUILTIN_VEC_RL, P10_BUILTIN_128BIT_VRLQ,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0 },
> { ALTIVEC_BUILTIN_VEC_VRLW, ALTIVEC_BUILTIN_VRLW,
> RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 },
> { ALTIVEC_BUILTIN_VEC_VRLW, ALTIVEC_BUILTIN_VRLW,
> @@ -2133,12 +2183,23 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> { P9V_BUILTIN_VEC_RLMI, P9V_BUILTIN_VRLDMI,
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI },
> + { P9V_BUILTIN_VEC_RLMI, P10_BUILTIN_128BIT_VRLQMI,
> + RS6000_BTI_V1TI, RS6000_BTI_V1TI,
> + RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI },
> + { P9V_BUILTIN_VEC_RLMI, P10_BUILTIN_128BIT_VRLQMI,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI },
> { P9V_BUILTIN_VEC_RLNM, P9V_BUILTIN_VRLWNM,
> RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI,
> RS6000_BTI_unsigned_V4SI, 0 },
> { P9V_BUILTIN_VEC_RLNM, P9V_BUILTIN_VRLDNM,
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
> RS6000_BTI_unsigned_V2DI, 0 },
> + { P9V_BUILTIN_VEC_RLNM, P10_BUILTIN_128BIT_VRLQNM,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0 },
> + { P9V_BUILTIN_VEC_RLNM, P10_BUILTIN_128BIT_VRLQNM,
> + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
> { ALTIVEC_BUILTIN_VEC_SL, ALTIVEC_BUILTIN_VSLB,
> RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
> { ALTIVEC_BUILTIN_VEC_SL, ALTIVEC_BUILTIN_VSLB,
> @@ -2155,6 +2216,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> { ALTIVEC_BUILTIN_VEC_SL, P8V_BUILTIN_VSLD,
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> + { ALTIVEC_BUILTIN_VEC_SL, P10_BUILTIN_128BIT_VSLQ,
> + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
> + { ALTIVEC_BUILTIN_VEC_SL, P10_BUILTIN_128BIT_VSLQ,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0 },
> { ALTIVEC_BUILTIN_VEC_SQRT, VSX_BUILTIN_XVSQRTDP,
> RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0, 0 },
> { ALTIVEC_BUILTIN_VEC_SQRT, VSX_BUILTIN_XVSQRTSP,
> @@ -2351,6 +2417,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> { ALTIVEC_BUILTIN_VEC_SR, P8V_BUILTIN_VSRD,
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> + { ALTIVEC_BUILTIN_VEC_SR, P10_BUILTIN_128BIT_VSRQ,
> + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
> + { ALTIVEC_BUILTIN_VEC_SR, P10_BUILTIN_128BIT_VSRQ,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0 },
> { ALTIVEC_BUILTIN_VEC_VSRW, ALTIVEC_BUILTIN_VSRW,
> RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 },
> { ALTIVEC_BUILTIN_VEC_VSRW, ALTIVEC_BUILTIN_VSRW,
> @@ -2379,6 +2450,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> { ALTIVEC_BUILTIN_VEC_SRA, P8V_BUILTIN_VSRAD,
> RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
> + { ALTIVEC_BUILTIN_VEC_SRA, P10_BUILTIN_128BIT_VSRAQ,
> + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
> + { ALTIVEC_BUILTIN_VEC_SRA, P10_BUILTIN_128BIT_VSRAQ,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0 },
> { ALTIVEC_BUILTIN_VEC_VSRAW, ALTIVEC_BUILTIN_VSRAW,
> RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 },
> { ALTIVEC_BUILTIN_VEC_VSRAW, ALTIVEC_BUILTIN_VSRAW,
> @@ -3996,12 +4072,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI },
> { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTUD_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI },
> + { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P10_BUILTIN_VCMPGTUT_P,
> + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI },
> { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI },
> { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI },
> { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI },
> + { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P10_BUILTIN_VCMPGTST_P,
> + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI },
> { ALTIVEC_BUILTIN_VEC_VCMPGT_P, ALTIVEC_BUILTIN_VCMPGTFP_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF },
> { ALTIVEC_BUILTIN_VEC_VCMPGT_P, VSX_BUILTIN_XVCMPGTDP_P,
> @@ -4066,6 +4146,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI },
> { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P8V_BUILTIN_VCMPEQUD_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI },
> + { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P10_BUILTIN_VCMPEQUT_P,
> + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI },
> + { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P10_BUILTIN_VCMPEQUT_P,
> + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI },
> { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, ALTIVEC_BUILTIN_VCMPEQFP_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF },
> { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, VSX_BUILTIN_XVCMPEQDP_P,
> @@ -4117,12 +4201,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI },
> { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTUD_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI },
> + { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P10_BUILTIN_VCMPGTUT_P,
> + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI },
> { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI },
> { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI },
> { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI },
> + { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P10_BUILTIN_VCMPGTST_P,
> + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI },
> { ALTIVEC_BUILTIN_VEC_VCMPGE_P, ALTIVEC_BUILTIN_VCMPGEFP_P,
> RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF },
> { ALTIVEC_BUILTIN_VEC_VCMPGE_P, VSX_BUILTIN_XVCMPGEDP_P,
> @@ -4771,6 +4859,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> { ALTIVEC_BUILTIN_VEC_CMPNE, P9V_BUILTIN_CMPNEW,
> RS6000_BTI_bool_V4SI, RS6000_BTI_unsigned_V4SI,
> RS6000_BTI_unsigned_V4SI, 0 },
> + { ALTIVEC_BUILTIN_VEC_CMPNE, P10_BUILTIN_CMPNET,
> + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI,
> + RS6000_BTI_V1TI, 0 },
> + { ALTIVEC_BUILTIN_VEC_CMPNE, P10_BUILTIN_CMPNET,
> + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0 },
>
> /* The following 2 entries have been deprecated. */
> { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNEB_P,
> @@ -4856,8 +4950,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_bool_V2DI, 0 },
> { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNED_P,
> RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI,
> - RS6000_BTI_unsigned_V2DI, 0
> - },
> + RS6000_BTI_unsigned_V2DI, 0 },
> + { P9V_BUILTIN_VEC_VCMPNE_P, P10_BUILTIN_VCMPNET_P,
> + RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0 },
>
> /* The following 2 entries have been deprecated. */
> { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNED_P,
> @@ -4871,6 +4967,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNED_P,
> RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI,
> RS6000_BTI_bool_V2DI, 0 },
> + { P9V_BUILTIN_VEC_VCMPNE_P, P10_BUILTIN_VCMPNET_P,
> + RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
>
> { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNEFP_P,
> RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
> @@ -4961,8 +5059,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> RS6000_BTI_bool_V2DI, 0 },
> { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAED_P,
> RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI,
> - RS6000_BTI_unsigned_V2DI, 0
> - },
> + RS6000_BTI_unsigned_V2DI, 0 },
> + { P9V_BUILTIN_VEC_VCMPAE_P, P10_BUILTIN_VCMPAET_P,
> + RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0 },
>
> /* The following 2 entries have been deprecated. */
> { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAED_P,
> @@ -4976,7 +5076,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAED_P,
> RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI,
> RS6000_BTI_bool_V2DI, 0 },
> -
> + { P9V_BUILTIN_VEC_VCMPAE_P, P10_BUILTIN_VCMPAET_P,
> + RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
> { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEFP_P,
> RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
> { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEDP_P,
> @@ -5903,6 +6004,21 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> { P10_BUILTIN_VEC_XVTLSBB_ONES, P10_BUILTIN_XVTLSBB_ONES,
> RS6000_BTI_INTSI, RS6000_BTI_unsigned_V16QI, 0, 0 },
>
> + { P10_BUILTIN_VEC_SIGNEXT, P10_BUILTIN_128BIT_VSIGNEXTSD2Q,
> + RS6000_BTI_V1TI, RS6000_BTI_V2DI, 0, 0 },
> +
> + { P10_BUILTIN_VEC_DIVE, P10_BUILTIN_128BIT_DIVES_V1TI,
> + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
> + { P10_BUILTIN_VEC_DIVE, P10_BUILTIN_128BIT_DIVEU_V1TI,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0 },
> +
> + { P10_BUILTIN_VEC_MOD, P10_BUILTIN_128BIT_MODS_V1TI,
> + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
> + { P10_BUILTIN_VEC_MOD, P10_BUILTIN_128BIT_MODU_V1TI,
> + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
> + RS6000_BTI_unsigned_V1TI, 0 },
> +
> { RS6000_BUILTIN_NONE, RS6000_BUILTIN_NONE, 0, 0, 0, 0 }
> };
>
> @@ -12228,12 +12344,14 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> case ALTIVEC_BUILTIN_VCMPEQUH:
> case ALTIVEC_BUILTIN_VCMPEQUW:
> case P8V_BUILTIN_VCMPEQUD:
> + case P10_BUILTIN_VCMPEQUT:
> fold_compare_helper (gsi, EQ_EXPR, stmt);
> return true;
>
> case P9V_BUILTIN_CMPNEB:
> case P9V_BUILTIN_CMPNEH:
> case P9V_BUILTIN_CMPNEW:
> + case P10_BUILTIN_CMPNET:
> fold_compare_helper (gsi, NE_EXPR, stmt);
> return true;
>
> @@ -12245,6 +12363,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> case VSX_BUILTIN_CMPGE_U4SI:
> case VSX_BUILTIN_CMPGE_2DI:
> case VSX_BUILTIN_CMPGE_U2DI:
> + case P10_BUILTIN_CMPGE_1TI:
> + case P10_BUILTIN_CMPGE_U1TI:
> fold_compare_helper (gsi, GE_EXPR, stmt);
> return true;
>
> @@ -12256,6 +12376,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> case ALTIVEC_BUILTIN_VCMPGTUW:
> case P8V_BUILTIN_VCMPGTUD:
> case P8V_BUILTIN_VCMPGTSD:
> + case P10_BUILTIN_VCMPGTUT:
> + case P10_BUILTIN_VCMPGTST:
> fold_compare_helper (gsi, GT_EXPR, stmt);
> return true;
>
> @@ -12267,6 +12389,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> case VSX_BUILTIN_CMPLE_U4SI:
> case VSX_BUILTIN_CMPLE_2DI:
> case VSX_BUILTIN_CMPLE_U2DI:
> + case P10_BUILTIN_CMPLE_1TI:
> + case P10_BUILTIN_CMPLE_U1TI:
> fold_compare_helper (gsi, LE_EXPR, stmt);
> return true;
>
> @@ -12978,6 +13102,8 @@ rs6000_init_builtins (void)
> ? "__vector __bool long"
> : "__vector __bool long long",
> bool_long_long_type_node, 2);
> + bool_V1TI_type_node = rs6000_vector_type ("__vector __bool __int128",
> + intTI_type_node, 1);
> pixel_V8HI_type_node = rs6000_vector_type ("__vector __pixel",
> pixel_type_node, 8);
>
> @@ -13163,6 +13289,10 @@ altivec_init_builtins (void)
> = build_function_type_list (integer_type_node,
> integer_type_node, V2DI_type_node,
> V2DI_type_node, NULL_TREE);
> + tree int_ftype_int_v1ti_v1ti
> + = build_function_type_list (integer_type_node,
> + integer_type_node, V1TI_type_node,
> + V1TI_type_node, NULL_TREE);
> tree void_ftype_v4si
> = build_function_type_list (void_type_node, V4SI_type_node, NULL_TREE);
> tree v8hi_ftype_void
> @@ -13515,6 +13645,9 @@ altivec_init_builtins (void)
> case E_VOIDmode:
> type = int_ftype_int_opaque_opaque;
> break;
> + case E_V1TImode:
> + type = int_ftype_int_v1ti_v1ti;
> + break;
> case E_V2DImode:
> type = int_ftype_int_v2di_v2di;
> break;
> @@ -14114,6 +14247,10 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
> case P10_BUILTIN_XXGENPCVM_V8HI:
> case P10_BUILTIN_XXGENPCVM_V4SI:
> case P10_BUILTIN_XXGENPCVM_V2DI:
> + case P10_BUILTIN_128BIT_VMULEUD:
> + case P10_BUILTIN_128BIT_VMULOUD:
> + case P10_BUILTIN_128BIT_DIVEU_V1TI:
> + case P10_BUILTIN_128BIT_MODU_V1TI:
> h.uns_p[0] = 1;
> h.uns_p[1] = 1;
> h.uns_p[2] = 1;
> @@ -14213,10 +14350,13 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
> case VSX_BUILTIN_CMPGE_U8HI:
> case VSX_BUILTIN_CMPGE_U4SI:
> case VSX_BUILTIN_CMPGE_U2DI:
> + case P10_BUILTIN_CMPGE_U1TI:
> case ALTIVEC_BUILTIN_VCMPGTUB:
> case ALTIVEC_BUILTIN_VCMPGTUH:
> case ALTIVEC_BUILTIN_VCMPGTUW:
> case P8V_BUILTIN_VCMPGTUD:
> + case P10_BUILTIN_VCMPGTUT:
> + case P10_BUILTIN_VCMPEQUT:
> h.uns_p[1] = 1;
> h.uns_p[2] = 1;
> break;
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 40ee0a695f1..1fa4a527f12 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -3401,7 +3401,9 @@ rs6000_builtin_mask_calculate (void)
> | ((TARGET_FLOAT128_TYPE) ? RS6000_BTM_FLOAT128 : 0)
> | ((TARGET_FLOAT128_HW) ? RS6000_BTM_FLOAT128_HW : 0)
> | ((TARGET_MMA) ? RS6000_BTM_MMA : 0)
> - | ((TARGET_POWER10) ? RS6000_BTM_P10 : 0));
> + | ((TARGET_POWER10) ? RS6000_BTM_P10 : 0)
> + | ((TARGET_TI_VECTOR_OPS) ? RS6000_BTM_TI_VECTOR_OPS : 0));
> +
> }
>
> /* Implement TARGET_MD_ASM_ADJUST. All asm statements are considered
> @@ -3732,6 +3734,17 @@ rs6000_option_override_internal (bool global_init_p)
> if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
> rs6000_print_isa_options (stderr, 0, "before defaults", rs6000_isa_flags);
>
> + /* The -mti-vector-ops option requires ISA 3.1 support and -maltivec for
> + the 128-bit instructions. Currently, TARGET_POWER10 is sufficient to
> + enable it by default. */
> + if (TARGET_POWER10)
> + {
> + if (rs6000_isa_flags_explicit & OPTION_MASK_VSX)
> + warning(0, ("%<-mno-altivec%> disables -mti-vector-ops (128-bit integer vector register operations)."));
> + else
> + rs6000_isa_flags |= OPTION_MASK_TI_VECTOR_OPS;
> + }
It seems odd that -maltivec is explicitly called out here; it has been enabled
by default for quite a while at this point.  (Note also that the condition
actually tests OPTION_MASK_VSX while the warning text names -mno-altivec.)
> +
> /* Handle explicit -mno-{altivec,vsx,power8-vector,power9-vector} and turn
> off all of the options that depend on those flags. */
> ignore_masks = rs6000_disable_incompatible_switches ();
> @@ -19489,6 +19502,7 @@ rs6000_handle_altivec_attribute (tree *node,
> case 'b':
> switch (mode)
> {
> + case E_TImode: case E_V1TImode: result = bool_V1TI_type_node; break;
> case E_DImode: case E_V2DImode: result = bool_V2DI_type_node; break;
> case E_SImode: case E_V4SImode: result = bool_V4SI_type_node; break;
> case E_HImode: case E_V8HImode: result = bool_V8HI_type_node; break;
> @@ -23218,6 +23232,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
> { "float128-hardware", OPTION_MASK_FLOAT128_HW, false, true },
> { "fprnd", OPTION_MASK_FPRND, false, true },
> { "power10", OPTION_MASK_POWER10, false, true },
> + { "ti-vector-ops", OPTION_MASK_TI_VECTOR_OPS, false, true },
> { "hard-dfp", OPTION_MASK_DFP, false, true },
> { "htm", OPTION_MASK_HTM, false, true },
> { "isel", OPTION_MASK_ISEL, false, true },
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index bbd8060e143..da84abde671 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -539,6 +539,7 @@ extern int rs6000_vector_align[];
> #define MASK_UPDATE OPTION_MASK_UPDATE
> #define MASK_VSX OPTION_MASK_VSX
> #define MASK_POWER10 OPTION_MASK_POWER10
> +#define MASK_TI_VECTOR_OPS OPTION_MASK_TI_VECTOR_OPS
>
> #ifndef IN_LIBGCC2
> #define MASK_POWERPC64 OPTION_MASK_POWERPC64
> @@ -2305,6 +2306,7 @@ extern int frame_pointer_needed;
> #define RS6000_BTM_P8_VECTOR MASK_P8_VECTOR /* ISA 2.07 vector. */
> #define RS6000_BTM_P9_VECTOR MASK_P9_VECTOR /* ISA 3.0 vector. */
> #define RS6000_BTM_P9_MISC MASK_P9_MISC /* ISA 3.0 misc. non-vector */
> +#define RS6000_BTM_P10_128BIT MASK_POWER10 /* ISA P10 vector. */
Should the comment say something about 128-bit support, rather than just
"P10 vector"?
> #define RS6000_BTM_CRYPTO MASK_CRYPTO /* crypto funcs. */
> #define RS6000_BTM_HTM MASK_HTM /* hardware TM funcs. */
> #define RS6000_BTM_FRE MASK_POPCNTB /* FRE instruction. */
> @@ -2322,7 +2324,7 @@ extern int frame_pointer_needed;
> #define RS6000_BTM_FLOAT128_HW MASK_FLOAT128_HW /* IEEE 128-bit float h/w. */
> #define RS6000_BTM_MMA MASK_MMA /* ISA 3.1 MMA. */
> #define RS6000_BTM_P10 MASK_POWER10
> -
> +#define RS6000_BTM_TI_VECTOR_OPS MASK_TI_VECTOR_OPS /* 128-bit integer support */
>
> #define RS6000_BTM_COMMON (RS6000_BTM_ALTIVEC \
> | RS6000_BTM_VSX \
> @@ -2436,6 +2438,7 @@ enum rs6000_builtin_type_index
> RS6000_BTI_bool_V8HI, /* __vector __bool short */
> RS6000_BTI_bool_V4SI, /* __vector __bool int */
> RS6000_BTI_bool_V2DI, /* __vector __bool long */
> + RS6000_BTI_bool_V1TI, /* __vector __bool long */
Fix comment?  This looks copied from the V2DI entry; presumably it should
read "__vector __bool __int128".
> RS6000_BTI_pixel_V8HI, /* __vector __pixel */
> RS6000_BTI_long, /* long_integer_type_node */
> RS6000_BTI_unsigned_long, /* long_unsigned_type_node */
> @@ -2489,6 +2492,7 @@ enum rs6000_builtin_type_index
> #define bool_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V8HI])
> #define bool_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V4SI])
> #define bool_V2DI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V2DI])
> +#define bool_V1TI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V1TI])
> #define pixel_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_pixel_V8HI])
>
> #define long_long_integer_type_internal_node (rs6000_builtin_types[RS6000_BTI_long_long])
> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> index 9d3e740e930..67d667bf1fd 100644
> --- a/gcc/config/rs6000/rs6000.opt
> +++ b/gcc/config/rs6000/rs6000.opt
> @@ -585,3 +585,7 @@ Generate (do not generate) pc-relative memory addressing.
> mmma
> Target Report Mask(MMA) Var(rs6000_isa_flags)
> Generate (do not generate) MMA instructions.
> +
> +mti-vector-ops
> +Target Report Mask(TI_VECTOR_OPS) Var(rs6000_isa_flags)
> +Use integer 128-bit instructions for a future architecture.
"for a future architecture" can probably be adjusted, now that this is
ISA 3.1 / Power10.
> \ No newline at end of file
Is this a diff artifact, or does the patch really leave rs6000.opt without a
trailing newline?  If the latter, please add one.
> diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
> index 796345c80d3..2deff282076 100644
> --- a/gcc/config/rs6000/vector.md
> +++ b/gcc/config/rs6000/vector.md
> @@ -678,6 +678,13 @@
> "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> "")
>
> +(define_expand "vector_eqv1ti"
> + [(set (match_operand:V1TI 0 "vlogical_operand")
> + (eq:V1TI (match_operand:V1TI 1 "vlogical_operand")
> + (match_operand:V1TI 2 "vlogical_operand")))]
> + "TARGET_TI_VECTOR_OPS"
> + "")
> +
> (define_expand "vector_gt<mode>"
> [(set (match_operand:VEC_C 0 "vlogical_operand")
> (gt:VEC_C (match_operand:VEC_C 1 "vlogical_operand")
> @@ -685,6 +692,13 @@
> "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> "")
>
> +(define_expand "vector_gtv1ti"
> + [(set (match_operand:V1TI 0 "vlogical_operand")
> + (gt:V1TI (match_operand:V1TI 1 "vlogical_operand")
> + (match_operand:V1TI 2 "vlogical_operand")))]
> + "TARGET_TI_VECTOR_OPS"
> + "")
> +
> ; >= for integer vectors: swap operands and apply not-greater-than
> (define_expand "vector_nlt<mode>"
> [(set (match_operand:VEC_I 3 "vlogical_operand")
> @@ -697,6 +711,17 @@
> operands[3] = gen_reg_rtx_and_attrs (operands[0]);
> })
>
> +(define_expand "vector_nltv1ti"
> + [(set (match_operand:V1TI 3 "vlogical_operand")
> + (gt:V1TI (match_operand:V1TI 2 "vlogical_operand")
> + (match_operand:V1TI 1 "vlogical_operand")))
> + (set (match_operand:V1TI 0 "vlogical_operand")
> + (not:V1TI (match_dup 3)))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + operands[3] = gen_reg_rtx_and_attrs (operands[0]);
> +})
> +
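For what it's worth, the swap-and-negate trick reads correctly to me.  Here is
a rough Python model (my sketch, not the GCC code) of what vector_nltv1ti
computes: a >= b done as not (b > a), with the vector compare producing an
all-ones/all-zeros mask.

```python
MASK128 = (1 << 128) - 1  # width of a V1TI element

def vec_cmpge_v1ti(a, b):
    # Model of the vector_nltv1ti expansion: a >= b is computed as
    # not (b > a); vector compares yield an all-ones/all-zeros mask.
    gt_mask = MASK128 if b > a else 0
    return gt_mask ^ MASK128
```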
> (define_expand "vector_gtu<mode>"
> [(set (match_operand:VEC_I 0 "vint_operand")
> (gtu:VEC_I (match_operand:VEC_I 1 "vint_operand")
> @@ -704,6 +729,13 @@
> "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> "")
>
> +(define_expand "vector_gtuv1ti"
> + [(set (match_operand:V1TI 0 "altivec_register_operand")
> + (gtu:V1TI (match_operand:V1TI 1 "altivec_register_operand")
> + (match_operand:V1TI 2 "altivec_register_operand")))]
> + "TARGET_TI_VECTOR_OPS"
> + "")
> +
> ; >= for integer vectors: swap operands and apply not-greater-than
> (define_expand "vector_nltu<mode>"
> [(set (match_operand:VEC_I 3 "vlogical_operand")
> @@ -716,6 +748,17 @@
> operands[3] = gen_reg_rtx_and_attrs (operands[0]);
> })
>
> +(define_expand "vector_nltuv1ti"
> + [(set (match_operand:V1TI 3 "vlogical_operand")
> + (gtu:V1TI (match_operand:V1TI 2 "vlogical_operand")
> + (match_operand:V1TI 1 "vlogical_operand")))
> + (set (match_operand:V1TI 0 "vlogical_operand")
> + (not:V1TI (match_dup 3)))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + operands[3] = gen_reg_rtx_and_attrs (operands[0]);
> +})
> +
> (define_expand "vector_geu<mode>"
> [(set (match_operand:VEC_I 0 "vint_operand")
> (geu:VEC_I (match_operand:VEC_I 1 "vint_operand")
> @@ -735,6 +778,17 @@
> operands[3] = gen_reg_rtx_and_attrs (operands[0]);
> })
>
> +(define_expand "vector_ngtv1ti"
> + [(set (match_operand:V1TI 3 "vlogical_operand")
> + (gt:V1TI (match_operand:V1TI 1 "vlogical_operand")
> + (match_operand:V1TI 2 "vlogical_operand")))
> + (set (match_operand:V1TI 0 "vlogical_operand")
> + (not:V1TI (match_dup 3)))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + operands[3] = gen_reg_rtx_and_attrs (operands[0]);
> +})
> +
> (define_expand "vector_ngtu<mode>"
> [(set (match_operand:VEC_I 3 "vlogical_operand")
> (gtu:VEC_I (match_operand:VEC_I 1 "vlogical_operand")
> @@ -746,6 +800,17 @@
> operands[3] = gen_reg_rtx_and_attrs (operands[0]);
> })
>
> +(define_expand "vector_ngtuv1ti"
> + [(set (match_operand:V1TI 3 "vlogical_operand")
> + (gtu:V1TI (match_operand:V1TI 1 "vlogical_operand")
> + (match_operand:V1TI 2 "vlogical_operand")))
> + (set (match_operand:V1TI 0 "vlogical_operand")
> + (not:V1TI (match_dup 3)))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + operands[3] = gen_reg_rtx_and_attrs (operands[0]);
> +})
> +
> ; There are 14 possible vector FP comparison operators, gt and eq of them have
> ; been expanded above, so just support 12 remaining operators here.
>
> @@ -894,6 +959,18 @@
> "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> "")
>
> +(define_expand "vector_eq_v1ti_p"
> + [(parallel
> + [(set (reg:CC CR6_REGNO)
> + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand")
> + (match_operand:V1TI 2 "altivec_register_operand"))]
> + UNSPEC_PREDICATE))
> + (set (match_operand:V1TI 0 "vlogical_operand")
> + (eq:V1TI (match_dup 1)
> + (match_dup 2)))])]
> + "TARGET_TI_VECTOR_OPS"
> + "")
> +
> ;; This expansion handles the V16QI, V8HI, and V4SI modes in the
> ;; implementation of the vec_all_ne built-in functions on Power9.
> (define_expand "vector_ne_<mode>_p"
> @@ -976,6 +1053,23 @@
> operands[3] = gen_reg_rtx (V2DImode);
> })
>
> +(define_expand "vector_ne_v1ti_p"
> + [(parallel
> + [(set (reg:CC CR6_REGNO)
> + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand")
> + (match_operand:V1TI 2 "altivec_register_operand"))]
> + UNSPEC_PREDICATE))
> + (set (match_dup 3)
> + (eq:V1TI (match_dup 1)
> + (match_dup 2)))])
> + (set (match_operand:SI 0 "register_operand" "=r")
> + (eq:SI (reg:CC CR6_REGNO)
> + (const_int 0)))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + operands[3] = gen_reg_rtx (V1TImode);
> +})
> +
> ;; This expansion handles the V2DI mode in the implementation of the
> ;; vec_any_eq built-in function on Power9.
> ;;
> @@ -1002,6 +1096,27 @@
> operands[3] = gen_reg_rtx (V2DImode);
> })
>
> +;; Power 10
Could this be a more meaningful comment?  ";; Power 10" doesn't say what the
pattern does.
> +(define_expand "vector_ae_v1ti_p"
> + [(parallel
> + [(set (reg:CC CR6_REGNO)
> + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand")
> + (match_operand:V1TI 2 "altivec_register_operand"))]
> + UNSPEC_PREDICATE))
> + (set (match_dup 3)
> + (eq:V1TI (match_dup 1)
> + (match_dup 2)))])
> + (set (match_operand:SI 0 "register_operand" "=r")
> + (eq:SI (reg:CC CR6_REGNO)
> + (const_int 0)))
> + (set (match_dup 0)
> + (xor:SI (match_dup 0)
> + (const_int 1)))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + operands[3] = gen_reg_rtx (V1TImode);
> +})
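The CR6-plus-xor sequence took me a moment; here is how I read it, as a small
Python model (assumed semantics, not the GCC code): the compare records
"all elements false" in CR6, and the trailing xor flips that into the any-eq
predicate.

```python
def vec_any_eq_v1ti(a, b):
    # vcmpequq. records in CR6 whether the compare result is all-false;
    # the expander reads that bit and flips it with an xor, so the
    # predicate returns "not all elements differ".  For V1TI there is
    # only one element, so any-eq and all-eq coincide.
    all_false = int(a != b)
    return all_false ^ 1
```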
> +
> ;; This expansion handles the V4SF and V2DF modes in the Power9
> ;; implementation of the vec_all_ne built-in functions. Note that the
> ;; expansions for this pattern with these modes makes no use of power9-
> @@ -1061,6 +1176,18 @@
> "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> "")
>
> +(define_expand "vector_gt_v1ti_p"
> + [(parallel
> + [(set (reg:CC CR6_REGNO)
> + (unspec:CC [(gt:CC (match_operand:V1TI 1 "vlogical_operand")
> + (match_operand:V1TI 2 "vlogical_operand"))]
> + UNSPEC_PREDICATE))
> + (set (match_operand:V1TI 0 "vlogical_operand")
> + (gt:V1TI (match_dup 1)
> + (match_dup 2)))])]
> + "TARGET_TI_VECTOR_OPS"
> + "")
> +
> (define_expand "vector_ge_<mode>_p"
> [(parallel
> [(set (reg:CC CR6_REGNO)
> @@ -1085,6 +1212,18 @@
> "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> "")
>
> +(define_expand "vector_gtu_v1ti_p"
> + [(parallel
> + [(set (reg:CC CR6_REGNO)
> + (unspec:CC [(gtu:CC (match_operand:V1TI 1 "altivec_register_operand")
> + (match_operand:V1TI 2 "altivec_register_operand"))]
> + UNSPEC_PREDICATE))
> + (set (match_operand:V1TI 0 "altivec_register_operand")
> + (gtu:V1TI (match_dup 1)
> + (match_dup 2)))])]
> + "TARGET_TI_VECTOR_OPS"
> + "")
> +
> ;; AltiVec/VSX predicates.
>
> ;; This expansion is triggered during expansion of predicate built-in
> @@ -1460,6 +1599,20 @@
> "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> "")
>
> +(define_expand "vrotlv1ti3"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (rotate:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
> + rtx tmp = gen_reg_rtx (V1TImode);
> +
> + emit_insn(gen_xxswapd_v1ti (tmp, operands[2]));
> + emit_insn(gen_altivec_vrlq (operands[0], operands[1], tmp));
> + DONE;
> +})
> +
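To double-check the xxswapd here, I sketched the intended semantics in Python
(my reading of the comment, not the GCC code): vrlq takes its rotate count
from bits 57:63 of the second operand, i.e. the low 7 bits of the high
doubleword, so the expander swaps doublewords to move the user's rotate
amount into that position.

```python
MASK128 = (1 << 128) - 1

def xxswapd(v):
    # Swap the two 64-bit doublewords of a 128-bit value.
    return ((v & ((1 << 64) - 1)) << 64) | (v >> 64)

def vrlq(val, rot_operand):
    # Hardware takes the rotate count from bits 57:63 (big-endian bit
    # numbering) of the second operand, i.e. the low 7 bits of its
    # high doubleword.
    n = (rot_operand >> 64) & 0x7F
    return ((val << n) | (val >> (128 - n))) & MASK128

def vrotl_v1ti(val, amount):
    # The expander first swaps doublewords so the user's rotate amount,
    # held in the low-order bits, lands in bits 57:63.
    return vrlq(val, xxswapd(amount))
```

If that reading is right, the same reasoning applies to the vslq/vsrq/vsraq
expanders below.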
> ;; Expanders for rotatert to make use of vrotl
> (define_expand "vrotr<mode>3"
> [(set (match_operand:VEC_I 0 "vint_operand")
> @@ -1481,6 +1634,21 @@
> "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> "")
>
> +;; No immediate version of this 128-bit instruction
> +(define_expand "vashlv1ti3"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
> + rtx tmp = gen_reg_rtx (V1TImode);
> +
> + emit_insn(gen_xxswapd_v1ti (tmp, operands[2]));
> + emit_insn(gen_altivec_vslq (operands[0], operands[1], tmp));
> + DONE;
> +})
> +
> ;; Expanders for logical shift right on each vector element
> (define_expand "vlshr<mode>3"
> [(set (match_operand:VEC_I 0 "vint_operand")
> @@ -1489,6 +1657,21 @@
> "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> "")
>
> +;; No immediate version of this 128-bit instruction
> +(define_expand "vlshrv1ti3"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + /* Shift amount in needs to be put into bits[57:63] of 128-bit operand2. */
> + rtx tmp = gen_reg_rtx (V1TImode);
> +
> + emit_insn(gen_xxswapd_v1ti (tmp, operands[2]));
> + emit_insn(gen_altivec_vsrq (operands[0], operands[1], tmp));
> + DONE;
> +})
> +
> ;; Expanders for arithmetic shift right on each vector element
> (define_expand "vashr<mode>3"
> [(set (match_operand:VEC_I 0 "vint_operand")
> @@ -1496,6 +1679,22 @@
> (match_operand:VEC_I 2 "vint_operand")))]
> "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> "")
> +
> +;; No immediate version of this 128-bit instruction
> +(define_expand "vashrv1ti3"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (ashiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + /* Shift amount in needs to be put into bits[57:63] of 128-bit operand2. */
> + rtx tmp = gen_reg_rtx (V1TImode);
> +
> + emit_insn(gen_xxswapd_v1ti (tmp, operands[2]));
> + emit_insn(gen_altivec_vsraq (operands[0], operands[1], tmp));
> + DONE;
> +})
> +
>
> ;; Vector reduction expanders for VSX
> ; The (VEC_reduc:...
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index 1153a01b4ef..998af3908ad 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -298,6 +298,12 @@
> UNSPEC_VSX_XXSPLTD
> UNSPEC_VSX_DIVSD
> UNSPEC_VSX_DIVUD
> + UNSPEC_VSX_DIVSQ
> + UNSPEC_VSX_DIVUQ
> + UNSPEC_VSX_DIVESQ
> + UNSPEC_VSX_DIVEUQ
> + UNSPEC_VSX_MODSQ
> + UNSPEC_VSX_MODUQ
> UNSPEC_VSX_MULSD
> UNSPEC_VSX_SIGN_EXTEND
> UNSPEC_VSX_XVCVBF16SP
> @@ -361,6 +367,7 @@
> UNSPEC_INSERTR
> UNSPEC_REPLACE_ELT
> UNSPEC_REPLACE_UN
> + UNSPEC_XXSWAPD_V1TI
> ])
>
> (define_int_iterator XVCVBF16 [UNSPEC_VSX_XVCVSPBF16
> @@ -1732,7 +1739,61 @@
> }
> [(set_attr "type" "div")])
>
> -;; *tdiv* instruction returning the FG flag
> +(define_insn "vsx_div_v1ti"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")]
> + UNSPEC_VSX_DIVSQ))]
> + "TARGET_TI_VECTOR_OPS"
> + "vdivsq %0,%1,%2"
> + [(set_attr "type" "div")])
> +
> +(define_insn "vsx_udiv_v1ti"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")]
> + UNSPEC_VSX_DIVUQ))]
> + "TARGET_TI_VECTOR_OPS"
> + "vdivuq %0,%1,%2"
> + [(set_attr "type" "div")])
> +
> +(define_insn "vsx_dives_v1ti"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")]
> + UNSPEC_VSX_DIVESQ))]
> + "TARGET_TI_VECTOR_OPS"
> + "vdivesq %0,%1,%2"
> + [(set_attr "type" "div")])
> +
> +(define_insn "vsx_diveu_v1ti"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")]
> + UNSPEC_VSX_DIVEUQ))]
> + "TARGET_TI_VECTOR_OPS"
> + "vdiveuq %0,%1,%2"
> + [(set_attr "type" "div")])
> +
> +(define_insn "vsx_mods_v1ti"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")]
> + UNSPEC_VSX_MODSQ))]
> + "TARGET_TI_VECTOR_OPS"
> + "vmodsq %0,%1,%2"
> + [(set_attr "type" "div")])
> +
> +(define_insn "vsx_modu_v1ti"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v")
> + (match_operand:V1TI 2 "vsx_register_operand" "v")]
> + UNSPEC_VSX_MODUQ))]
> + "TARGET_TI_VECTOR_OPS"
> + "vmoduq %0,%1,%2"
> + [(set_attr "type" "div")])
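As an aside, the vdivsq/vmodsq pair should satisfy the usual truncating-division identity, which is also what C's `/` and `%` give on `__int128`. A scalar sketch (my illustration, not the builtins themselves):

```c
#include <assert.h>

/* Truncating-division identity the vdivsq/vmodsq pair is expected
   to satisfy: a == b * (a / b) + a % b, for b != 0 and no overflow.  */
int
div_mod_identity (__int128 a, __int128 b)
{
  return a == b * (a / b) + a % b;
}
```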
> +
> +;; *tdiv* instruction returning the FG flag
> (define_expand "vsx_tdiv<mode>3_fg"
> [(set (match_dup 3)
> (unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand")
> @@ -3083,6 +3144,18 @@
> "xxpermdi %x0,%x1,%x1,2"
> [(set_attr "type" "vecperm")])
>
> +;; Swap upper/lower 64-bit values in a 128-bit vector
> +(define_insn "xxswapd_v1ti"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v")
> + (parallel [(const_int 0)(const_int 1)])]
> + UNSPEC_XXSWAPD_V1TI))]
> + "TARGET_POWER10"
> +;; AIX does not support extended mnemonic xxswapd. Use the basic
> +;; mnemonic xxpermdi instead.
> + "xxpermdi %x0,%x1,%x1,2"
> + [(set_attr "type" "vecperm")])
> +
> (define_insn "xxgenpcvm_<mode>_internal"
> [(set (match_operand:VSX_EXTRACT_I4 0 "altivec_register_operand" "=wa")
> (unspec:VSX_EXTRACT_I4
> @@ -4767,8 +4840,16 @@
> (set_attr "type" "vecload")])
>
>
> -;; ISA 3.0 vector extend sign support
> +;; ISA 3.1 vector extend sign support
> +(define_insn "vsx_sign_extend_v2di_v1ti"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> + (unspec:V1TI [(match_operand:V2DI 1 "vsx_register_operand" "v")]
> + UNSPEC_VSX_SIGN_EXTEND))]
> + "TARGET_TI_VECTOR_OPS"
> + "vextsd2q %0,%1"
> + [(set_attr "type" "vecexts")])
>
> +;; ISA 3.0 vector extend sign support
> (define_insn "vsx_sign_extend_qi_<mode>"
> [(set (match_operand:VSINT_84 0 "vsx_register_operand" "=v")
> (unspec:VSINT_84
> @@ -5508,6 +5589,20 @@
> "vcmpnew %0,%1,%2"
> [(set_attr "type" "vecsimple")])
>
> +;; Vector Compare Not Equal v1ti (specified/not+eq:)
> +(define_expand "vcmpnet"
> + [(set (match_operand:V1TI 0 "altivec_register_operand")
> + (not:V1TI
> + (eq:V1TI (match_operand:V1TI 1 "altivec_register_operand")
> + (match_operand:V1TI 2 "altivec_register_operand"))))]
> + "TARGET_TI_VECTOR_OPS"
> +{
> + emit_insn (gen_vector_eqv1ti (operands[0], operands[1], operands[2]));
> + emit_insn (gen_one_cmplv1ti2 (operands[0], operands[0]));
> + DONE;
> +})
> +
> +
nit: extra line.
> ;; Vector Compare Not Equal or Zero Word
> (define_insn "vcmpnezw"
> [(set (match_operand:V4SI 0 "altivec_register_operand" "=v")
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index cb501ab2d75..346885de545 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -21270,6 +21270,180 @@ Generate PCV from specified Mask size, as if implemented by the
> immediate value is either 0, 1, 2 or 3.
> @findex vec_genpcvm
>
> +@smallexample
> +@exdent vector unsigned __int128 vec_rl (vector unsigned __int128,
> + vector unsigned __int128);
> +@exdent vector signed __int128 vec_rl (vector signed __int128,
> + vector unsigned __int128);
> +@end smallexample
> +
> +Returns the result of rotating the first input left by the number of bits
> +specified in the most significant quad word of the second input truncated to
> +7 bits (bits [125:131]).
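For anyone checking the documented truncation behavior, a scalar model of the rotate (my own sketch, not part of the patch) — the count is reduced modulo 128 before rotating:

```c
#include <assert.h>

/* Scalar model of a 128-bit rotate left by the low 7 bits of n.  */
unsigned __int128
rotl128 (unsigned __int128 a, unsigned int n)
{
  n &= 127;
  /* Guard n == 0 to avoid an undefined shift by 128.  */
  return n ? (a << n) | (a >> (128 - n)) : a;
}
```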
> +
> +@smallexample
> +@exdent vector unsigned __int128 vec_rlmi (vector unsigned __int128,
> + vector unsigned __int128,
> + vector unsigned __int128);
> +@exdent vector signed __int128 vec_rlmi (vector signed __int128,
> + vector signed __int128,
> + vector unsigned __int128);
> +@end smallexample
> +
> +Returns the result of rotating the first input and inserting it under mask
> +into the second input.  The first bit in the mask and the last bit in the
> +mask are obtained from the two 7-bit fields, bits [108:115] and bits
> +[117:123] respectively, of the second input.  The shift amount is obtained
> +from the third input, in the 7-bit field bits [125:131], where all bits are
> +counted from zero at the left.
I initially had a comment here, but after a re-read I think this is OK.
> +
> +@smallexample
> +@exdent vector unsigned __int128 vec_rlnm (vector unsigned __int128,
> + vector unsigned __int128,
> + vector unsigned __int128);
> +@exdent vector signed __int128 vec_rlnm (vector signed __int128,
> + vector unsigned __int128,
> + vector unsigned __int128);
> +@end smallexample
> +
> +Returns the result of rotating the first input and ANDing it with a mask. The first
> +bit in the mask, the last bit in the mask and the shift amount are obtained from the two
> +7-bit fields bits [117:123] and bits [125:131] respectively of the second input.
> +The shift is obtained from the third input in the 7-bit field bits [125:131] where all
> +bits counted from zero at the left.
The shift-amount reference in the second sentence reads clunky; it should be
adjusted with respect to the third sentence.
> +
> +@smallexample
> +@exdent vector unsigned __int128 vec_sl (vector unsigned __int128, vector unsigned __int128);
> +@exdent vector signed __int128 vec_sl (vector signed __int128, vector unsigned __int128);
> +@end smallexample
> +
> +Returns the result of shifting the first input left by the number of bits
> +specified in the most significant bits of the second input truncated to
> +7 bits (bits [125:131]).
> +
> +@smallexample
> +@exdent vector unsigned __int128 vec_sr (vector unsigned __int128, vector unsigned __int128);
> +@exdent vector signed __int128 vec_sr (vector signed __int128, vector unsigned __int128);
> +@end smallexample
> +
> +Returns the result of performing a logical right shift of the first argument
> +by the number of bits specified in the most significant double word of the
> +second input truncated to 7 bits (bits [125:131]).
> +
> +@smallexample
> +@exdent vector unsigned __int128 vec_sra (vector unsigned __int128, vector unsigned __int128);
> +@exdent vector signed __int128 vec_sra (vector signed __int128, vector unsigned __int128);
> +@end smallexample
> +
> +Returns the result of performing arithmetic right shift of the first argument
> +by the number of bits specified in the most significant bits of the
> +second input truncated to 7 bits (bits [125:131]).
> +
> +
> +@smallexample
> +@exdent vector unsigned __int128 vec_mule (vector unsigned long long,
> + vector unsigned long long);
> +@exdent vector signed __int128 vec_mule (vector signed long long,
> + vector signed long long);
> +@end smallexample
> +
> +Returns a vector containing a 128-bit integer result of multiplying the even doubleword
> +elements of the two inputs.
> +
> +@smallexample
> +@exdent vector unsigned __int128 vec_mulo (vector unsigned long long,
> + vector unsigned long long);
> +@exdent vector signed __int128 vec_mulo (vector signed long long,
> + vector signed long long);
> +@end smallexample
> +
> +Returns a vector containing a 128-bit integer result of multiplying the odd doubleword
> +elements of the two inputs.
> +
> +@smallexample
> +@exdent vector unsigned __int128 vec_div (vector unsigned __int128,
> + vector unsigned __int128);
> +@exdent vector signed __int128 vec_div (vector signed __int128,
> + vector signed __int128);
> +@end smallexample
> +
> +Returns the result of dividing the first operand by the second operand. An attempt to
> +divide any value by zero or to divide the most negative signed 128-bit integer by
> +negative one results in an undefined value.
> +
> +@smallexample
> +@exdent vector unsigned __int128 vec_dive (vector unsigned __int128,
> + vector unsigned __int128);
> +@exdent vector signed __int128 vec_dive (vector signed __int128,
> + vector signed __int128);
> +@end smallexample
> +
> +The result is produced by shifting the first input left by 128 bits and dividing by the
> +second. If an attempt is made to divide by zero or the result is larger than 128 bits,
> +the result is undefined.
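To make the vec_dive semantics concrete for readers, here is a 64-bit analogue in scalar C (my own sketch, not the builtin): the dividend is the first operand extended by a full word of zeros on the right.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit analogue of vdiveuq: divide (a << 64) by b.  The real
   instruction does this at twice the width; the result is undefined
   if b is zero or the quotient does not fit in the result width.  */
uint64_t
dive64_model (uint64_t a, uint64_t b)
{
  return (uint64_t) ((((unsigned __int128) a) << 64) / b);
}
```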
> +
> +@smallexample
> +@exdent vector unsigned __int128 vec_mod (vector unsigned __int128,
> + vector unsigned __int128);
> +@exdent vector signed __int128 vec_mod (vector signed __int128,
> + vector signed __int128);
> +@end smallexample
> +
> +Returns the remainder of dividing the first input by the second input.
> +
> +
> +The following builtins perform 128-bit vector comparisons.  The
> +@code{vec_all_xx}, @code{vec_any_xx}, and @code{vec_cmpxx} functions,
> +where @code{xx} is one of the operations @code{eq, ne, gt, lt, ge, le},
> +perform pairwise comparisons between the elements at the same positions
> +within their two vector arguments.  The @code{vec_all_xx} function
> +returns a non-zero value if and only if all pairwise comparisons are
> +true.  The @code{vec_any_xx} function returns a non-zero value if and
> +only if at least one pairwise comparison is true.  The @code{vec_cmpxx}
> +function returns a vector of the same type as its two arguments, within
> +which each element consists of all ones to denote that the specified
> +logical comparison of the corresponding elements was true.  Otherwise,
> +the element of the returned vector contains all zeros.
> +
> +@smallexample
> +vector bool __int128 vec_cmpeq (vector signed __int128, vector signed __int128);
> +vector bool __int128 vec_cmpeq (vector unsigned __int128, vector unsigned __int128);
> +vector bool __int128 vec_cmpne (vector signed __int128, vector signed __int128);
> +vector bool __int128 vec_cmpne (vector unsigned __int128, vector unsigned __int128);
> +vector bool __int128 vec_cmpgt (vector signed __int128, vector signed __int128);
> +vector bool __int128 vec_cmpgt (vector unsigned __int128, vector unsigned __int128);
> +vector bool __int128 vec_cmplt (vector signed __int128, vector signed __int128);
> +vector bool __int128 vec_cmplt (vector unsigned __int128, vector unsigned __int128);
> +vector bool __int128 vec_cmpge (vector signed __int128, vector signed __int128);
> +vector bool __int128 vec_cmpge (vector unsigned __int128, vector unsigned __int128);
> +vector bool __int128 vec_cmple (vector signed __int128, vector signed __int128);
> +vector bool __int128 vec_cmple (vector unsigned __int128, vector unsigned __int128);
> +
> +int vec_all_eq (vector signed __int128, vector signed __int128);
> +int vec_all_eq (vector unsigned __int128, vector unsigned __int128);
> +int vec_all_ne (vector signed __int128, vector signed __int128);
> +int vec_all_ne (vector unsigned __int128, vector unsigned __int128);
> +int vec_all_gt (vector signed __int128, vector signed __int128);
> +int vec_all_gt (vector unsigned __int128, vector unsigned __int128);
> +int vec_all_lt (vector signed __int128, vector signed __int128);
> +int vec_all_lt (vector unsigned __int128, vector unsigned __int128);
> +int vec_all_ge (vector signed __int128, vector signed __int128);
> +int vec_all_ge (vector unsigned __int128, vector unsigned __int128);
> +int vec_all_le (vector signed __int128, vector signed __int128);
> +int vec_all_le (vector unsigned __int128, vector unsigned __int128);
> +
> +int vec_any_eq (vector signed __int128, vector signed __int128);
> +int vec_any_eq (vector unsigned __int128, vector unsigned __int128);
> +int vec_any_ne (vector signed __int128, vector signed __int128);
> +int vec_any_ne (vector unsigned __int128, vector unsigned __int128);
> +int vec_any_gt (vector signed __int128, vector signed __int128);
> +int vec_any_gt (vector unsigned __int128, vector unsigned __int128);
> +int vec_any_lt (vector signed __int128, vector signed __int128);
> +int vec_any_lt (vector unsigned __int128, vector unsigned __int128);
> +int vec_any_ge (vector signed __int128, vector signed __int128);
> +int vec_any_ge (vector unsigned __int128, vector unsigned __int128);
> +int vec_any_le (vector signed __int128, vector signed __int128);
> +int vec_any_le (vector unsigned __int128, vector unsigned __int128);
> +@end smallexample
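One observation on the comparison set: since V1TI has a single element per vector, the "all" and "any" predicates necessarily coincide, and the vec_cmpxx result is a single all-ones/all-zeros element. A scalar sketch of that (my illustration, not the builtins):

```c
#include <assert.h>

/* Scalar model of vec_cmpgt on V1TI: one element, all ones when the
   comparison holds, all zeros otherwise.  */
unsigned __int128
cmpgt_mask (__int128 a, __int128 b)
{
  return a > b ? ~(unsigned __int128) 0 : 0;
}

/* With one element per vector, vec_all_gt and vec_any_gt agree.  */
int
all_gt_model (__int128 a, __int128 b) { return a > b; }
```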
> +
> +
> @node PowerPC Hardware Transactional Memory Built-in Functions
> @subsection PowerPC Hardware Transactional Memory Built-in Functions
> GCC provides two interfaces for accessing the Hardware Transactional
> diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
> new file mode 100644
> index 00000000000..c84494fc28d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
> @@ -0,0 +1,2254 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target power10_hw } */
> +/* { dg-options "-mdejagnu-cpu=power10" } */
> +
> +
> +/* Check that the expected 128-bit instructions are generated if the processor
> + supports the 128-bit integer instructions. */
> +/* { dg-final { scan-assembler-times {\mvextsd2q\M} 2 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvslq\M} 2 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvsrq\M} 2 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvsraq\M} 2 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvrlq\M} 2 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvrlqnm\M} 2 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvrlqmi\M} 2 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvcmpuq\M} 0 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvcmpsq\M} 0 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvcmpequq\M} 0 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvcmpequq.\M} 16 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 0 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvcmpgtsq.\M} 16 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 0 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvcmpgtuq.\M} 16 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvmuleud\M} 1 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvmuloud\M} 1 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvmulesd\M} 1 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvmulosd\M} 1 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvdivsq\M} 1 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvdivuq\M} 1 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvdivesq\M} 1 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvdiveuq\M} 1 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvmodsq\M} 1 { target { ppc_native_128bit } } } } */
> +/* { dg-final { scan-assembler-times {\mvmoduq\M} 1 { target { ppc_native_128bit } } } } */
Since it's on all of the clauses, maybe adjust the dg-require to include
ppc_native_128bit for the whole test, unless there is more to follow.
No other comments.
Thanks
-Will