[PATCH], Add PowerPC ISA 3.0 vector count trailing zeros and vector parity support

Michael Meissner meissner@linux.vnet.ibm.com
Wed May 25 01:27:00 GMT 2016


This patch adds support for two new sets of ISA 3.0 instructions: vector count
trailing zeros and vector parity.  In addition, it defines many of the support
macros that will be used by other built-in functions to be added shortly.
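
For readers who are not familiar with the operation, here is a rough scalar
sketch in portable C of what a word-sized vector count trailing zeros computes
in each lane.  It is not part of the patch.  The names ctz32_model and
vctzw_model are made up for illustration, and returning 32 for a zero input is
my reading of the ISA 3.0 description rather than something this patch tests.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane: the number of consecutive
   zero bits starting at the least significant bit.  A zero input is assumed
   to yield 32 (unlike __builtin_ctz, which is undefined for 0).  */
static unsigned
ctz32_model (uint32_t x)
{
  unsigned n = 0;

  if (x == 0)
    return 32;

  while ((x & 1u) == 0)
    {
      x >>= 1;
      n++;
    }

  return n;
}

/* Apply the lane model to each of the four words of a 128-bit vector,
   represented here as a plain array.  */
static void
vctzw_model (uint32_t out[4], const uint32_t in[4])
{
  for (size_t i = 0; i < 4; i++)
    out[i] = ctz32_model (in[i]);
}
```

The byte, halfword and doubleword forms would follow the same shape with 16,
8 and 2 lanes respectively.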

I have bootstrapped this and there were no regressions.  Is it ok to apply to
the trunk?  If so, is it also ok to backport it to the GCC 6.2 branch?

[gcc]
2016-05-24  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (VParity): New mode iterator for vector
	parity built-in functions.
	(p9v_ctz<mode>2): Add support for ISA 3.0 vector count trailing
	zeros.
	(p9v_parity<mode>2): Likewise.
	* config/rs6000/vector.md (VEC_IP): New mode iterator for vector
	parity.
	(ctz<mode>2): ISA 3.0 expander for vector count trailing zeros.
	(parity<mode>2): ISA 3.0 expander for vector parity.
	* config/rs6000/rs6000-builtin.def (BU_P9_MISC_1): New macros for
	power9 built-ins.
	(BU_P9_64BIT_MISC_0): Likewise.
	(BU_P9_MISC_0): Likewise.
	(BU_P9V_AV_1): Likewise.
	(BU_P9V_AV_2): Likewise.
	(BU_P9V_AV_3): Likewise.
	(BU_P9V_AV_P): Likewise.
	(BU_P9V_VSX_1): Likewise.
	(BU_P9V_OVERLOAD_1): Likewise.
	(BU_P9V_OVERLOAD_2): Likewise.
	(BU_P9V_OVERLOAD_3): Likewise.
	(VCTZB): Add vector count trailing zeros support.
	(VCTZH): Likewise.
	(VCTZW): Likewise.
	(VCTZD): Likewise.
	(VPRTYBD): Add vector parity support.
	(VPRTYBQ): Likewise.
	(VPRTYBW): Likewise.
	(VCTZ): Add overloaded vector count trailing zeros support.
	(VPRTYB): Add overloaded vector parity support.
	* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
	overloaded vector count trailing zeros and parity instructions.
	* config/rs6000/rs6000.md (wd mode attribute): Add V1TI and TI for
	vector parity support.
	* config/rs6000/altivec.h (vec_vctz): Add ISA 3.0 vector count
	trailing zeros support.
	(vec_cnttz): Likewise.
	(vec_vctzb): Likewise.
	(vec_vctzd): Likewise.
	(vec_vctzh): Likewise.
	(vec_vctzw): Likewise.
	(vec_vprtyb): Add ISA 3.0 vector parity support.
	(vec_vprtybd): Likewise.
	(vec_vprtybw): Likewise.
	(vec_vprtybq): Likewise.
	* doc/extend.texi (PowerPC AltiVec Built-in Functions): Document
	the ISA 3.0 vector count trailing zeros and vector parity built-in
	functions.

[gcc/testsuite]
2016-05-24  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p9-vparity.c: New file to check ISA 3.0
	vector parity built-in functions.
	* gcc.target/powerpc/ctz-3.c: New file to check ISA 3.0 vector
	count trailing zeros automatic vectorization.
	* gcc.target/powerpc/ctz-4.c: New file to check ISA 3.0 vector
	count trailing zeros built-in functions.



-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
-------------- next part --------------
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 236663)
+++ gcc/config/rs6000/altivec.md	(.../gcc/config/rs6000)	(working copy)
@@ -193,6 +193,13 @@ (define_mode_iterator VM2 [V4SI
 			   (KF "FLOAT128_VECTOR_P (KFmode)")
 			   (TF "FLOAT128_VECTOR_P (TFmode)")])
 
+;; Specific iterator for parity which does not have a byte/half-word form, but
+;; does have a quad word form
+(define_mode_iterator VParity [V4SI
+			       V2DI
+			       V1TI
+			       (TI "TARGET_VSX_TIMODE")])
+
 (define_mode_attr VI_char [(V2DI "d") (V4SI "w") (V8HI "h") (V16QI "b")])
 (define_mode_attr VI_scalar [(V2DI "DI") (V4SI "SI") (V8HI "HI") (V16QI "QI")])
 (define_mode_attr VI_unit [(V16QI "VECTOR_UNIT_ALTIVEC_P (V16QImode)")
@@ -3415,7 +3422,7 @@ (define_expand "vec_unpacku_float_lo_v8h
 }")
 
 
-;; Power8 vector instructions encoded as Altivec instructions
+;; Power8/power9 vector instructions encoded as Altivec instructions
 
 ;; Vector count leading zeros
 (define_insn "*p8v_clz<mode>2"
@@ -3426,6 +3433,15 @@ (define_insn "*p8v_clz<mode>2"
   [(set_attr "length" "4")
    (set_attr "type" "vecsimple")])
 
+;; Vector count trailing zeros
+(define_insn "*p9v_ctz<mode>2"
+  [(set (match_operand:VI2 0 "register_operand" "=v")
+	(ctz:VI2 (match_operand:VI2 1 "register_operand" "v")))]
+  "TARGET_P9_VECTOR"
+  "vctz<wd> %0,%1"
+  [(set_attr "length" "4")
+   (set_attr "type" "vecsimple")])
+
 ;; Vector population count
 (define_insn "*p8v_popcount<mode>2"
   [(set (match_operand:VI2 0 "register_operand" "=v")
@@ -3435,6 +3451,15 @@ (define_insn "*p8v_popcount<mode>2"
   [(set_attr "length" "4")
    (set_attr "type" "vecsimple")])
 
+;; Vector parity
+(define_insn "*p9v_parity<mode>2"
+  [(set (match_operand:VParity 0 "register_operand" "=v")
+        (parity:VParity (match_operand:VParity 1 "register_operand" "v")))]
+  "TARGET_P9_VECTOR"
+  "vprtyb<wd> %0,%1"
+  [(set_attr "length" "4")
+   (set_attr "type" "vecsimple")])
+
 ;; Vector Gather Bits by Bytes by Doubleword
 (define_insn "p8v_vgbbd"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
Index: gcc/config/rs6000/vector.md
===================================================================
--- gcc/config/rs6000/vector.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 236663)
+++ gcc/config/rs6000/vector.md	(.../gcc/config/rs6000)	(working copy)
@@ -26,6 +26,13 @@
 ;; Vector int modes
 (define_mode_iterator VEC_I [V16QI V8HI V4SI V2DI])
 
+;; Vector int modes for parity
+(define_mode_iterator VEC_IP [V8HI
+			      V4SI
+			      V2DI
+			      V1TI
+			      (TI "TARGET_VSX_TIMODE")])
+
 ;; Vector float modes
 (define_mode_iterator VEC_F [V4SF V2DF])
 
@@ -752,12 +759,24 @@ (define_expand "clz<mode>2"
 	(clz:VEC_I (match_operand:VEC_I 1 "register_operand" "")))]
   "TARGET_P8_VECTOR")
 
+;; Vector count trailing zeros
+(define_expand "ctz<mode>2"
+  [(set (match_operand:VEC_I 0 "register_operand" "")
+	(ctz:VEC_I (match_operand:VEC_I 1 "register_operand" "")))]
+  "TARGET_P9_VECTOR")
+
 ;; Vector population count
 (define_expand "popcount<mode>2"
   [(set (match_operand:VEC_I 0 "register_operand" "")
         (popcount:VEC_I (match_operand:VEC_I 1 "register_operand" "")))]
   "TARGET_P8_VECTOR")
 
+;; Vector parity
+(define_expand "parity<mode>2"
+  [(set (match_operand:VEC_IP 0 "register_operand" "")
+	(parity:VEC_IP (match_operand:VEC_IP 1 "register_operand" "")))]
+  "TARGET_P9_VECTOR")
+
 
 ;; Same size conversions
 (define_expand "float<VEC_int><mode>2"
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 236663)
+++ gcc/config/rs6000/rs6000-builtin.def	(.../gcc/config/rs6000)	(working copy)
@@ -687,8 +687,113 @@
 		     | RS6000_BTC_BINARY),				\
 		    CODE_FOR_ ## ICODE)			/* ICODE */
 
+
+/* Miscellaneous builtins for instructions added in ISA 3.0.  These
+   instructions don't require either the DFP or VSX options, just the basic 
+   ISA 3.0 enablement since they operate on general purpose registers.  */
+#define BU_P9_MISC_1(ENUM, NAME, ATTR, ICODE)				\
+  RS6000_BUILTIN_1 (MISC_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_" NAME,			/* NAME */	\
+		    RS6000_BTM_MODULO,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_UNARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+/* Miscellaneous builtins for instructions added in ISA 3.0.  These
+   instructions don't require either the DFP or VSX options, just the basic 
+   ISA 3.0 enablement since they operate on general purpose registers,
+   and they require 64-bit addressing.  */
+#define BU_P9_64BIT_MISC_0(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_0 (MISC_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_" NAME,			/* NAME */	\
+		    RS6000_BTM_MODULO                                   \
+                     | RS6000_BTM_64BIT,		/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_SPECIAL),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+/* Miscellaneous builtins for instructions added in ISA 3.0.  These
+   instructions don't require either the DFP or VSX options, just the basic 
+   ISA 3.0 enablement since they operate on general purpose registers.  */
+#define BU_P9_MISC_0(ENUM, NAME, ATTR, ICODE)                      \
+  RS6000_BUILTIN_0 (MISC_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_" NAME,			/* NAME */	\
+		    RS6000_BTM_MODULO,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_SPECIAL),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+/* ISA 3.0 (power9) vector convenience macros.  */
+/* For the instructions that are encoded as altivec instructions use
+   __builtin_altivec_ as the builtin name.  */
+#define BU_P9V_AV_1(ENUM, NAME, ATTR, ICODE)				\
+  RS6000_BUILTIN_1 (P9V_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_altivec_" NAME,		/* NAME */	\
+		    RS6000_BTM_P9_VECTOR,		/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_UNARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_P9V_AV_2(ENUM, NAME, ATTR, ICODE)				\
+  RS6000_BUILTIN_2 (P9V_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_altivec_" NAME,		/* NAME */	\
+		    RS6000_BTM_P9_VECTOR,		/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_BINARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_P9V_AV_3(ENUM, NAME, ATTR, ICODE)				\
+  RS6000_BUILTIN_3 (P9V_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_altivec_" NAME,		/* NAME */	\
+		    RS6000_BTM_P9_VECTOR,		/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_TERNARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_P9V_AV_P(ENUM, NAME, ATTR, ICODE)				\
+  RS6000_BUILTIN_P (P9V_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_altivec_" NAME,		/* NAME */	\
+		    RS6000_BTM_P9_VECTOR,		/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_PREDICATE),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+/* For the instructions encoded as VSX instructions use __builtin_vsx as the
+   builtin name.  */
+#define BU_P9V_VSX_1(ENUM, NAME, ATTR, ICODE)				\
+  RS6000_BUILTIN_1 (P9V_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vsx_" NAME,		/* NAME */	\
+		    RS6000_BTM_P9_VECTOR,		/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_UNARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_P9V_OVERLOAD_1(ENUM, NAME)					\
+  RS6000_BUILTIN_1 (P9V_BUILTIN_VEC_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vec_" NAME,		/* NAME */	\
+		    RS6000_BTM_P9_VECTOR,		/* MASK */	\
+		    (RS6000_BTC_OVERLOADED		/* ATTR */	\
+		     | RS6000_BTC_UNARY),				\
+		    CODE_FOR_nothing)			/* ICODE */
+
+#define BU_P9V_OVERLOAD_2(ENUM, NAME)					\
+  RS6000_BUILTIN_2 (P9V_BUILTIN_VEC_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vec_" NAME,		/* NAME */	\
+		    RS6000_BTM_P9_VECTOR,		/* MASK */	\
+		    (RS6000_BTC_OVERLOADED		/* ATTR */	\
+		     | RS6000_BTC_BINARY),				\
+		    CODE_FOR_nothing)			/* ICODE */
+
+#define BU_P9V_OVERLOAD_3(ENUM, NAME)					\
+  RS6000_BUILTIN_3 (P9V_BUILTIN_VEC_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vec_" NAME,		/* NAME */	\
+		    RS6000_BTM_P9_VECTOR,		/* MASK */	\
+		    (RS6000_BTC_OVERLOADED		/* ATTR */	\
+		     | RS6000_BTC_TERNARY),				\
+		    CODE_FOR_nothing)			/* ICODE */
 #endif
 
+
 /* Insure 0 is not a legitimate index.  */
 BU_SPECIAL_X (RS6000_BUILTIN_NONE, NULL, 0, RS6000_BTC_MISC)
 
@@ -1704,6 +1809,26 @@ BU_LDBL128_2 (UNPACK_TF,	"unpack_longdou
 BU_P7_MISC_2 (PACK_V1TI,	"pack_vector_int128",	CONST,	packv1ti)
 BU_P7_MISC_2 (UNPACK_V1TI,	"unpack_vector_int128",	CONST,	unpackv1ti)
 
+/* 1 argument vector functions added in ISA 3.0 (power9).  */
+BU_P9V_AV_1 (VCTZB,		"vctzb",		CONST,  ctzv16qi2)
+BU_P9V_AV_1 (VCTZH,		"vctzh",		CONST,  ctzv8hi2)
+BU_P9V_AV_1 (VCTZW,		"vctzw",		CONST,  ctzv4si2)
+BU_P9V_AV_1 (VCTZD,		"vctzd",		CONST,  ctzv2di2)
+BU_P9V_AV_1 (VPRTYBD,		"vprtybd",		CONST,  parityv2di2)
+BU_P9V_AV_1 (VPRTYBQ,		"vprtybq",		CONST,  parityv1ti2)
+BU_P9V_AV_1 (VPRTYBW,		"vprtybw",		CONST,  parityv4si2)
+
+/* ISA 3.0 vector overloaded 1 argument functions.  */
+BU_P9V_OVERLOAD_1 (VCTZ,	"vctz")
+BU_P9V_OVERLOAD_1 (VCTZB,	"vctzb")
+BU_P9V_OVERLOAD_1 (VCTZH,	"vctzh")
+BU_P9V_OVERLOAD_1 (VCTZW,	"vctzw")
+BU_P9V_OVERLOAD_1 (VCTZD,	"vctzd")
+BU_P9V_OVERLOAD_1 (VPRTYB,	"vprtyb")
+BU_P9V_OVERLOAD_1 (VPRTYBD,	"vprtybd")
+BU_P9V_OVERLOAD_1 (VPRTYBQ,	"vprtybq")
+BU_P9V_OVERLOAD_1 (VPRTYBW,	"vprtybw")
+
 
 /* 1 argument crypto functions.  */
 BU_CRYPTO_1 (VSBOX,		"vsbox",	  CONST, crypto_vsbox)
Index: gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc/config/rs6000/rs6000-c.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 236663)
+++ gcc/config/rs6000/rs6000-c.c	(.../gcc/config/rs6000)	(working copy)
@@ -4210,6 +4210,43 @@ const struct altivec_builtin_types altiv
   { P8V_BUILTIN_VEC_VCLZD, P8V_BUILTIN_VCLZD,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
 
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZB,
+    RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZB,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZH,
+    RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZH,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZW,
+    RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZW,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZD,
+    RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZD,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VCTZB, P9V_BUILTIN_VCTZB,
+    RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZB, P9V_BUILTIN_VCTZB,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VCTZH, P9V_BUILTIN_VCTZH,
+    RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZH, P9V_BUILTIN_VCTZH,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VCTZW, P9V_BUILTIN_VCTZW,
+    RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZW, P9V_BUILTIN_VCTZW,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VCTZD, P9V_BUILTIN_VCTZD,
+    RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZD, P9V_BUILTIN_VCTZD,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+
   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
     RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
@@ -4339,6 +4376,42 @@ const struct altivec_builtin_types altiv
   { P8V_BUILTIN_VEC_VPOPCNTD, P8V_BUILTIN_VPOPCNTD,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
 
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBW,
+    RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBW,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBD,
+    RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBD,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_INTTI, RS6000_BTI_INTTI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_UINTTI, RS6000_BTI_UINTTI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VPRTYBW, P9V_BUILTIN_VPRTYBW,
+    RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYBW, P9V_BUILTIN_VPRTYBW,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VPRTYBD, P9V_BUILTIN_VPRTYBD,
+    RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYBD, P9V_BUILTIN_VPRTYBD,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VPRTYBQ, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYBQ, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYBQ, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_INTTI, RS6000_BTI_INTTI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYBQ, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_UINTTI, RS6000_BTI_UINTTI, 0, 0 },
+
   { P8V_BUILTIN_VEC_VPKUDUM, P8V_BUILTIN_VPKUDUM,
     RS6000_BTI_V4SI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
   { P8V_BUILTIN_VEC_VPKUDUM, P8V_BUILTIN_VPKUDUM,
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 236663)
+++ gcc/config/rs6000/rs6000.md	(.../gcc/config/rs6000)	(working copy)
@@ -577,7 +577,9 @@ (define_mode_attr wd [(QI    "b")
 		      (V16QI "b")
 		      (V8HI  "h")
 		      (V4SI  "w")
-		      (V2DI  "d")])
+		      (V2DI  "d")
+		      (V1TI  "q")
+		      (TI    "q")])
 
 ;; How many bits in this mode?
 (define_mode_attr bits [(QI "8") (HI "16") (SI "32") (DI "64")])
Index: gcc/config/rs6000/altivec.h
===================================================================
--- gcc/config/rs6000/altivec.h	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 236663)
+++ gcc/config/rs6000/altivec.h	(.../gcc/config/rs6000)	(working copy)
@@ -384,6 +384,23 @@
 #define vec_vupklsw __builtin_vec_vupklsw
 #endif
 
+#ifdef _ARCH_PWR9
+/* Vector additions added in ISA 3.0.  */
+#define vec_vctz __builtin_vec_vctz
+#define vec_cnttz __builtin_vec_vctz
+#define vec_vctzb __builtin_vec_vctzb
+#define vec_vctzd __builtin_vec_vctzd
+#define vec_vctzh __builtin_vec_vctzh
+#define vec_vctzw __builtin_vec_vctzw
+#define vec_vprtyb __builtin_vec_vprtyb
+#define vec_vprtybd __builtin_vec_vprtybd
+#define vec_vprtybw __builtin_vec_vprtybw
+
+#ifdef _ARCH_PPC64
+#define vec_vprtybq __builtin_vec_vprtybq
+#endif
+#endif
+
 /* Predicates.
    For C++, we use templates in order to allow non-parenthesized arguments.
    For C, instead, we use macros since non-parenthesized arguments were
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/doc)	(revision 236663)
+++ gcc/doc/extend.texi	(.../gcc/doc)	(working copy)
@@ -17287,6 +17287,60 @@ int __builtin_bcdsub_gt (vector __int128
 int __builtin_bcdsub_ov (vector __int128_t, vector__int128_t);
 @end smallexample
 
+If the ISA 3.0 additions to the vector/scalar (power9-vector)
+instruction set are available:
+
+@smallexample
+vector long long vec_vctz (vector long long);
+vector unsigned long long vec_vctz (vector unsigned long long);
+vector int vec_vctz (vector int);
+vector unsigned int vec_vctz (vector unsigned int);
+vector short vec_vctz (vector short);
+vector unsigned short vec_vctz (vector unsigned short);
+vector signed char vec_vctz (vector signed char);
+vector unsigned char vec_vctz (vector unsigned char);
+
+vector signed char vec_vctzb (vector signed char);
+vector unsigned char vec_vctzb (vector unsigned char);
+
+vector long long vec_vctzd (vector long long);
+vector unsigned long long vec_vctzd (vector unsigned long long);
+
+vector short vec_vctzh (vector short);
+vector unsigned short vec_vctzh (vector unsigned short);
+
+vector int vec_vctzw (vector int);
+vector unsigned int vec_vctzw (vector unsigned int);
+
+vector int vec_vprtyb (vector int);
+vector unsigned int vec_vprtyb (vector unsigned int);
+vector long long vec_vprtyb (vector long long);
+vector unsigned long long vec_vprtyb (vector unsigned long long);
+
+vector int vec_vprtybw (vector int);
+vector unsigned int vec_vprtybw (vector unsigned int);
+
+vector long long vec_vprtybd (vector long long);
+vector unsigned long long vec_vprtybd (vector unsigned long long);
+@end smallexample
+
+
+If the ISA 3.0 additions to the vector/scalar (power9-vector)
+instruction set are available for 64-bit targets:
+
+@smallexample
+vector long vec_vprtyb (vector long);
+vector unsigned long vec_vprtyb (vector unsigned long);
+vector __int128_t vec_vprtyb (vector __int128_t);
+vector __uint128_t vec_vprtyb (vector __uint128_t);
+
+vector long vec_vprtybd (vector long);
+vector unsigned long vec_vprtybd (vector unsigned long);
+
+vector __int128_t vec_vprtybq (vector __int128_t);
+vector __uint128_t vec_vprtybq (vector __uint128_t);
+@end smallexample
+
 If the cryptographic instructions are enabled (@option{-mcrypto} or
 @option{-mcpu=power8}), the following builtins are enabled.
 
Index: gcc/testsuite/gcc.target/powerpc/p9-vparity.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-vparity.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-vparity.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 236664)
@@ -0,0 +1,107 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O2 -mlra -mvsx-timode" } */
+
+#include <altivec.h>
+
+vector int
+parity_v4si_1s (vector int a)
+{
+  return vec_vprtyb (a);
+}
+
+vector int
+parity_v4si_2s (vector int a)
+{
+  return vec_vprtybw (a);
+}
+
+vector unsigned int
+parity_v4si_1u (vector unsigned int a)
+{
+  return vec_vprtyb (a);
+}
+
+vector unsigned int
+parity_v4si_2u (vector unsigned int a)
+{
+  return vec_vprtybw (a);
+}
+
+vector long long
+parity_v2di_1s (vector long long a)
+{
+  return vec_vprtyb (a);
+}
+
+vector long long
+parity_v2di_2s (vector long long a)
+{
+  return vec_vprtybd (a);
+}
+
+vector unsigned long long
+parity_v2di_1u (vector unsigned long long a)
+{
+  return vec_vprtyb (a);
+}
+
+vector unsigned long long
+parity_v2di_2u (vector unsigned long long a)
+{
+  return vec_vprtybd (a);
+}
+
+vector __int128_t
+parity_v1ti_1s (vector __int128_t a)
+{
+  return vec_vprtyb (a);
+}
+
+vector __int128_t
+parity_v1ti_2s (vector __int128_t a)
+{
+  return vec_vprtybq (a);
+}
+
+__int128_t
+parity_ti_3s (__int128_t a)
+{
+  return vec_vprtyb (a);
+}
+
+__int128_t
+parity_ti_4s (__int128_t a)
+{
+  return vec_vprtybq (a);
+}
+
+vector __uint128_t
+parity_v1ti_1u (vector __uint128_t a)
+{
+  return vec_vprtyb (a);
+}
+
+vector __uint128_t
+parity_v1ti_2u (vector __uint128_t a)
+{
+  return vec_vprtybq (a);
+}
+
+__uint128_t
+parity_ti_3u (__uint128_t a)
+{
+  return vec_vprtyb (a);
+}
+
+__uint128_t
+parity_ti_4u (__uint128_t a)
+{
+  return vec_vprtybq (a);
+}
+
+/* { dg-final { scan-assembler "vprtybd" } } */
+/* { dg-final { scan-assembler "vprtybq" } } */
+/* { dg-final { scan-assembler "vprtybw" } } */
Index: gcc/testsuite/gcc.target/powerpc/ctz-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ctz-3.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ctz-3.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 236664)
@@ -0,0 +1,62 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O2 -ftree-vectorize -fvect-cost-model=dynamic -fno-unroll-loops -fno-unroll-all-loops" } */
+
+#ifndef SIZE
+#define SIZE 1024
+#endif
+
+#ifndef ALIGN
+#define ALIGN 32
+#endif
+
+#define ALIGN_ATTR __attribute__((__aligned__(ALIGN)))
+
+#define DO_BUILTIN(PREFIX, TYPE, CTZ)					\
+TYPE PREFIX ## _a[SIZE] ALIGN_ATTR;					\
+TYPE PREFIX ## _b[SIZE] ALIGN_ATTR;					\
+									\
+void									\
+PREFIX ## _ctz (void)							\
+{									\
+  unsigned long i;							\
+									\
+  for (i = 0; i < SIZE; i++)						\
+    PREFIX ## _a[i] = CTZ (PREFIX ## _b[i]);				\
+}
+
+#if !defined(DO_LONG_LONG) && !defined(DO_LONG) && !defined(DO_INT) && !defined(DO_SHORT) && !defined(DO_CHAR)
+#define DO_INT 1
+#endif
+
+#if DO_LONG_LONG
+/* At the moment, only int is auto vectorized.  */
+DO_BUILTIN (sll, long long,		__builtin_ctzll)
+DO_BUILTIN (ull, unsigned long long,	__builtin_ctzll)
+#endif
+
+#if defined(_ARCH_PPC64) && DO_LONG
+DO_BUILTIN (sl,  long,			__builtin_ctzl)
+DO_BUILTIN (ul,  unsigned long,		__builtin_ctzl)
+#endif
+
+#if DO_INT
+DO_BUILTIN (si,  int,			__builtin_ctz)
+DO_BUILTIN (ui,  unsigned int,		__builtin_ctz)
+#endif
+
+#if DO_SHORT
+DO_BUILTIN (ss,  short,			__builtin_ctz)
+DO_BUILTIN (us,  unsigned short,	__builtin_ctz)
+#endif
+
+#if DO_CHAR
+DO_BUILTIN (sc,  signed char,		__builtin_ctz)
+DO_BUILTIN (uc,  unsigned char,		__builtin_ctz)
+#endif
+
+/* { dg-final { scan-assembler-times "vctzw" 2 } } */
+/* { dg-final { scan-assembler-not "cnttzd" } } */
+/* { dg-final { scan-assembler-not "cnttzw" } } */
Index: gcc/testsuite/gcc.target/powerpc/ctz-4.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ctz-4.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ctz-4.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 236664)
@@ -0,0 +1,110 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+#include <altivec.h>
+
+vector signed char
+count_trailing_zeros_v16qi_1s (vector signed char a)
+{
+  return vec_vctz (a);
+}
+
+vector signed char
+count_trailing_zeros_v16qi_2s (vector signed char a)
+{
+  return vec_vctzb (a);
+}
+
+vector unsigned char
+count_trailing_zeros_v16qi_1u (vector unsigned char a)
+{
+  return vec_vctz (a);
+}
+
+vector unsigned char
+count_trailing_zeros_v16qi_2u (vector unsigned char a)
+{
+  return vec_vctzb (a);
+}
+
+vector short
+count_trailing_zeros_v8hi_1s (vector short a)
+{
+  return vec_vctz (a);
+}
+
+vector short
+count_trailing_zeros_v8hi_2s (vector short a)
+{
+  return vec_vctzh (a);
+}
+
+vector unsigned short
+count_trailing_zeros_v8hi_1u (vector unsigned short a)
+{
+  return vec_vctz (a);
+}
+
+vector unsigned short
+count_trailing_zeros_v8hi_2u (vector unsigned short a)
+{
+  return vec_vctzh (a);
+}
+
+vector int
+count_trailing_zeros_v4si_1s (vector int a)
+{
+  return vec_vctz (a);
+}
+
+vector int
+count_trailing_zeros_v4si_2s (vector int a)
+{
+  return vec_vctzw (a);
+}
+
+vector unsigned int
+count_trailing_zeros_v4si_1u (vector unsigned int a)
+{
+  return vec_vctz (a);
+}
+
+vector unsigned int
+count_trailing_zeros_v4si_2u (vector unsigned int a)
+{
+  return vec_vctzw (a);
+}
+
+vector long long
+count_trailing_zeros_v2di_1s (vector long long a)
+{
+  return vec_vctz (a);
+}
+
+vector long long
+count_trailing_zeros_v2di_2s (vector long long a)
+{
+  return vec_vctzd (a);
+}
+
+vector unsigned long long
+count_trailing_zeros_v2di_1u (vector unsigned long long a)
+{
+  return vec_vctz (a);
+}
+
+vector unsigned long long
+count_trailing_zeros_v2di_2u (vector unsigned long long a)
+{
+  return vec_vctzd (a);
+}
+
+/* { dg-final { scan-assembler "vctzb" } } */
+/* { dg-final { scan-assembler "vctzd" } } */
+/* { dg-final { scan-assembler "vctzh" } } */
+/* { dg-final { scan-assembler "vctzw" } } */
+/* { dg-final { scan-assembler-not "cnttzd" } } */
+/* { dg-final { scan-assembler-not "cnttzw" } } */

