This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Re: [PATCH v2, rs6000] gcc mainline, add builtin support for vec_float, vec_float2, vec_floate, vec_floate, builtins

From: "Carl E. Love" <cel at us dot ibm dot com>
To: Segher Boessenkool <segher at kernel dot crashing dot org>
Cc: gcc-patches at gcc dot gnu dot org, David Edelsohn <dje dot gcc at gmail dot com>, Bill Schmidt <wschmidt at linux dot vnet dot ibm dot com>
Date: Fri, 09 Jun 2017 16:12:25 -0700
Subject: Re: [PATCH v2, rs6000] gcc mainline, add builtin support for vec_float, vec_float2, vec_floate, vec_floate, builtins
Authentication-results: sourceware.org; auth=none
References: <1497032443.4252.59.camel@us.ibm.com> <20170609210505.GM19687@gate.crashing.org>
GCC Maintainers:

On Fri, 2017-06-09 at 16:05 -0500, Segher Boessenkool wrote:

Fixed the various formatting (spaces) issues.  Been toying with how to
write a space checker for patches.  Have to take some time to really
think about how to do that....

> > +
> > +  /* The vector merge instruction vmrgew swaps the 2nd and 3rd words,
> > +     compensate by swapping the 64-bit elements around to negate the vmrgew
> > +     swap. */
> 
> This comment isn't very clear to me...  Could you expand it a bit?

Reworked it, hopefully it explains things better

> > +;; Mode iterator and attribute for vector floate and floato conversions
> > +(define_mode_iterator VFC [V2DI V2DF])
> > +(define_mode_attr VFC_inst [(V2DI "sxd") (V2DF "dp")])
> 
> .._sxddp, like VS_sxswp in altivec.md?  The iterator is just VSX_D.
> 
> Maybe some or all of these iterators/attrs should live in vector.md?
> 
> Is it really useful to have separate files altivec.md and vsx.md anymore?
> Or should some things be moved?  This is a general question, not really
> something to be handled in this patch ;-)
> 

Yea, I find searching through all the files to rather hard.  Perhaps
putting all the definitions into a single "header" file?  That way it
could span across all of the .md files.  If you combined vsx.md and
altivec.md, you would have a really large file.  Big files can be
problematic in their own right.

> > +(define_insn "vsx_xvcvsxwsp"
> > +  [(set (match_operand:V4SF 0 "vsx_register_operand" "=v")
> > +	(unspec:V4SF[(match_operand:V4SI 1 "vsx_register_operand" "v")]
> > +		    UNSPEC_VSX_CVSXWSP))]
> > +  "VECTOR_UNIT_VSX_P (V4SFmode)"
> > +  "xvcvsxwsp %x0,%x1"
> > +  [(set_attr "type" "vecdouble")])
> 
> "v" is only the VRs...  Do you want "wa" or similar instead?
> 

I went back and re-studied the Power register constrains.  I find them a
bit confusing, I am sure they are perfectly clear to everyone else.  So
the instructions all take VSX registers so "wa" should be fine if I
understand it correctly.  Not sure there is any need to further
constrain with "vs" for doubles or "ww" but I think you could.

I retested the changes on powerpc64le-unknown-linux-gnu (Power 8 LE)
only. 

Please let me know if the updated patch is OK for gcc mainline?

                             Carl Love
-------------------------------------------------------------------
>From 3378d779286284183a4dc30a7a5dd10fa30671ff Mon Sep 17 00:00:00 2001
From: Carl Love <carll@us.ibm.com>
Date: Fri, 9 Jun 2017 17:58:23 -0500
Subject: [PATCH] Add vec_float, vec_float2, vec_floate, vec_floate, builtin
 support.

gcc/ChangeLog:

2017-06-09  Carl Love  <cel@us.ibm.com>

	* config/rs6000/rs6000-c.c: Add definitions for the vec_float,
	vec_float2, vec_floato, vec_floate built-ins.
	* config/rs6000/vsx.md: Add RTL code for instructions vsx_xvcvsxws
	vsx_xvcvuxwsp, float2, floato and floate.
	* config/rs6000/rs6000-builtin.def: Add definitions for vsx_xvcvsxwsp,
	vsx_xvcvuxwsp, float2, floato and floate.
	* config/altivec.md: Add version of p8_vmrgew that takes V4SF args and
	returns V4SF.
	* config/rs6000/altivec.h: Add builtin defines for vec_float,
	vec_float2, vec_floate and vec_floato.
	* doc/extend.texi: Update the built-in documentation file for the
	new built-in functions.

gcc/testsuite/ChangeLog:

2017-06-09  Carl Love  <cel@us.ibm.com>

	* gcc.target/powerpc/builtins-3-runnable.c: Add runnable tests for
	vec_float, vec_float2, vec_floate and vec_floato builtins
	built-ins.

Signed-off-by: Carl Love <carll@us.ibm.com>
---
 gcc/config/rs6000/altivec.h                        |   4 +
 gcc/config/rs6000/altivec.md                       |  14 +-
 gcc/config/rs6000/rs6000-builtin.def               |  19 ++-
 gcc/config/rs6000/rs6000-c.c                       |  28 +++-
 gcc/config/rs6000/rs6000-protos.h                  |   1 +
 gcc/config/rs6000/rs6000.c                         |  45 +++++-
 gcc/config/rs6000/vsx.md                           | 175 +++++++++++++++++++++
 gcc/doc/extend.texi                                |  14 ++
 .../gcc.target/powerpc/builtins-3-runnable.c       |  82 ++++++++++
 9 files changed, 370 insertions(+), 12 deletions(-)

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 20050eb..d542315 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -133,6 +133,10 @@
 #define vec_doublel __builtin_vec_doublel
 #define vec_doubleh __builtin_vec_doubleh
 #define vec_expte __builtin_vec_expte
+#define vec_float __builtin_vec_float
+#define vec_float2 __builtin_vec_float2
+#define vec_floate __builtin_vec_floate
+#define vec_floato __builtin_vec_floato
 #define vec_floor __builtin_vec_floor
 #define vec_loge __builtin_vec_loge
 #define vec_madd __builtin_vec_madd
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 487b9a4..25b2768 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1316,13 +1316,13 @@
 }
   [(set_attr "type" "vecperm")])
 
-;; Power8 vector merge even/odd
-(define_insn "p8_vmrgew"
-  [(set (match_operand:V4SI 0 "register_operand" "=v")
-	(vec_select:V4SI
-	  (vec_concat:V8SI
-	    (match_operand:V4SI 1 "register_operand" "v")
-	    (match_operand:V4SI 2 "register_operand" "v"))
+;; Power8 vector merge two V4SF/V4SI even words to V4SF
+(define_insn "p8_vmrgew_<mode>"
+  [(set (match_operand:VSX_W 0 "register_operand" "=v")
+	(vec_select:VSX_W
+	  (vec_concat:<VS_double>
+	    (match_operand:VSX_W 1 "register_operand" "v")
+	    (match_operand:VSX_W 2 "register_operand" "v"))
 	  (parallel [(const_int 0) (const_int 4)
 		     (const_int 2) (const_int 6)])))]
   "TARGET_P8_VECTOR"
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 241c439..4682628 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1591,6 +1591,8 @@ BU_VSX_2 (CMPLE_U16QI,        "cmple_u16qi",    CONST,  vector_ngtuv16qi)
 BU_VSX_2 (CMPLE_U8HI,         "cmple_u8hi",     CONST,  vector_ngtuv8hi)
 BU_VSX_2 (CMPLE_U4SI,         "cmple_u4si",     CONST,  vector_ngtuv4si)
 BU_VSX_2 (CMPLE_U2DI,         "cmple_u2di",     CONST,  vector_ngtuv2di)
+BU_VSX_2 (FLOAT2_V2DI,        "float2_v2di",    CONST,  float2_v2di)
+BU_VSX_2 (UNS_FLOAT2_V2DI,    "uns_float2_v2di",    CONST,  uns_float2_v2di)
 
 /* VSX abs builtin functions.  */
 BU_VSX_A (XVABSDP,	      "xvabsdp",	CONST,	absv2df2)
@@ -1648,6 +1650,16 @@ BU_VSX_1 (XVCVSPSXDS,	      "xvcvspsxds",	CONST,	vsx_xvcvspsxds)
 BU_VSX_1 (XVCVSPUXDS,	      "xvcvspuxds",	CONST,	vsx_xvcvspuxds)
 BU_VSX_1 (XVCVSXDSP,	      "xvcvsxdsp",	CONST,	vsx_xvcvsxdsp)
 BU_VSX_1 (XVCVUXDSP,	      "xvcvuxdsp",	CONST,	vsx_xvcvuxdsp)
+
+BU_VSX_1 (XVCVSXWSP_V4SF,  "vsx_xvcvsxwsp",   CONST,	vsx_xvcvsxwsp)
+BU_VSX_1 (XVCVUXWSP_V4SF,  "vsx_xvcvuxwsp",   CONST,	vsx_xvcvuxwsp)
+BU_VSX_1 (FLOATE_V2DI,     "floate_v2di",     CONST,	floatev2di)
+BU_VSX_1 (FLOATE_V2DF,     "floate_v2df",     CONST,	floatev2df)
+BU_VSX_1 (FLOATO_V2DI,     "floato_v2di",     CONST,	floatov2di)
+BU_VSX_1 (FLOATO_V2DF,     "floato_v2df",     CONST,	floatov2df)
+BU_VSX_1 (UNS_FLOATO_V2DI, "uns_floato_v2di", CONST,	unsfloatov2di)
+BU_VSX_1 (UNS_FLOATE_V2DI, "uns_floate_v2di", CONST,	unsfloatev2di)
+
 BU_VSX_1 (XVRSPI,	      "xvrspi",		CONST,	vsx_xvrspi)
 BU_VSX_1 (XVRSPIC,	      "xvrspic",	CONST,	vsx_xvrspic)
 BU_VSX_1 (XVRSPIM,	      "xvrspim",	CONST,	vsx_floorv4sf2)
@@ -1760,6 +1772,8 @@ BU_VSX_OVERLOAD_2 (XXMRGHW,  "xxmrghw")
 BU_VSX_OVERLOAD_2 (XXMRGLW,  "xxmrglw")
 BU_VSX_OVERLOAD_2 (XXSPLTD,  "xxspltd")
 BU_VSX_OVERLOAD_2 (XXSPLTW,  "xxspltw")
+BU_VSX_OVERLOAD_2 (FLOAT2,   "float2")
+BU_VSX_OVERLOAD_2 (UNS_FLOAT2,   "uns_float2")
 
 /* 1 argument VSX overloaded builtin functions.  */
 BU_VSX_OVERLOAD_1 (DOUBLE,   "double")
@@ -1771,6 +1785,9 @@ BU_VSX_OVERLOAD_1 (DOUBLEH,  "doubleh")
 BU_VSX_OVERLOAD_1 (UNS_DOUBLEH,  "uns_doubleh")
 BU_VSX_OVERLOAD_1 (DOUBLEL,  "doublel")
 BU_VSX_OVERLOAD_1 (UNS_DOUBLEL,  "uns_doublel")
+BU_VSX_OVERLOAD_1 (FLOAT,  "float")
+BU_VSX_OVERLOAD_1 (FLOATE,  "floate")
+BU_VSX_OVERLOAD_1 (FLOATO,  "floato")
 
 /* VSX builtins that are handled as special cases.  */
 BU_VSX_OVERLOAD_X (LD,	     "ld")
@@ -1812,7 +1829,7 @@ BU_P8V_AV_2 (VMINSD,		"vminsd",	CONST,	sminv2di3)
 BU_P8V_AV_2 (VMAXSD,		"vmaxsd",	CONST,	smaxv2di3)
 BU_P8V_AV_2 (VMINUD,		"vminud",	CONST,	uminv2di3)
 BU_P8V_AV_2 (VMAXUD,		"vmaxud",	CONST,	umaxv2di3)
-BU_P8V_AV_2 (VMRGEW,		"vmrgew",	CONST,	p8_vmrgew)
+BU_P8V_AV_2 (VMRGEW_V4SI,	"vmrgew_v4si",	CONST,	p8_vmrgew_v4si)
 BU_P8V_AV_2 (VMRGOW,		"vmrgow",	CONST,	p8_vmrgow)
 BU_P8V_AV_2 (VBPERMQ,		"vbpermq",	CONST,	altivec_vbpermq)
 BU_P8V_AV_2 (VBPERMQ2,		"vbpermq2",	CONST,	altivec_vbpermq2)
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index f1e8d3d..19f6d9c 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -1538,6 +1538,28 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { VSX_BUILTIN_VEC_DOUBLEL, VSX_BUILTIN_DOUBLEL_V4SF,
     RS6000_BTI_V2DF, RS6000_BTI_V4SF, 0, 0 },
 
+  { VSX_BUILTIN_VEC_FLOAT, VSX_BUILTIN_XVCVSXWSP_V4SF,
+    RS6000_BTI_V4SF, RS6000_BTI_V4SI, 0, 0 },
+  { VSX_BUILTIN_VEC_FLOAT, VSX_BUILTIN_XVCVUXWSP_V4SF,
+    RS6000_BTI_V4SF, RS6000_BTI_unsigned_V4SI, 0, 0 },
+  { VSX_BUILTIN_VEC_FLOAT2, VSX_BUILTIN_FLOAT2_V2DI,
+    RS6000_BTI_V4SF, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
+  { VSX_BUILTIN_VEC_FLOAT2, VSX_BUILTIN_UNS_FLOAT2_V2DI,
+    RS6000_BTI_V4SF, RS6000_BTI_unsigned_V2DI,
+    RS6000_BTI_unsigned_V2DI, 0 },
+  { VSX_BUILTIN_VEC_FLOATE, VSX_BUILTIN_FLOATE_V2DF,
+    RS6000_BTI_V4SF, RS6000_BTI_V2DF, 0, 0 },
+  { VSX_BUILTIN_VEC_FLOATE, VSX_BUILTIN_FLOATE_V2DI,
+    RS6000_BTI_V4SF, RS6000_BTI_V2DI, 0, 0 },
+  { VSX_BUILTIN_VEC_FLOATE, VSX_BUILTIN_UNS_FLOATE_V2DI,
+    RS6000_BTI_V4SF, RS6000_BTI_unsigned_V2DI, 0, 0 },
+  { VSX_BUILTIN_VEC_FLOATO, VSX_BUILTIN_FLOATO_V2DF,
+    RS6000_BTI_V4SF, RS6000_BTI_V2DF, 0, 0 },
+  { VSX_BUILTIN_VEC_FLOATO, VSX_BUILTIN_FLOATO_V2DI,
+    RS6000_BTI_V4SF, RS6000_BTI_V2DI, 0, 0 },
+  { VSX_BUILTIN_VEC_FLOATO, VSX_BUILTIN_UNS_FLOATO_V2DI,
+    RS6000_BTI_V4SF, RS6000_BTI_unsigned_V2DI, 0, 0 },
+
   { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V2DF,
     RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF, 0 },
   { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V2DI,
@@ -5262,12 +5284,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
     RS6000_BTI_unsigned_V2DI, 0 },
 
-  { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW,
+  { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW_V4SI,
     RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
-  { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW,
+  { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW_V4SI,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI,
     RS6000_BTI_unsigned_V4SI, 0 },
-  { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW,
+  { P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VMRGEW_V4SI,
     RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 },
 
   { P8V_BUILTIN_VEC_VMRGOW, P8V_BUILTIN_VMRGOW,
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 8a231f5..8165d04 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -72,6 +72,7 @@ extern void altivec_expand_stvex_be (rtx, rtx, machine_mode, unsigned);
 extern void rs6000_expand_extract_even (rtx, rtx, rtx);
 extern void rs6000_expand_interleave (rtx, rtx, rtx, bool);
 extern void rs6000_scale_v2df (rtx, rtx, int);
+extern void rs6000_generate_float2_code (bool, rtx, rtx, rtx);
 extern int expand_block_clear (rtx[]);
 extern int expand_block_move (rtx[]);
 extern bool expand_block_compare (rtx[]);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 941c0c2..3e7ff03 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -36798,7 +36798,7 @@ altivec_expand_vec_perm_const (rtx operands[4])
       (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw_direct
        : CODE_FOR_altivec_vmrghw_direct),
       {  8,  9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
-    { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew,
+    { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew_v4si,
       {  0,  1,  2,  3, 16, 17, 18, 19,  8,  9, 10, 11, 24, 25, 26, 27 } },
     { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgow,
       {  4,  5,  6,  7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31 } }
@@ -42389,6 +42389,49 @@ rs6000_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
   *update = build2 (COMPOUND_EXPR, void_type_node, update_mffs, update_mtfsf);
 }
 
+void
+rs6000_generate_float2_code (bool signed_convert, rtx dst, rtx src1, rtx src2)
+{
+  rtx rtx_tmp0, rtx_tmp1, rtx_tmp2, rtx_tmp3;
+
+  rtx_tmp0 = gen_reg_rtx (V2DImode);
+  rtx_tmp1 = gen_reg_rtx (V2DImode);
+
+  /* The destination of the vmrgew instruction layout is:
+     rtx_tmp2[0] rtx_tmp3[0] rtx_tmp2[1] rtx_tmp3[0].
+     Setup rtx_tmp0 and rtx_tmp1 to ensure the order of the elements after the
+     vmrgew instruction will be correct.  */
+  if (VECTOR_ELT_ORDER_BIG)
+    {
+      emit_insn (gen_vsx_xxpermdi_v2di_be (rtx_tmp0, src1, src2, GEN_INT(0)));
+      emit_insn (gen_vsx_xxpermdi_v2di_be (rtx_tmp1, src1, src2, GEN_INT(3)));
+    }
+  else
+    {
+      emit_insn (gen_vsx_xxpermdi_v2di (rtx_tmp0, src1, src2, GEN_INT(3)));
+      emit_insn (gen_vsx_xxpermdi_v2di (rtx_tmp1, src1, src2, GEN_INT(0)));
+    }
+
+  rtx_tmp2 = gen_reg_rtx (V4SFmode);
+  rtx_tmp3 = gen_reg_rtx (V4SFmode);
+
+  if (signed_convert)
+    {
+      emit_insn (gen_vsx_xvcvsxdsp (rtx_tmp2, rtx_tmp0));
+      emit_insn (gen_vsx_xvcvsxdsp (rtx_tmp3, rtx_tmp1));
+    }
+  else
+    {
+       emit_insn (gen_vsx_xvcvuxdsp (rtx_tmp2, rtx_tmp0));
+       emit_insn (gen_vsx_xvcvuxdsp (rtx_tmp3, rtx_tmp1));
+    }
+
+  if (VECTOR_ELT_ORDER_BIG)
+    emit_insn (gen_p8_vmrgew_v4sf (dst, rtx_tmp2, rtx_tmp3));
+  else
+    emit_insn (gen_p8_vmrgew_v4sf (dst, rtx_tmp3, rtx_tmp2));
+}
+
 /* Implement the TARGET_OPTAB_SUPPORTED_P hook.  */
 
 static bool
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 141aa42..dd88305 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -310,6 +310,9 @@
 ;; Iterator for the 2 short vector types to do a splat from an integer
 (define_mode_iterator VSX_SPLAT_I [V16QI V8HI])
 
+;; Mode attribute for vector floate and floato conversions
+(define_mode_attr VFC_inst [(V2DI "sxd") (V2DF "dp")])
+
 ;; Mode attribute to give the count for the splat instruction to splat
 ;; the value in the 64-bit integer slot
 (define_mode_attr VSX_SPLAT_COUNT [(V16QI "7") (V8HI "3")])
@@ -331,6 +334,14 @@
    UNSPEC_VSX_CVUXDSP
    UNSPEC_VSX_CVSPSXDS
    UNSPEC_VSX_CVSPUXDS
+   UNSPEC_VSX_CVSXWSP
+   UNSPEC_VSX_CVUXWSP
+   UNSPEC_VSX_FLOAT2
+   UNSPEC_VSX_UNS_FLOAT2
+   UNSPEC_VSX_FLOATE
+   UNSPEC_VSX_UNS_FLOATE
+   UNSPEC_VSX_FLOATO
+   UNSPEC_VSX_UNS_FLOATO
    UNSPEC_VSX_TDIV
    UNSPEC_VSX_TSQRT
    UNSPEC_VSX_SET
@@ -1976,6 +1987,170 @@
   "xvcvspuxds %x0,%x1"
   [(set_attr "type" "vecdouble")])
 
+(define_insn "vsx_xvcvsxwsp"
+  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa")
+	(unspec:V4SF[(match_operand:V4SI 1 "vsx_register_operand" "wa")]
+		    UNSPEC_VSX_CVSXWSP))]
+  "VECTOR_UNIT_VSX_P (V4SFmode)"
+  "xvcvsxwsp %x0,%x1"
+  [(set_attr "type" "vecdouble")])
+
+(define_insn "vsx_xvcvuxwsp"
+  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa")
+	(unspec:V4SF[(match_operand:V4SI 1 "vsx_register_operand" "wa")]
+		    UNSPEC_VSX_CVUXWSP))]
+  "VECTOR_UNIT_VSX_P (V4SFmode)"
+  "xvcvuxwsp %x0,%x1"
+  [(set_attr "type" "vecdouble")])
+
+;; Generate float2
+;; convert two long long signed ints to float
+(define_expand "float2_v2di"
+  [(match_operand:V4SF 0 "register_operand" "=wa")
+   (unspec:V4SI [(match_operand:V2DI 1 "register_operand" "wa")
+		 (match_operand:V2DI 2 "register_operand" "wa")]
+  UNSPEC_VSX_FLOAT2)]
+
+  "TARGET_VSX"
+{
+  rtx rtx_src1, rtx_src2, rtx_dst;
+
+  rtx_dst = operands[0];
+  rtx_src1 = operands[1];
+  rtx_src2 = operands[2];
+
+  rs6000_generate_float2_code (true, rtx_dst, rtx_src1, rtx_src2);
+  DONE;
+})
+
+;; Generate uns_float2
+;; convert two long long unsigned ints to float
+(define_expand "uns_float2_v2di"
+  [(match_operand:V4SF 0 "register_operand" "=wa")
+   (unspec:V4SI [(match_operand:V2DI 1 "register_operand" "wa")
+                 (match_operand:V2DI 2 "register_operand" "wa")]
+  UNSPEC_VSX_UNS_FLOAT2)]
+
+  "TARGET_VSX"
+{
+  rtx rtx_src1, rtx_src2, rtx_dst;
+
+  rtx_dst = operands[0];
+  rtx_src1 = operands[1];
+  rtx_src2 = operands[2];
+
+  rs6000_generate_float2_code (true, rtx_dst, rtx_src1, rtx_src2);
+  DONE;
+})
+
+;; Generate floate
+;; convert  double or long long signed to float
+;;(Only even words are valid, BE numbering)
+(define_expand "floate<mode>"
+  [(match_operand:V4SF 0 "register_operand" "=wa")
+   (unspec:V4SF [(match_operand:VSX_D 1 "register_operand" "wa")]
+   UNSPEC_VSX_FLOATE)]
+   "TARGET_VSX"
+{
+  if (VECTOR_ELT_ORDER_BIG)
+    {
+      /* Shift left one word to put even word correct location */
+      rtx rtx_tmp;
+      rtx rtx_val = GEN_INT (4);
+
+      rtx_tmp = gen_reg_rtx (V4SFmode);
+      emit_insn (gen_vsx_xvcv<VFC_inst>sp (rtx_tmp, operands[1]));
+      emit_insn (gen_altivec_vsldoi_v4sf (operands[0],
+		 rtx_tmp, rtx_tmp, rtx_val));
+    }
+  else
+    emit_insn (gen_vsx_xvcv<VFC_inst>sp (operands[0], operands[1]));
+
+  DONE;
+})
+
+;; Generate uns_floate
+;; convert long long unsigned to float
+;; (Only even words are valid, BE numbering)
+(define_expand "unsfloatev2di"
+  [(match_operand:V4SF 0 "register_operand" "=wa")
+   (unspec:V4SF [(match_operand:V2DI 1 "register_operand" "wa")]
+   UNSPEC_VSX_UNS_FLOATE)]
+   "TARGET_VSX"
+{
+  if (VECTOR_ELT_ORDER_BIG)
+    {
+      /* Shift left one word to put even word correct location */
+      rtx rtx_tmp;
+      rtx rtx_val = GEN_INT (4);
+
+      rtx_tmp = gen_reg_rtx (V4SFmode);
+      emit_insn (gen_vsx_xvcvuxdsp (rtx_tmp, operands[1]));
+      emit_insn (gen_altivec_vsldoi_v4sf (operands[0],
+                 rtx_tmp, rtx_tmp, rtx_val));
+    }
+  else
+    {
+      emit_insn (gen_vsx_xvcvuxdsp (operands[0], operands[1]));
+    }
+  DONE;
+})
+
+;; Generate floato
+;; convert double or long long signed to float
+;; Only odd words are valid, BE numbering)
+(define_expand "floato<mode>"
+  [(match_operand:V4SF 0 "register_operand" "=wa")
+   (unspec:V4SF [(match_operand:VSX_D 1 "register_operand" "wa")]
+   UNSPEC_VSX_UNS_FLOATO)]
+  "TARGET_VSX"
+{
+  if (VECTOR_ELT_ORDER_BIG)
+    {
+      emit_insn (gen_vsx_xvcv<VFC_inst>sp (operands[0], operands[1]));
+    }
+  else
+    {
+      /* Shift left one word to put odd word correct location */
+      rtx rtx_tmp;
+      rtx rtx_val = GEN_INT (4);
+
+      rtx_tmp = gen_reg_rtx (V4SFmode);
+      emit_insn (gen_vsx_xvcv<VFC_inst>sp (rtx_tmp, operands[1]));
+      emit_insn (gen_altivec_vsldoi_v4sf (operands[0],
+                 rtx_tmp, rtx_tmp, rtx_val));
+    }
+  DONE;
+})
+
+;; Generate uns_floato
+;; convert long long unsigned to float
+;; (Only odd words are valid, BE numbering)
+(define_expand "unsfloatov2di"
+  [(match_operand:V4SF 0 "register_operand" "=wa")
+   (unspec:V4SF[(match_operand:V2DI 1 "register_operand" "wa")]
+   UNSPEC_VSX_UNS_FLOATO)]
+
+  "TARGET_VSX"
+{
+  if (VECTOR_ELT_ORDER_BIG)
+    {
+      emit_insn (gen_vsx_xvcvuxdsp (operands[0], operands[1]));
+    }
+  else
+    {
+      /* Shift left one word to put odd word correct location */
+      rtx rtx_tmp;
+      rtx rtx_val = GEN_INT (4);
+
+      rtx_tmp = gen_reg_rtx (V4SFmode);
+      emit_insn (gen_vsx_xvcvuxdsp (rtx_tmp, operands[1]));
+      emit_insn (gen_altivec_vsldoi_v4sf (operands[0],
+                 rtx_tmp, rtx_tmp, rtx_val));
+    }
+  DONE;
+})
+
 ;; Only optimize (float (fix x)) -> frz if we are in fast-math mode, since
 ;; since the xvrdpiz instruction does not truncate the value if the floating
 ;; point value is < LONG_MIN or > LONG_MAX.
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7d39335..a662aeb 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -16039,6 +16039,20 @@ vector float vec_expte (vector float);
 
 vector float vec_floor (vector float);
 
+vector float vec_float (vector signed int);
+vector float vec_float (vector unsigned int);
+
+vector float vec_float2 (vector signed long long, vector signed long long);
+vector float vec_float2 (vector unsigned long long, vector signed long long);
+
+vector float vec_floate (vector double);
+vector float vec_floate (vector signed long long);
+vector float vec_floate (vector unsigned long long);
+
+vector float vec_floato (vector double);
+vector float vec_floato (vector signed long long);
+vector float vec_floato (vector unsigned long long);
+
 vector float vec_ld (int, const vector float *);
 vector float vec_ld (int, const float *);
 vector bool int vec_ld (int, const vector bool int *);
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
index 60ec617..8e09a92 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
@@ -5,8 +5,37 @@
 
 #include <altivec.h> // vector
 
+#define ALL  1
+#define EVEN 2
+#define ODD  3
+
 void abort (void);
 
+void test_result_sp(int check, vector float vec_result, vector float vec_expected)
+{
+   int i;
+   for(i = 0; i<4; i++) {
+
+      switch (check) {
+      case ALL:
+         break;
+      case EVEN:
+         if (i%2 == 0)
+            break;
+         else
+            continue;
+      case ODD:
+         if (i%2 != 0)
+            break;
+         else
+            continue;
+      }
+
+      if (vec_result[i] != vec_expected[i])
+         abort();
+   }
+}
+
 void test_result_dp(vector double vec_result, vector double vec_expected)
 {
 	if (vec_result[0] != vec_expected[0])
@@ -21,11 +50,17 @@ int main()
 	int i;
 	vector unsigned int vec_unint;
 	vector signed int vec_int;
+	vector long long int vec_ll_int0, vec_ll_int1;
+	vector long long unsigned int vec_ll_uns_int0, vec_ll_uns_int1;
 	vector float  vec_flt, vec_flt_result, vec_flt_expected;
 	vector double vec_dble0, vec_dble1, vec_dble_result, vec_dble_expected;
 
 	vec_int = (vector signed int){ -1, 3, -5, 1234567 };
+	vec_ll_int0 = (vector long long int){ -12, -12345678901234 };
+	vec_ll_int1 = (vector long long int){ 12, 9876543210 };
 	vec_unint = (vector unsigned int){ 9, 11, 15, 2468013579 };
+	vec_ll_uns_int0 = (vector unsigned long long int){ 102, 9753108642 };
+	vec_ll_uns_int1 = (vector unsigned long long int){ 23, 29 };
 	vec_flt = (vector float){ -21., 3.5, -53., 78. };
 	vec_dble0 = (vector double){ 34.0, 97.0 };
 	vec_dble1 = (vector double){ 214.0, -5.5 };
@@ -81,4 +116,51 @@ int main()
 	vec_dble_result = vec_doubleh (vec_unint);
 	test_result_dp(vec_dble_result, vec_dble_expected);
 
+	vec_dble_expected = (vector double){-21.000000, 3.500000};
+	vec_dble_result = vec_doubleh (vec_flt);
+	test_result_dp(vec_dble_result, vec_dble_expected);
+
+	/* conversion of integer vector to single precision float vector */
+	vec_flt_expected = (vector float){-1.00, 3.00, -5.00, 1234567.00};
+	vec_flt_result = vec_float (vec_int);
+	test_result_sp(ALL, vec_flt_result, vec_flt_expected);
+
+	vec_flt_expected = (vector float){9.00, 11.00, 15.00, 2468013579.0};
+	vec_flt_result = vec_float (vec_unint);
+	test_result_sp(ALL, vec_flt_result, vec_flt_expected);
+   
+	/* conversion of two double precision vectors to single precision vector */
+	vec_flt_expected = (vector float){-12.00, -12345678901234.00, 12.00, 9876543210.00};
+	vec_flt_result = vec_float2 (vec_ll_int0, vec_ll_int1);
+	test_result_sp(ALL, vec_flt_result, vec_flt_expected);
+
+	vec_flt_expected = (vector float){102.00, 9753108642.00, 23.00, 29.00};
+	vec_flt_result = vec_float2 (vec_ll_uns_int0, vec_ll_uns_int1);
+	test_result_sp(ALL, vec_flt_result, vec_flt_expected);
+
+	/* conversion of even words in double precision vector to single precision vector */
+	vec_flt_expected = (vector float){-12.00, 00.00, -12345678901234.00, 0.00};
+	vec_flt_result = vec_floate (vec_ll_int0);
+	test_result_sp(EVEN, vec_flt_result, vec_flt_expected);
+
+	vec_flt_expected = (vector float){102.00, 0.00, 9753108642.00, 0.00};
+	vec_flt_result = vec_floate (vec_ll_uns_int0);
+	test_result_sp(EVEN, vec_flt_result, vec_flt_expected);
+
+	vec_flt_expected = (vector float){34.00, 0.00, 97.00, 0.00};
+	vec_flt_result = vec_floate (vec_dble0);
+	test_result_sp(EVEN, vec_flt_result, vec_flt_expected);
+
+	/* conversion of odd words in double precision vector to single precision vector */
+	vec_flt_expected = (vector float){0.00, -12.00, 00.00, -12345678901234.00};
+	vec_flt_result = vec_floato (vec_ll_int0);
+	test_result_sp(ODD, vec_flt_result, vec_flt_expected);
+
+	vec_flt_expected = (vector float){0.00, 102.00, 0.00, 9753108642.00};
+	vec_flt_result = vec_floato (vec_ll_uns_int0);
+	test_result_sp(ODD, vec_flt_result, vec_flt_expected);
+
+	vec_flt_expected = (vector float){0.00, 34.00, 0.00, 97.00};
+	vec_flt_result = vec_floato (vec_dble0);
+	test_result_sp(ODD, vec_flt_result, vec_flt_expected);
 }
-- 
1.9.1
Follow-Ups:
- Re: [PATCH v2, rs6000] gcc mainline, add builtin support for vec_float, vec_float2, vec_floate, vec_floate, builtins
  - From: Segher Boessenkool
- Re: [PATCH v2, rs6000] gcc mainline, add builtin support for vec_float, vec_float2, vec_floate, vec_floate, builtins
  - From: Michael Meissner
References:
- [PATCH, rs6000] gcc mainline, add builtin support for vec_float, vec_float2, vec_floate, vec_floate, builtins
  - From: Carl E. Love
- Re: [PATCH, rs6000] gcc mainline, add builtin support for vec_float, vec_float2, vec_floate, vec_floate, builtins
  - From: Segher Boessenkool
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]