This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

[PATCH] Add fixuns_trunc<mode><sseintvecmodelower>2

From: Jakub Jelinek <jakub at redhat dot com>
To: Richard Henderson <rth at redhat dot com>, Uros Bizjak <ubizjak at gmail dot com>
Cc: gcc-patches at gcc dot gnu dot org
Date: Mon, 31 Oct 2011 23:29:21 +0100
Subject: [PATCH] Add fixuns_trunc<mode><sseintvecmodelower>2
Reply-to: Jakub Jelinek <jakub at redhat dot com>

Hi!

This patch allows float -> uint conversions to be vectorized.
To convert V{4,8}SFmode op0 to V{4,8}SImode target, it emits:
  V{4,8}SFmode mask = op0 >= { INT_MAX + 1U + .0f, INT_MAX + 1U + .0f, ... }	// non-signalling GE
  V{4,8}SFmode tmp1 = mask & { 2.0f * INT_MIN, 2.0f * INT_MIN, ... }
  V{4,8}SFmode tmp2 = op0 + tmp1
  V{4,8}SImode target = (V{4,8}SImode) tmp2
TARGET_AVX is required because pre-AVX cmpps has no non-signalling GE
predicate, and we don't want to raise exceptions if op0 is a QNaN (the
scalar code uses vucomiss).
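
For illustration only, here is a scalar C sketch of what the sequence above
does to a single lane. It is not part of the patch: the function name
fixuns_lane and the memcpy-based bit masking are assumptions made for this
example, and it only covers inputs in [0, 2^32). Out-of-range values and
NaNs are undefined behaviour for the C cast, whereas the vector instruction
has its own defined result for them.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not in the patch): emulates one lane of the
   mask / and / add / signed-convert sequence for 0 <= op0 < 2^32.  */
static uint32_t
fixuns_lane (float op0)
{
  const float two31 = 2147483648.0f;    /* INT_MAX + 1U + .0f, i.e. 2^31 */
  const float mtwo32 = -4294967296.0f;  /* 2.0f * INT_MIN, i.e. -2^32 */
  uint32_t mask, bits;
  float tmp1, tmp2;

  /* All-ones if op0 >= 2^31, else zero.  The C comparison is false for
     NaN, like the ordered GE predicate used by the expander.  */
  mask = (op0 >= two31) ? 0xFFFFFFFFu : 0u;

  /* tmp1 = mask & -2^32, done on the bit pattern: either -2^32 or +0.0f.  */
  memcpy (&bits, &mtwo32, sizeof bits);
  bits &= mask;
  memcpy (&tmp1, &bits, sizeof tmp1);

  /* Large inputs are shifted into [-2^31, 0); the addition is exact.  */
  tmp2 = op0 + tmp1;

  /* Signed truncating conversion; reinterpreting the two's complement
     result as unsigned yields the original value modulo 2^32.  */
  return (uint32_t) (int32_t) tmp2;
}
```

For example, an input of 3000000000.0f takes the large-value path: it
becomes -1294967296.0f after the addition, and the signed result read back
as unsigned is 3000000000 again. Inputs below 2^31 pass through unchanged
because tmp1 is +0.0f.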

Ok for trunk?

2011-10-31  Jakub Jelinek  <jakub@redhat.com>

	* config/i386/sse.md (fixuns_trunc<mode><sseintvecmodelower>2): New
	expander.

--- gcc/config/i386/sse.md.jj	2011-10-31 21:05:21.000000000 +0100
+++ gcc/config/i386/sse.md	2011-10-31 22:53:13.000000000 +0100
@@ -2322,6 +2322,35 @@ (define_insn "fix_truncv4sfv4si2"
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "TI")])
 
+(define_expand "fixuns_trunc<mode><sseintvecmodelower>2"
+  [(set (match_dup 4)
+	(unspec:VF1
+	  [(match_operand:VF1 1 "register_operand" "")
+	   (match_dup 2)
+	   (const_int 29)] UNSPEC_PCMP))
+   (set (match_dup 5)
+	(and:VF1 (match_dup 4) (match_dup 3)))
+   (set (match_dup 6)
+	(plus:VF1 (match_dup 1) (match_dup 5)))
+   (set (match_operand:<sseintvecmode> 0 "register_operand" "")
+	(fix:<sseintvecmode> (match_dup 6)))]
+  "TARGET_AVX"
+{
+  REAL_VALUE_TYPE MTWO32r, TWO31r;
+  int i;
+
+  real_ldexp (&TWO31r, &dconst1, 31);
+  operands[2] = const_double_from_real_value (TWO31r, SFmode);
+  operands[2] = ix86_build_const_vector (<MODE>mode, 1, operands[2]);
+  operands[2] = force_reg (<MODE>mode, operands[2]);
+  real_ldexp (&MTWO32r, &dconstm1, 32);
+  operands[3] = const_double_from_real_value (MTWO32r, SFmode);
+  operands[3] = ix86_build_const_vector (<MODE>mode, 1, operands[3]);
+  operands[3] = force_reg (<MODE>mode, operands[3]);
+  for (i = 4; i < 7; i++)
+    operands[i] = gen_reg_rtx (<MODE>mode);
+})
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
 ;; Parallel double-precision floating point conversion operations

	Jakub

Follow-Ups:
- Re: [PATCH] Add fixuns_trunc<mode><sseintvecmodelower>2
  - From: Richard Henderson