PATCH: PR target/39327: Incorrect addsub/unpck patterns in sse.md

Thu Mar 12 07:38:00 GMT 2009

HJ,

I might be wrong too. These insns are so complicated. 

Take sse3_addsubps as an example, bit reverse of 5 (0101b) 
should be 10 (1010b).

@@ -1115,7 +1115,7 @@
 	    (match_operand:V4DF 1 "register_operand" "x")
 	    (match_operand:V4DF 2 "nonimmediate_operand" "xm"))
 	  (minus:V4DF (match_dup 1) (match_dup 2))
-	  (const_int 5)))]
+	  (const_int 10)))]
   "TARGET_AVX"
   "vaddsubpd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseadd")
@@ -1129,7 +1129,7 @@
 	    (match_operand:V4SF 1 "register_operand" "x")
 	    (match_operand:V4SF 2 "nonimmediate_operand" "xm"))
 	  (minus:V4SF (match_dup 1) (match_dup 2))
-	  (const_int 5)))]
+	  (const_int 10)))]

There might be other places with wrong number. 

Thanks - Joey

-----Original Message-----
From: H.J. Lu [mailto:hjl.tools@gmail.com] 
Sent: Sunday, March 01, 2009 2:40 AM
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com; Ye, Joey; Guo, Xuepeng
Subject: PATCH: PR target/39327: Incorrect addsub/unpck patterns in sse.md

Some addsub/unpck patterns have wrong item selectors. This patch
fixes.  OK for trunk if there are no regressions on Linux/ia32 and
Linux/Intel64?

Thanks.


H.J.
----
2009-02-28  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/39327
	* config/i386/sse.md (avx_addsubv8sf3): Correct item bits.
	(avx_addsubv4df3): Likewise.
	(*avx_addsubv4sf3): Likewise.
	(sse3_addsubv4sf3): Likewise.
	(*avx_addsubv2df3): Likewise.
	(sse3_addsubv2df3): Likewise.
	(avx_unpckhps256): Correct item selectors.
	(avx_unpcklps256): Likewise.
	(avx_unpckhpd256): Likewise.
	(avx_unpcklpd256): Likewise.

--- gcc/config/i386/sse.md.vop	2009-02-05 09:56:53.000000000 -0800
+++ gcc/config/i386/sse.md	2009-02-28 10:08:44.000000000 -0800
@@ -1101,7 +1101,7 @@
 	    (match_operand:V8SF 1 "register_operand" "x")
 	    (match_operand:V8SF 2 "nonimmediate_operand" "xm"))
 	  (minus:V8SF (match_dup 1) (match_dup 2))
-	  (const_int 85)))]
+	  (const_int 66)))]
   "TARGET_AVX"
   "vaddsubps\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseadd")
@@ -1115,7 +1115,7 @@
 	    (match_operand:V4DF 1 "register_operand" "x")
 	    (match_operand:V4DF 2 "nonimmediate_operand" "xm"))
 	  (minus:V4DF (match_dup 1) (match_dup 2))
-	  (const_int 5)))]
+	  (const_int 6)))]
   "TARGET_AVX"
   "vaddsubpd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseadd")
@@ -1129,7 +1129,7 @@
 	    (match_operand:V4SF 1 "register_operand" "x")
 	    (match_operand:V4SF 2 "nonimmediate_operand" "xm"))
 	  (minus:V4SF (match_dup 1) (match_dup 2))
-	  (const_int 5)))]
+	  (const_int 6)))]
   "TARGET_AVX"
   "vaddsubps\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseadd")
@@ -1143,7 +1143,7 @@
 	    (match_operand:V4SF 1 "register_operand" "0")
 	    (match_operand:V4SF 2 "nonimmediate_operand" "xm"))
 	  (minus:V4SF (match_dup 1) (match_dup 2))
-	  (const_int 5)))]
+	  (const_int 6)))]
   "TARGET_SSE3"
   "addsubps\t{%2, %0|%0, %2}"
   [(set_attr "type" "sseadd")
@@ -1157,7 +1157,7 @@
 	    (match_operand:V2DF 1 "register_operand" "x")
 	    (match_operand:V2DF 2 "nonimmediate_operand" "xm"))
 	  (minus:V2DF (match_dup 1) (match_dup 2))
-	  (const_int 1)))]
+	  (const_int 2)))]
   "TARGET_AVX"
   "vaddsubpd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseadd")
@@ -1171,7 +1171,7 @@
 	    (match_operand:V2DF 1 "register_operand" "0")
 	    (match_operand:V2DF 2 "nonimmediate_operand" "xm"))
 	  (minus:V2DF (match_dup 1) (match_dup 2))
-	  (const_int 1)))]
+	  (const_int 2)))]
   "TARGET_SSE3"
   "addsubpd\t{%2, %0|%0, %2}"
   [(set_attr "type" "sseadd")
@@ -3059,10 +3059,10 @@
 	  (vec_concat:V16SF
 	    (match_operand:V8SF 1 "register_operand" "x")
 	    (match_operand:V8SF 2 "nonimmediate_operand" "xm"))
-	  (parallel [(const_int 2) (const_int 6)
-		     (const_int 3) (const_int 7)
-		     (const_int 10) (const_int 14)
-		     (const_int 11) (const_int 15)])))]
+	  (parallel [(const_int 2) (const_int 10)
+		     (const_int 3) (const_int 11)
+		     (const_int 6) (const_int 14)
+		     (const_int 7) (const_int 15)])))]
   "TARGET_AVX"
   "vunpckhps\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")
@@ -3102,10 +3102,10 @@
 	  (vec_concat:V16SF
 	    (match_operand:V8SF 1 "register_operand" "x")
 	    (match_operand:V8SF 2 "nonimmediate_operand" "xm"))
-	  (parallel [(const_int 0) (const_int 4)
-		     (const_int 1) (const_int 5)
-		     (const_int 8) (const_int 12)
-		     (const_int 9) (const_int 13)])))]
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 1) (const_int 9)
+		     (const_int 4) (const_int 12)
+		     (const_int 5) (const_int 13)])))]
   "TARGET_AVX"
   "vunpcklps\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")
@@ -3902,7 +3902,7 @@
 	  (vec_concat:V8DF
 	    (match_operand:V4DF 1 "register_operand" "x")
 	    (match_operand:V4DF 2 "nonimmediate_operand" "xm"))
-	  (parallel [(const_int 2) (const_int 6)
+	  (parallel [(const_int 1) (const_int 5)
 		     (const_int 3) (const_int 7)])))]
   "TARGET_AVX"
   "vunpckhpd\t{%2, %1, %0|%0, %1, %2}"
@@ -4023,7 +4023,7 @@
 	    (match_operand:V4DF 1 "register_operand" "x")
 	    (match_operand:V4DF 2 "nonimmediate_operand" "xm"))
 	  (parallel [(const_int 0) (const_int 4)
-		     (const_int 1) (const_int 5)])))]
+		     (const_int 2) (const_int 6)])))]
   "TARGET_AVX"
   "vunpcklpd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")