[PATCH] Optimize nested SIGN_EXTENDs/ZERO_EXTENDs (PR target/45336)

Jakub Jelinek jakub@redhat.com
Fri Aug 20 18:46:00 GMT 2010


On Fri, Aug 20, 2010 at 08:00:33PM +0200, Paolo Bonzini wrote:
> On 08/20/2010 07:27 PM, Jakub Jelinek wrote:
> >Not sure what exactly is
> >pextrb ..., %ecx
> >insn doing to the upper 32 bits of %rcx, if it clears them
> 
> Probably yes like every other 32-bit writeback on x86_64.

The manuals confirm that.
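
For readers following along, the behavior being relied on can be modeled in plain C. This is only an illustrative sketch, not GCC code: the type `v16qi_model` and the functions `write_low32` and `pextrb_model` are hypothetical names invented here, and the model assumes only the documented x86-64 rule that a write to a 32-bit register clears bits 63:32 of the containing 64-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit XMM register viewed as 16 bytes (V16QI). */
typedef struct { uint8_t b[16]; } v16qi_model;

/* Models a write to a 32-bit GPR on x86-64: the previous contents of the
   full 64-bit register are discarded and bits 63:32 become zero. */
static uint64_t write_low32(uint64_t old_reg, uint32_t value)
{
    (void)old_reg;              /* old upper bits do not survive the write */
    return (uint64_t)value;
}

/* Models "pextrb $idx, %xmm, %r32": select one byte, zero-extend it to
   32 bits, then perform a 32-bit register write. The result is therefore
   already a valid zero extension to 64 bits. */
static uint64_t pextrb_model(uint64_t old_reg, const v16qi_model *x,
                             unsigned idx)
{
    uint32_t lane = x->b[idx & 15];   /* vec_select:QI, then zero_extend:SI */
    return write_low32(old_reg, lane);
}
```

Under this model a separate zero extension from the 32-bit result to 64 bits is redundant, which is the reason a DImode variant of the pattern can emit the same instruction with a `%k0` operand.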
The following seems to work fine in the quick testing I've done so far:

2010-08-20  Jakub Jelinek  <jakub@redhat.com>

	* config/i386/sse.md (*sse4_1_pextrb): Add SWI48 mode iterator
	to cover zero extension into 64-bit register.
	(*sse2_pextrw): Likewise.
	(*sse4_1_pextrd_zext): New insn.

--- gcc/config/i386/sse.md.jj	2010-08-11 21:08:03.000000000 +0200
+++ gcc/config/i386/sse.md	2010-08-20 20:24:08.000000000 +0200
@@ -7075,14 +7075,14 @@ (define_insn "*sse4_1_pinsrq"
    (set_attr "length_immediate" "1")
    (set_attr "mode" "TI")])
 
-(define_insn "*sse4_1_pextrb"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(zero_extend:SI
+(define_insn "*sse4_1_pextrb_<mode>"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+	(zero_extend:SWI48
 	  (vec_select:QI
 	    (match_operand:V16QI 1 "register_operand" "x")
 	    (parallel [(match_operand:SI 2 "const_0_to_15_operand" "n")]))))]
   "TARGET_SSE4_1"
-  "%vpextrb\t{%2, %1, %0|%0, %1, %2}"
+  "%vpextrb\t{%2, %1, %k0|%k0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
@@ -7102,14 +7102,14 @@ (define_insn "*sse4_1_pextrb_memory"
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "TI")])
 
-(define_insn "*sse2_pextrw"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(zero_extend:SI
+(define_insn "*sse2_pextrw_<mode>"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+	(zero_extend:SWI48
 	  (vec_select:HI
 	    (match_operand:V8HI 1 "register_operand" "x")
 	    (parallel [(match_operand:SI 2 "const_0_to_7_operand" "n")]))))]
   "TARGET_SSE2"
-  "%vpextrw\t{%2, %1, %0|%0, %1, %2}"
+  "%vpextrw\t{%2, %1, %k0|%k0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix_data16" "1")
    (set_attr "length_immediate" "1")
@@ -7142,6 +7142,20 @@ (define_insn "*sse4_1_pextrd"
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "TI")])
 
+(define_insn "*sse4_1_pextrd_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(zero_extend:DI
+	  (vec_select:SI
+	    (match_operand:V4SI 1 "register_operand" "x")
+	    (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n")]))))]
+  "TARGET_64BIT && TARGET_SSE4_1"
+  "%vpextrd\t{%2, %1, %k0|%k0, %1, %2}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "TI")])
+
 ;; It must come before *vec_extractv2di_1_sse since it is preferred.
 (define_insn "*sse4_1_pextrq"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=rm")


	Jakub


