[PATCH] Optimize nested SIGN_EXTENDs/ZERO_EXTENDs (PR target/45336)
Jakub Jelinek
jakub@redhat.com
Fri Aug 20 18:46:00 GMT 2010
On Fri, Aug 20, 2010 at 08:00:33PM +0200, Paolo Bonzini wrote:
> On 08/20/2010 07:27 PM, Jakub Jelinek wrote:
> >Not sure what exactly is
> >pextrb ..., %ecx
> >insn doing to the upper 32 bits of %rcx, if it clears them
>
> Probably yes like every other 32-bit writeback on x86_64.
The manuals confirm that.
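To make the point about the upper 32 bits concrete, here is a minimal C sketch of the behaviour being relied on. It is a software model, not real instruction semantics taken from the manuals: the names gpr64, write32 and model_pextrb are hypothetical and exist only for illustration. The assumption modelled is the one stated above, that a write to a 32-bit register on x86_64 clears the upper half of the corresponding 64-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit general-purpose register (e.g. %rcx). */
typedef struct {
    uint64_t bits;
} gpr64;

/* Assumed x86_64 behaviour: writing the 32-bit subregister (e.g. %ecx)
   stores the value in the low half and clears the upper 32 bits. */
static void write32(gpr64 *reg, uint32_t value)
{
    reg->bits = (uint64_t)value;
}

/* Model of "pextrb $index, %xmm, %r32": select one byte of a 16-byte
   vector, zero-extend it to 32 bits and write the 32-bit subregister.
   The index range mirrors the const_0_to_15_operand predicate. */
static void model_pextrb(gpr64 *dst, const uint8_t vec[16], unsigned index)
{
    assert(index < 16);
    write32(dst, (uint32_t)vec[index]);
}
```

Under this model, the full 64-bit register after model_pextrb holds exactly the zero-extended byte, whatever it held before. That is why a zero_extend to a 64-bit destination can be emitted with the 32-bit register name.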
The following seems to work just fine in the quick testing I've done so far:

2010-08-20  Jakub Jelinek  <jakub@redhat.com>

	* config/i386/sse.md (*sse4_1_pextrb): Add SWI48 mode iterator
	to cover zero extension into 64-bit register.
	(*sse2_pextrw): Likewise.
	(*sse4_1_pextrd_zext): New insn.

--- gcc/config/i386/sse.md.jj 2010-08-11 21:08:03.000000000 +0200
+++ gcc/config/i386/sse.md 2010-08-20 20:24:08.000000000 +0200
@@ -7075,14 +7075,14 @@ (define_insn "*sse4_1_pinsrq"
(set_attr "length_immediate" "1")
(set_attr "mode" "TI")])
-(define_insn "*sse4_1_pextrb"
- [(set (match_operand:SI 0 "register_operand" "=r")
- (zero_extend:SI
+(define_insn "*sse4_1_pextrb_<mode>"
+ [(set (match_operand:SWI48 0 "register_operand" "=r")
+ (zero_extend:SWI48
(vec_select:QI
(match_operand:V16QI 1 "register_operand" "x")
(parallel [(match_operand:SI 2 "const_0_to_15_operand" "n")]))))]
"TARGET_SSE4_1"
- "%vpextrb\t{%2, %1, %0|%0, %1, %2}"
+ "%vpextrb\t{%2, %1, %k0|%k0, %1, %2}"
[(set_attr "type" "sselog")
(set_attr "prefix_extra" "1")
(set_attr "length_immediate" "1")
@@ -7102,14 +7102,14 @@ (define_insn "*sse4_1_pextrb_memory"
(set_attr "prefix" "maybe_vex")
(set_attr "mode" "TI")])
-(define_insn "*sse2_pextrw"
- [(set (match_operand:SI 0 "register_operand" "=r")
- (zero_extend:SI
+(define_insn "*sse2_pextrw_<mode>"
+ [(set (match_operand:SWI48 0 "register_operand" "=r")
+ (zero_extend:SWI48
(vec_select:HI
(match_operand:V8HI 1 "register_operand" "x")
(parallel [(match_operand:SI 2 "const_0_to_7_operand" "n")]))))]
"TARGET_SSE2"
- "%vpextrw\t{%2, %1, %0|%0, %1, %2}"
+ "%vpextrw\t{%2, %1, %k0|%k0, %1, %2}"
[(set_attr "type" "sselog")
(set_attr "prefix_data16" "1")
(set_attr "length_immediate" "1")
@@ -7142,6 +7142,20 @@ (define_insn "*sse4_1_pextrd"
(set_attr "prefix" "maybe_vex")
(set_attr "mode" "TI")])
+(define_insn "*sse4_1_pextrd_zext"
+ [(set (match_operand:DI 0 "register_operand" "=r")
+ (zero_extend:DI
+ (vec_select:SI
+ (match_operand:V4SI 1 "register_operand" "x")
+ (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n")]))))]
+ "TARGET_64BIT && TARGET_SSE4_1"
+ "%vpextrd\t{%2, %1, %k0|%k0, %1, %2}"
+ [(set_attr "type" "sselog")
+ (set_attr "prefix_extra" "1")
+ (set_attr "length_immediate" "1")
+ (set_attr "prefix" "maybe_vex")
+ (set_attr "mode" "TI")])
+
;; It must come before *vec_extractv2di_1_sse since it is preferred.
(define_insn "*sse4_1_pextrq"
[(set (match_operand:DI 0 "nonimmediate_operand" "=rm")
Jakub