fix ssse3_pshufbv8qi3 post-reload const pool load
Alexandre Oliva
oliva@adacore.com
Fri Mar 19 22:44:01 GMT 2021
On Mar 19, 2021, Uros Bizjak <ubizjak@gmail.com> wrote:
>> * config/i386/predicates.md (register_or_const_vec_operand):
>> New.
>> * config/i386/sse.md (ssse3_pshufbv8qi3): Add an expander for
>> the now *-prefixed insn_and_split, turn the splitter const vec
>> into an input for the insn, making it an ignored immediate for
>> non-split cases, and loaded into the scratch register
>> otherwise.
> Testcase?
Uhh, sorry I failed to mention that.
gcc.target/i386/pr94467-1.c was what I used (with -mcmodel=large) to
duplicate the problem and then confirm the fix in the trunk.
However, I had a total of 15 similar fails within gcc.target/i386 in a
gcc-10 tree configured with -mcmodel=large, all fixed by this patch:
avx-1.c, avx-2.c, avx-vpshufb-1.c, pr82483-1.c, pr94467-1.c,
pr94467-2.c, sse-13.c, sse-14.c, sse-22.c, sse-22a.c, sse-23.c,
sse-24.c, sse-25.c, sse-26.c, and ssse3-pshufb.c.
> please name this "reg_or_const_vector_operand"
'k
> All constraints should be removed from an expander.
'k
>> + operands[3] = gen_int_mode (0xf7f7f7f7, SImode);
> You can use:
> ix86_build_const_vector (V4SImode, true,
> gen_int_mode (0xf7f7f7f7, SImode));
heh, I had it spelled that way in my local tree at some point. I
thought the explicit const_vector in the pattern looked clearer.
Changed, here's what I'm regstrapping. Ok to install if it succeeds?
for gcc/ChangeLog
* config/i386/predicates.md (reg_or_const_vec_operand): New.
* config/i386/sse.md (ssse3_pshufbv8qi3): Add an expander for
the now *-prefixed insn_and_split, turn the splitter const vec
into an input for the insn, making it an ignored immediate for
non-split cases, and loaded into the scratch register
otherwise.
---
gcc/config/i386/predicates.md | 6 ++++++
gcc/config/i386/sse.md | 25 ++++++++++++++++++-------
2 files changed, 24 insertions(+), 7 deletions(-)
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index b6dd5e9d3b243..b1df8548af639 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1153,6 +1153,12 @@ (define_predicate "nonimmediate_or_const_vector_operand"
(ior (match_operand 0 "nonimmediate_operand")
(match_code "const_vector")))
+;; Return true when OP is either register operand, or any
+;; CONST_VECTOR.
+(define_predicate "reg_or_const_vector_operand"
+ (ior (match_operand 0 "register_operand")
+ (match_code "const_vector")))
+
;; Return true when OP is nonimmediate or standard SSE constant.
(define_predicate "nonimmediate_or_sse_const_operand"
(ior (match_operand 0 "nonimmediate_operand")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 43e4d57ec6a3d..9d3728d1cb08b 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -17159,10 +17159,25 @@ (define_insn "<ssse3_avx2>_pshufb<mode>3<mask_name>"
(set_attr "btver2_decode" "vector")
(set_attr "mode" "<sseinsnmode>")])
-(define_insn_and_split "ssse3_pshufbv8qi3"
+(define_expand "ssse3_pshufbv8qi3"
+ [(parallel
+ [(set (match_operand:V8QI 0 "register_operand")
+ (unspec:V8QI [(match_operand:V8QI 1 "register_operand")
+ (match_operand:V8QI 2 "register_mmxmem_operand")
+ (match_dup 3)] UNSPEC_PSHUFB))
+ (clobber (match_scratch:V4SI 4))])]
+ "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
+{
+ operands[3] = ix86_build_const_vector (V4SImode, true,
+ gen_int_mode (0xf7f7f7f7, SImode));
+})
+
+(define_insn_and_split "*ssse3_pshufbv8qi3"
[(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv")
(unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,0,Yv")
- (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")]
+ (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")
+ (match_operand:V4SI 4 "reg_or_const_vector_operand"
+ "i,3,3")]
UNSPEC_PSHUFB))
(clobber (match_scratch:V4SI 3 "=X,&x,&Yv"))]
"(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
@@ -17172,8 +17187,7 @@ (define_insn_and_split "ssse3_pshufbv8qi3"
#"
"TARGET_SSSE3 && reload_completed
&& SSE_REGNO_P (REGNO (operands[0]))"
- [(set (match_dup 3) (match_dup 5))
- (set (match_dup 3)
+ [(set (match_dup 3)
(and:V4SI (match_dup 3) (match_dup 2)))
(set (match_dup 0)
(unspec:V16QI [(match_dup 1) (match_dup 4)] UNSPEC_PSHUFB))]
@@ -17188,9 +17202,6 @@ (define_insn_and_split "ssse3_pshufbv8qi3"
GET_MODE (operands[2]));
operands[4] = lowpart_subreg (V16QImode, operands[3],
GET_MODE (operands[3]));
- rtx vec_const = ix86_build_const_vector (V4SImode, true,
- gen_int_mode (0xf7f7f7f7, SImode));
- operands[5] = force_const_mem (V4SImode, vec_const);
}
[(set_attr "mmx_isa" "native,sse_noavx,avx")
(set_attr "prefix_extra" "1")
--
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar
More information about the Gcc-patches
mailing list