This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
This implements some missing vec_interleave and vec_extract_{odd,even} expanders for x86_64 to make vectorizing complex multiplication possible.
The problem starts with the testsuite, which can be tuned either to
have vect_extract_even_odd or not, but cannot, as x86_64 requires,
turn on support for vect_extract_even_odd only for SImode or larger
element sizes. Any suggestions how to deal with this?
Defining a keyword (something like) vect_extract_large_types for x86_64
(and all the targets that have support for all the types) won't help?
The tests will have to be changed as well, of course, for example
Uros suggested something similar. So I added vect_extract_even_odd_wide and vect_strided_wide that only cover vector elements of 4 byte size or larger.
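As an illustration of the kind of loop these expanders are meant to enable, here is a minimal sketch of a complex multiplication over interleaved float data. The function name cmul_interleaved is hypothetical and is not taken from the patch or from the testsuite; the point is only that the loads and stores have stride 2 on 4-byte elements, which is the case the _wide keywords describe.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply n complex numbers stored interleaved as {re, im, re, im, ...}.
   Hypothetical helper for illustration; not part of the patch.  */
void
cmul_interleaved (float *out, const float *a, const float *b, size_t n)
{
  assert (out != NULL && a != NULL && b != NULL);
  for (size_t i = 0; i < n; i++)
    {
      /* Stride-2 loads: even elements hold real parts, odd elements
         hold imaginary parts.  */
      float ar = a[2 * i], ai = a[2 * i + 1];
      float br = b[2 * i], bi = b[2 * i + 1];
      /* Stride-2 stores: the results are interleaved again.  */
      out[2 * i] = ar * br - ai * bi;
      out[2 * i + 1] = ar * bi + ai * br;
    }
}
```

To vectorize such a loop, the real and imaginary parts have to be separated after loading (extract even/odd) and recombined before storing (interleave high/low).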
The following is what I am now re-testing on x86_64/i686.
Ok for trunk?
Thanks, Richard.
2008-07-31 Richard Guenther <rguenther@suse.de>
PR target/35252
* config/i386/sse.md (SSEMODE4S, SSEMODE2D): New mode iterators.
(ssedoublesizemode): New mode attribute.
(sse_shufps): Call gen_sse_shufps_v4sf.
(sse_shufps_1): Macroize.
(sse2_shufpd): Call gen_sse2_shufpd_v2df.
(sse2_shufpd_1): Macroize.
(vec_extract_odd, vec_extract_even): New expanders.
(vec_interleave_highv4sf, vec_interleave_lowv4sf,
vec_interleave_highv2df, vec_interleave_lowv2df): Likewise.
* i386.c (ix86_expand_vector_init_one_nonzero): Call
gen_sse_shufps_v4sf instead of gen_sse_shufps_1.
(ix86_expand_vector_set): Likewise.
(ix86_expand_reduc_v4sf): Likewise.
* lib/target-supports.exp (vect_extract_even_odd_wide): Add.
(vect_strided_wide): Likewise.
* gcc.dg/vect/fast-math-pr35982.c: Enable for vect_extract_even_odd_wide.
* gcc.dg/vect/fast-math-vect-complex-3.c: Likewise.
* gcc.dg/vect/vect-1.c: Likewise.
* gcc.dg/vect/vect-107.c: Likewise.
* gcc.dg/vect/vect-98.c: Likewise.
* gcc.dg/vect/vect-strided-float.c: Likewise.
* gcc.dg/vect/slp-11.c: Enable for vect_strided_wide.
* gcc.dg/vect/slp-12a.c: Likewise.
* gcc.dg/vect/slp-12b.c: Likewise.
* gcc.dg/vect/slp-19.c: Likewise.
* gcc.dg/vect/slp-23.c: Likewise.
* gcc.dg/vect/slp-5.c: Likewise.
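For readers following the sse.md changes in the patch that follows, here is a scalar sketch of the selection that the sse_shufps pattern describes: four elements picked from the 8-element concatenation of the two inputs. The names imm_to_sel and shufps_model are hypothetical. The decoding of the first three selectors follows the sse_shufps expander in the patch; the fourth lies outside the hunk context and is written here according to the SHUFPS immediate encoding.

```c
#include <assert.h>

/* Decode an 8-bit shufps immediate into four indices into the
   concatenation {a0..a3, b0..b3}.  Hypothetical helper.  */
void
imm_to_sel (int mask, int sel[4])
{
  sel[0] = (mask >> 0) & 3;
  sel[1] = (mask >> 2) & 3;
  sel[2] = ((mask >> 4) & 3) + 4;
  sel[3] = ((mask >> 6) & 3) + 4;  /* not visible in the hunk context */
}

/* Scalar model of the vec_select over vec_concat.  dst may alias a,
   mirroring the tied operand in the insn pattern.  */
void
shufps_model (float dst[4], const float a[4], const float b[4],
              const int sel[4])
{
  /* Mirror the const_0_to_3_operand / const_4_to_7_operand predicates.  */
  assert (sel[0] >= 0 && sel[0] <= 3 && sel[1] >= 0 && sel[1] <= 3);
  assert (sel[2] >= 4 && sel[2] <= 7 && sel[3] >= 4 && sel[3] <= 7);

  float cat[8];
  for (int i = 0; i < 4; i++)
    {
      cat[i] = a[i];
      cat[i + 4] = b[i];
    }
  for (int i = 0; i < 4; i++)
    dst[i] = cat[sel[i]];
}
```

Under this model the selectors (0, 2, 4, 6) used by vec_extract_even correspond to the immediate 0x88, and (1, 3, 5, 7) used by vec_extract_odd correspond to 0xDD.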
Index: gcc/config/i386/sse.md
===================================================================
*** gcc/config/i386/sse.md.orig 2008-07-30 17:08:05.000000000 +0200
--- gcc/config/i386/sse.md 2008-07-31 12:25:36.000000000 +0200
***************
*** 36,41 ****
--- 36,45 ----
(define_mode_iterator SSEMODEF4 [SF DF V4SF V2DF])
(define_mode_iterator SSEMODEF2P [V4SF V2DF])
+ ;; Int-float size matches
+ (define_mode_iterator SSEMODE4S [V4SF V4SI])
+ (define_mode_iterator SSEMODE2D [V2DF V2DI])
+
;; Mapping from float mode to required SSE level
(define_mode_attr sse [(SF "sse") (DF "sse2") (V4SF "sse") (V2DF "sse2")])
***************
*** 57,62 ****
--- 61,70 ----
(V16QI "QI") (V8HI "HI")
(V4SI "SI") (V2DI "DI")])
+ ;; Mapping of vector modes to a vector mode of double size
+ (define_mode_attr ssedoublesizemode [(V2DF "V4DF") (V2DI "V4DI")
+ (V4SF "V8SF") (V4SI "V8SI")])
+
;; Number of scalar elements in each vector type
(define_mode_attr ssescalarnum [(V4SF "4") (V2DF "2")
(V16QI "16") (V8HI "8")
***************
*** 2129,2135 ****
"TARGET_SSE"
{
int mask = INTVAL (operands[3]);
! emit_insn (gen_sse_shufps_1 (operands[0], operands[1], operands[2],
GEN_INT ((mask >> 0) & 3),
GEN_INT ((mask >> 2) & 3),
GEN_INT (((mask >> 4) & 3) + 4),
--- 2137,2143 ----
"TARGET_SSE"
{
int mask = INTVAL (operands[3]);
! emit_insn (gen_sse_shufps_v4sf (operands[0], operands[1], operands[2],
GEN_INT ((mask >> 0) & 3),
GEN_INT ((mask >> 2) & 3),
GEN_INT (((mask >> 4) & 3) + 4),
***************
*** 2137,2148 ****
DONE;
})
! (define_insn "sse_shufps_1"
! [(set (match_operand:V4SF 0 "register_operand" "=x")
! (vec_select:V4SF
! (vec_concat:V8SF
! (match_operand:V4SF 1 "register_operand" "0")
! (match_operand:V4SF 2 "nonimmediate_operand" "xm"))
(parallel [(match_operand 3 "const_0_to_3_operand" "")
(match_operand 4 "const_0_to_3_operand" "")
(match_operand 5 "const_4_to_7_operand" "")
--- 2145,2156 ----
DONE;
})
! (define_insn "sse_shufps_<mode>"
! [(set (match_operand:SSEMODE4S 0 "register_operand" "=x")
! (vec_select:SSEMODE4S
! (vec_concat:<ssedoublesizemode>
! (match_operand:SSEMODE4S 1 "register_operand" "0")
! (match_operand:SSEMODE4S 2 "nonimmediate_operand" "xm"))
(parallel [(match_operand 3 "const_0_to_3_operand" "")
(match_operand 4 "const_0_to_3_operand" "")
(match_operand 5 "const_4_to_7_operand" "")
***************
*** 2540,2557 ****
"TARGET_SSE2"
{
int mask = INTVAL (operands[3]);
! emit_insn (gen_sse2_shufpd_1 (operands[0], operands[1], operands[2],
GEN_INT (mask & 1),
GEN_INT (mask & 2 ? 3 : 2)));
DONE;
})
! (define_insn "sse2_shufpd_1"
! [(set (match_operand:V2DF 0 "register_operand" "=x")
! (vec_select:V2DF
! (vec_concat:V4DF
! (match_operand:V2DF 1 "register_operand" "0")
! (match_operand:V2DF 2 "nonimmediate_operand" "xm"))
(parallel [(match_operand 3 "const_0_to_1_operand" "")
(match_operand 4 "const_2_to_3_operand" "")])))]
"TARGET_SSE2"
--- 2548,2611 ----
"TARGET_SSE2"
{
int mask = INTVAL (operands[3]);
! emit_insn (gen_sse2_shufpd_v2df (operands[0], operands[1], operands[2],
GEN_INT (mask & 1),
GEN_INT (mask & 2 ? 3 : 2)));
DONE;
})
! (define_expand "vec_extract_even<mode>"
! [(match_operand:SSEMODE4S 0 "register_operand" "")
! (match_operand:SSEMODE4S 1 "register_operand" "")
! (match_operand:SSEMODE4S 2 "nonimmediate_operand" "")]
! "TARGET_SSE"
! {
! emit_insn (gen_sse_shufps_<mode> (operands[0], operands[1], operands[2],
! GEN_INT (0), GEN_INT (2),
! GEN_INT (4), GEN_INT (6)));
! DONE;
! })
!
! (define_expand "vec_extract_odd<mode>"
! [(match_operand:SSEMODE4S 0 "register_operand" "")
! (match_operand:SSEMODE4S 1 "register_operand" "")
! (match_operand:SSEMODE4S 2 "nonimmediate_operand" "")]
! "TARGET_SSE"
! {
! emit_insn (gen_sse_shufps_<mode> (operands[0], operands[1], operands[2],
! GEN_INT (1), GEN_INT (3),
! GEN_INT (5), GEN_INT (7)));
! DONE;
! })
!
! (define_expand "vec_extract_even<mode>"
! [(match_operand:SSEMODE2D 0 "register_operand" "")
! (match_operand:SSEMODE2D 1 "register_operand" "")
! (match_operand:SSEMODE2D 2 "nonimmediate_operand" "")]
! "TARGET_SSE2"
! {
! emit_insn (gen_sse2_shufpd_<mode> (operands[0], operands[1], operands[2],
! GEN_INT (0), GEN_INT (2)));
! DONE;
! })
!
! (define_expand "vec_extract_odd<mode>"
! [(match_operand:SSEMODE2D 0 "register_operand" "")
! (match_operand:SSEMODE2D 1 "register_operand" "")
! (match_operand:SSEMODE2D 2 "nonimmediate_operand" "")]
! "TARGET_SSE2"
! {
! emit_insn (gen_sse2_shufpd_<mode> (operands[0], operands[1], operands[2],
! GEN_INT (1), GEN_INT (3)));
! DONE;
! })
!
! (define_insn "sse2_shufpd_<mode>"
! [(set (match_operand:SSEMODE2D 0 "register_operand" "=x")
! (vec_select:SSEMODE2D
! (vec_concat:<ssedoublesizemode>
! (match_operand:SSEMODE2D 1 "register_operand" "0")
! (match_operand:SSEMODE2D 2 "nonimmediate_operand" "xm"))
(parallel [(match_operand 3 "const_0_to_1_operand" "")
(match_operand 4 "const_2_to_3_operand" "")])))]
"TARGET_SSE2"
***************
*** 4195,4200 ****
--- 4249,4310 ----
DONE;
})
+ (define_expand "vec_interleave_highv4sf"
+ [(set (match_operand:V4SF 0 "register_operand" "")
+ (vec_select:V4SF
+ (vec_concat:V8SF
+ (match_operand:V4SF 1 "register_operand" "")
+ (match_operand:V4SF 2 "nonimmediate_operand" ""))
+ (parallel [(const_int 1)
+ (const_int 3)])))]
+ "TARGET_SSE"
+ {
+ emit_insn (gen_sse_unpckhps (operands[0], operands[1], operands[2]));
+ DONE;
+ })
+
+ (define_expand "vec_interleave_lowv4sf"
+ [(set (match_operand:V4SF 0 "register_operand" "")
+ (vec_select:V4SF
+ (vec_concat:V8SF
+ (match_operand:V4SF 1 "register_operand" "")
+ (match_operand:V4SF 2 "nonimmediate_operand" ""))
+ (parallel [(const_int 1)
+ (const_int 3)])))]
+ "TARGET_SSE"
+ {
+ emit_insn (gen_sse_unpcklps (operands[0], operands[1], operands[2]));
+ DONE;
+ })
+
+ (define_expand "vec_interleave_highv2df"
+ [(set (match_operand:V2DF 0 "register_operand" "")
+ (vec_select:V2DF
+ (vec_concat:V4DF
+ (match_operand:V2DF 1 "register_operand" "")
+ (match_operand:V2DF 2 "nonimmediate_operand" ""))
+ (parallel [(const_int 1)
+ (const_int 3)])))]
+ "TARGET_SSE2"
+ {
+ emit_insn (gen_sse2_unpckhpd (operands[0], operands[1], operands[2]));
+ DONE;
+ })
+
+ (define_expand "vec_interleave_lowv2df"
+ [(set (match_operand:V2DF 0 "register_operand" "")
+ (vec_select:V2DF
+ (vec_concat:V4DF
+ (match_operand:V2DF 1 "register_operand" "")
+ (match_operand:V2DF 2 "nonimmediate_operand" ""))
+ (parallel [(const_int 0)
+ (const_int 2)])))]
+ "TARGET_SSE2"
+ {
+ emit_insn (gen_sse2_unpcklpd (operands[0], operands[1], operands[2]));
+ DONE;
+ })
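The V4SF interleave expanders above emit unpckhps and unpcklps. As a rough scalar sketch of what those two instructions compute (the function names are hypothetical and serve only as illustration): unpcklps takes elements 0, 4, 1, 5 of the concatenation {a0..a3, b0..b3}, and unpckhps takes elements 2, 6, 3, 7.

```c
#include <assert.h>

/* Scalar model of unpcklps: interleave the low halves of a and b.
   Hypothetical helper; does not support in-place use.  */
void
interleave_low_v4sf (float dst[4], const float a[4], const float b[4])
{
  assert (dst != a && dst != b);
  dst[0] = a[0];
  dst[1] = b[0];
  dst[2] = a[1];
  dst[3] = b[1];
}

/* Scalar model of unpckhps: interleave the high halves of a and b.
   Hypothetical helper; does not support in-place use.  */
void
interleave_high_v4sf (float dst[4], const float a[4], const float b[4])
{
  assert (dst != a && dst != b);
  dst[0] = a[2];
  dst[1] = b[2];
  dst[2] = a[3];
  dst[3] = b[3];
}
```

Applied to a vector of real parts and a vector of imaginary parts, the two results together restore the {re, im, re, im, ...} layout that a strided store needs.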
Thanks, Uros.