SSE5 patches part 2
Uros Bizjak
ubizjak@gmail.com
Fri Sep 7 06:59:00 GMT 2007
Hello!
+ (UNSPEC_SSE5_TRUEFALSE 153)
+ (UNSPEC_PPERM 154)
+ (UNSPEC_PERMPS 155)
+ (UNSPEC_PERMPD 156)
+ (UNSPEC_PMACSSWW 157)
+ (UNSPEC_PMACSWW 158)
+ (UNSPEC_PMACSSWD 159)
+ (UNSPEC_PMACSWD 160)
+ (UNSPEC_PMACSSDD 161)
+ (UNSPEC_PMACSDD 162)
+ (UNSPEC_PMACSSDQL 163)
Please do not use unspecs, unless you really can't describe
instruction with existing RTL codes. The problem with uspecs is, that
they hide all of the insn details, so various optimization passes
(combine!) can't do nothing with it.
There are plenty of examples in sse.md:
1. horizontal add: "sse3_haddv4sf3"
2. permutations: "sse2_pshufd"
3. conversions: "sse_cvtsi2ss", "sse_cvtss2si"
(you can also define scalar conversions as float and fix patterns in i386.md)
4. shifts: "ashl<mode>3", "vec_shl_<mode>"
Also, there is no need to have UNSPECS for diferent modes. One unspec
is enough to describe the instruction in all modes. So, if there is no
other way to describe insn with standart RTL expressions, these two
should be combined:
+ (UNSPEC_PERMPS 155)
+ (UNSPEC_PERMPD 156)
into
UNSPEC_PERM, and relevant pattern will have SSEMODEF inputs,
as well as all of these (example):
+ (UNSPEC_PROTB 184)
+ (UNSPEC_PROTW 185)
+ (UNSPEC_PROTD 186)
+ (UNSPEC_PROTQ 187)
into UNSPEC_PROT, where their input operands would be SSEMODEI. Having
all input operands V2DI mode is not acceptable.
+ (define_expand "sse5_protd_imm"
+ [(set (match_operand:V2DI 0 "register_operand" "")
+ (rotate:V2DI (match_operand:V2DI 1 "nonimmediate_operand" "")
+ (match_operand:SI 2 "const_0_to_31_operand" "n")))]
+ "TARGET_SSE5"
+ {
+ rtx op0 = gen_rtx_SUBREG (V4SImode, operands[0], 0);
+ rtx op1 = gen_rtx_SUBREG (V4SImode, operands[1], 0);
+
+ emit_insn (gen_rotlv4si3 (op0, op1, operands[2]));
+ DONE;
+ })
Why new expander that doesn't expand to new instructions? This should
be implemented in SSE5 header.
+ (define_insn "sse5_pmacsdqh"
+ [(set (match_operand:V2DI 0 "register_operand" "=x,x,x")
+ (unspec:V2DI [(match_operand:V2DI 1 "nonimmediate_operand" "x,x,m")
+ (match_operand:V2DI 2 "nonimmediate_operand" ",x,x")
+ (match_operand:V2DI 3 "register_operand" "0,0,0")] UNSPEC_PMACSDQH))]
Ehm? Op2 constraints should be fixed...
As a general rule, please use macros wherever possible. I have a plan
to reorganize SSE.md as soon as all big changes (like SSE5 ;) get in.
--- gcc/config/i386/cpuid.h 2007-09-06 13:29:00.166796000 -0400
***************
*** 51,56 ****
--- 51,57 ----
/* %ecx */
#define bit_LAHF_LM (1 << 0)
#define bit_SSE4a (1 << 6)
+ #define bit_SSE5 (1 << 11)
For now, please leave SSE5 from driver-i386.c. Instead of passing
-msse5 to compile flags from the driver, driver should pass correct
-march= that implements SSE5. It is better to have "-march=whatever"
instead of '-march=amdfam10 -msse5"
Regarding the tests, there are three important test in the testsuite:
gcc.target/i386/sse-[12,13,14].c. Please update these tests to include
bmmintrin.h instead of ammintrin.h (adding -msse5 instead of -msse4a
to compile flags). These two tests will check _all_ new code for
compilation problems in -O0 and -O2.
Uros.
More information about the Gcc-patches
mailing list