SSE5 patches part 2

Uros Bizjak ubizjak@gmail.com
Fri Sep 7 06:59:00 GMT 2007


Hello!

+    (UNSPEC_SSE5_TRUEFALSE	153)
+    (UNSPEC_PPERM		154)
+    (UNSPEC_PERMPS		155)
+    (UNSPEC_PERMPD		156)
+    (UNSPEC_PMACSSWW		157)
+    (UNSPEC_PMACSWW		158)
+    (UNSPEC_PMACSSWD		159)
+    (UNSPEC_PMACSWD		160)
+    (UNSPEC_PMACSSDD		161)
+    (UNSPEC_PMACSDD		162)
+    (UNSPEC_PMACSSDQL		163)

Please do not use unspecs unless you really can't describe the
instruction with existing RTL codes. The problem with unspecs is that
they hide all of the insn details, so various optimization passes
(combine!) can't do anything with them.

There are plenty of examples in sse.md:

1. horizontal add: "sse3_haddv4sf3"
2. permutations: "sse2_pshufd"
3. conversions: "sse_cvtsi2ss", "sse_cvtss2si"
(you can also define scalar conversions the way the float and fix
patterns in i386.md do)
4. shifts: "ashl<mode>3", "vec_shl_<mode>"
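
To illustrate the point about unspecs, here is a rough, untested sketch
of how a multiply-accumulate such as pmacsdd could be written with plain
RTL codes. It assumes pmacsdd is the non-saturating form (low 32 bits of
each product plus the accumulator). The pattern name, the constraints and
the output template are my assumptions, not taken from the patch:

    ;; Sketch only: plus/mult instead of UNSPEC_PMACSDD, so that combine
    ;; and friends can see what the insn computes.
    (define_insn "*sse5_pmacsdd_sketch"      ; hypothetical name
      [(set (match_operand:V4SI 0 "register_operand" "=x,x")
            (plus:V4SI
              (mult:V4SI
                (match_operand:V4SI 1 "register_operand" "x,x")
                (match_operand:V4SI 2 "nonimmediate_operand" "x,m"))
              (match_operand:V4SI 3 "register_operand" "0,0")))]
      "TARGET_SSE5"
      "pmacsdd\t{%3, %2, %1, %0|%0, %1, %2, %3}")   ; assumed template

The saturating variants would probably still need something beyond
plain plus/mult; the sketch covers only the simple case.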

Also, there is no need to have separate unspecs for different modes. One
unspec is enough to describe the instruction in all modes. So, if there
is no other way to describe the insn with standard RTL expressions, these
two should be combined:

+    (UNSPEC_PERMPS		155)
+    (UNSPEC_PERMPD		156)

into

UNSPEC_PERM, and the relevant pattern will have SSEMODEF inputs,

as well as all of these (example):

+    (UNSPEC_PROTB		184)
+    (UNSPEC_PROTW		185)
+    (UNSPEC_PROTD		186)
+    (UNSPEC_PROTQ		187)

into UNSPEC_PROT, where their input operands would be SSEMODEI. Having
all input operands in V2DI mode is not acceptable.
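
As a rough sketch of what I mean (untested; "protsuffix", the pattern
name, the constraints and the output template are all made up for
illustration, only SSEMODEI and UNSPEC_PROT come from the discussion
above):

    ;; Sketch only: one unspec, one pattern, all four integer vector modes.
    (define_mode_attr protsuffix             ; hypothetical attribute
      [(V16QI "b") (V8HI "w") (V4SI "d") (V2DI "q")])

    (define_insn "sse5_prot<mode>_sketch"    ; hypothetical name
      [(set (match_operand:SSEMODEI 0 "register_operand" "=x")
            (unspec:SSEMODEI
              [(match_operand:SSEMODEI 1 "register_operand" "x")
               (match_operand:SSEMODEI 2 "nonimmediate_operand" "xm")]
              UNSPEC_PROT))]
      "TARGET_SSE5"
      "prot<protsuffix>\t{%2, %1, %0|%0, %1, %2}")  ; assumed template

This way each operand carries its real vector mode, and the four
PROTB/PROTW/PROTD/PROTQ patterns collapse into one.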

+ (define_expand "sse5_protd_imm"
+   [(set (match_operand:V2DI 0 "register_operand" "")
+ 	(rotate:V2DI (match_operand:V2DI 1 "nonimmediate_operand" "")
+ 		     (match_operand:SI 2 "const_0_to_31_operand" "n")))]
+   "TARGET_SSE5"
+ {
+   rtx op0 = gen_rtx_SUBREG (V4SImode, operands[0], 0);
+   rtx op1 = gen_rtx_SUBREG (V4SImode, operands[1], 0);
+
+   emit_insn (gen_rotlv4si3 (op0, op1, operands[2]));
+   DONE;
+ })

Why add a new expander that doesn't expand to new instructions? This
should be implemented in the SSE5 header.

+ (define_insn "sse5_pmacsdqh"
+   [(set (match_operand:V2DI 0 "register_operand" "=x,x,x")
+ 	(unspec:V2DI [(match_operand:V2DI 1 "nonimmediate_operand" "x,x,m")
+ 		      (match_operand:V2DI 2 "nonimmediate_operand" ",x,x")
+ 		      (match_operand:V2DI 3 "register_operand" "0,0,0")] UNSPEC_PMACSDQH))]

Ehm? Op2 constraints should be fixed...
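
The first alternative of operand 2 is empty, so it names no register
class at all. One plausible correction, assuming the intent was that at
most one of operands 1 and 2 may live in memory (this is my guess, not
something the patch states):

    ;; Sketch only: three alternatives for every operand.
    (match_operand:V2DI 1 "nonimmediate_operand" "x,x,m")
    (match_operand:V2DI 2 "nonimmediate_operand" "x,m,x")
    (match_operand:V2DI 3 "register_operand" "0,0,0")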

As a general rule, please use macros wherever possible. I have a plan
to reorganize sse.md as soon as all the big changes (like SSE5 ;) get in.

--- gcc/config/i386/cpuid.h	2007-09-06 13:29:00.166796000 -0400
***************
*** 51,56 ****
--- 51,57 ----
  /* %ecx */
  #define bit_LAHF_LM	(1 << 0)
  #define bit_SSE4a	(1 << 6)
+ #define bit_SSE5	(1 << 11)

For now, please leave SSE5 out of driver-i386.c. Instead of passing
-msse5 to the compile flags from the driver, the driver should pass the
correct -march= that implements SSE5. It is better to have
"-march=whatever" instead of "-march=amdfam10 -msse5".

Regarding the tests, there are three important tests in the testsuite:
gcc.target/i386/sse-[12,13,14].c. Please update these tests to include
bmmintrin.h instead of ammintrin.h (adding -msse5 instead of -msse4a
to compile flags). These tests will check _all_ new code for
compilation problems at -O0 and -O2.

Uros.


