This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
On Fri, Sep 07, 2007 at 08:43:12AM +0200, Uros Bizjak wrote: > Hello! > > + (UNSPEC_SSE5_TRUEFALSE 153) > + (UNSPEC_PPERM 154) > + (UNSPEC_PERMPS 155) > + (UNSPEC_PERMPD 156) > + (UNSPEC_PMACSSWW 157) > + (UNSPEC_PMACSWW 158) > + (UNSPEC_PMACSSWD 159) > + (UNSPEC_PMACSWD 160) > + (UNSPEC_PMACSSDD 161) > + (UNSPEC_PMACSDD 162) > + (UNSPEC_PMACSSDQL 163) > > Please do not use unspecs, unless you really can't describe > instruction with existing RTL codes. The problem with uspecs is, that > they hide all of the insn details, so various optimization passes > (combine!) can't do nothing with it. Over the weekend, I rewrote all of the integer multiply/add and horizontal add instructions to use RTL encoding. At the moment, I left the shifts as an UNSPEC because the SSE5 shifts are different enough from the scalar shifts, that I didn't want to perturb the rest of the compiler (SSE5 vector shift is a left shift if the value is positive and right shift if it is negative). We don't have a logical left shift in RTL. I anticipate that I will change this eventually. > There are plenty of examples in sse.md: > > 1. horizontal add: "sse3_haddv4sf3" > 2. permutations: "sse2_pshufd" > 3. conversions: "sse_cvtsi2ss", "sse_cvtss2si" > (you can also define scalar conversions as float and fix patterns in i386.md) > 4. shifts: "ashl<mode>3", "vec_shl_<mode>" > > Also, there is no need to have UNSPECS for diferent modes. One unspec > is enough to describe the instruction in all modes. So, if there is no > other way to describe insn with standart RTL expressions, these two > should be combined: > > + (UNSPEC_PERMPS 155) > + (UNSPEC_PERMPD 156) > > into > > UNSPEC_PERM, and relevant pattern will have SSEMODEF inputs, Done. > as well as all of these (example): > > + (UNSPEC_PROTB 184) > + (UNSPEC_PROTW 185) > + (UNSPEC_PROTD 186) > + (UNSPEC_PROTQ 187) > > into UNSPEC_PROT, where their input operands would be SSEMODEI. Having > all input operands V2DI mode is not acceptable. Recoded as RTL. > + (define_expand "sse5_protd_imm" > + [(set (match_operand:V2DI 0 "register_operand" "") > + (rotate:V2DI (match_operand:V2DI 1 "nonimmediate_operand" "") > + (match_operand:SI 2 "const_0_to_31_operand" "n")))] > + "TARGET_SSE5" > + { > + rtx op0 = gen_rtx_SUBREG (V4SImode, operands[0], 0); > + rtx op1 = gen_rtx_SUBREG (V4SImode, operands[1], 0); > + > + emit_insn (gen_rotlv4si3 (op0, op1, operands[2])); > + DONE; > + }) > > Why new expander that doesn't expand to new instructions? This should > be implemented in SSE5 header. I went back and completely revamped the GNU intrinisics so it uses the proper types, and used casts in the bmmintrin.h functions for the common intrinsics. > + (define_insn "sse5_pmacsdqh" > + [(set (match_operand:V2DI 0 "register_operand" "=x,x,x") > + (unspec:V2DI [(match_operand:V2DI 1 "nonimmediate_operand" "x,x,m") > + (match_operand:V2DI 2 "nonimmediate_operand" ",x,x") > + (match_operand:V2DI 3 "register_operand" "0,0,0")] UNSPEC_PMACSDQH))] > > Ehm? Op2 constraints should be fixed... > > As a general rule, please use macros wherever possible. I have a plan > to reorganize SSE.md as soon as all big changes (like SSE5 ;) get in. > > --- gcc/config/i386/cpuid.h 2007-09-06 13:29:00.166796000 -0400 > *************** > *** 51,56 **** > --- 51,57 ---- > /* %ecx */ > #define bit_LAHF_LM (1 << 0) > #define bit_SSE4a (1 << 6) > + #define bit_SSE5 (1 << 11) > > For now, please leave SSE5 from driver-i386.c. Instead of passing > -msse5 to compile flags from the driver, driver should pass correct > -march= that implements SSE5. It is better to have "-march=whatever" > instead of '-march=amdfam10 -msse5" > > Regarding the tests, there are three important test in the testsuite: > gcc.target/i386/sse-[12,13,14].c. Please update these tests to include > bmmintrin.h instead of ammintrin.h (adding -msse5 instead of -msse4a > to compile flags). These two tests will check _all_ new code for > compilation problems in -O0 and -O2. I've defined the bit_SSE5 but took out the driver stuff for now. This patch is the latest combined patch. If I don't hear any strong objections, I will check this in a few hours, and we can fix any problems in subsequent patches. I did a full boostrap build based on changes as they were in the tree on Saturday. I just rev'ed up to today's changes, fixed the merge conflicts, and did a quick build, and I will do a full boostrap build/test after sending this mail. Thanks for doing the review. -- Michael Meissner, AMD 90 Central Street, MS 83-29, Boxborough, MA, 01719, USA michael.meissner@amd.com
Attachment:
sse5-gcc.patch03
Description: Text document
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |