This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: SSE5 patches round 3


On Fri, Sep 07, 2007 at 08:43:12AM +0200, Uros Bizjak wrote:
> Hello!
> 
> +    (UNSPEC_SSE5_TRUEFALSE	153)
> +    (UNSPEC_PPERM		154)
> +    (UNSPEC_PERMPS		155)
> +    (UNSPEC_PERMPD		156)
> +    (UNSPEC_PMACSSWW		157)
> +    (UNSPEC_PMACSWW		158)
> +    (UNSPEC_PMACSSWD		159)
> +    (UNSPEC_PMACSWD		160)
> +    (UNSPEC_PMACSSDD		161)
> +    (UNSPEC_PMACSDD		162)
> +    (UNSPEC_PMACSSDQL		163)
> 
> Please do not use unspecs, unless you really can't describe
> instruction with existing RTL codes. The problem with uspecs is, that
> they hide all of the insn details, so various optimization passes
> (combine!) can't do nothing with it.

Over the weekend, I rewrote all of the integer multiply/add and horizontal add
instructions to use RTL encoding.  At the moment, I left the shifts as an
UNSPEC because the SSE5 shifts are different enough from the scalar shifts,
that I didn't want to perturb the rest of the compiler (SSE5 vector shift is a
left shift if the value is positive and right shift if it is negative).  We
don't have a logical left shift in RTL.  I anticipate that I will change this
eventually.

> There are plenty of examples in sse.md:
> 
> 1. horizontal add: "sse3_haddv4sf3"
> 2. permutations: "sse2_pshufd"
> 3. conversions: "sse_cvtsi2ss", "sse_cvtss2si"
> (you can also define scalar conversions as float and fix patterns in i386.md)
> 4. shifts: "ashl<mode>3", "vec_shl_<mode>"
> 
> Also, there is no need to have UNSPECS for diferent modes. One unspec
> is enough to describe the instruction in all modes. So, if there is no
> other way to describe insn with standart RTL expressions, these two
> should be combined:
> 
> +    (UNSPEC_PERMPS		155)
> +    (UNSPEC_PERMPD		156)
> 
> into
> 
> UNSPEC_PERM, and relevant pattern will have SSEMODEF inputs,

Done.

> as well as all of these (example):
> 
> +    (UNSPEC_PROTB		184)
> +    (UNSPEC_PROTW		185)
> +    (UNSPEC_PROTD		186)
> +    (UNSPEC_PROTQ		187)
> 
> into UNSPEC_PROT, where their input operands would be SSEMODEI. Having
> all input operands V2DI mode is not acceptable.

Recoded as RTL.

> + (define_expand "sse5_protd_imm"
> +   [(set (match_operand:V2DI 0 "register_operand" "")
> + 	(rotate:V2DI (match_operand:V2DI 1 "nonimmediate_operand" "")
> + 		     (match_operand:SI 2 "const_0_to_31_operand" "n")))]
> +   "TARGET_SSE5"
> + {
> +   rtx op0 = gen_rtx_SUBREG (V4SImode, operands[0], 0);
> +   rtx op1 = gen_rtx_SUBREG (V4SImode, operands[1], 0);
> +
> +   emit_insn (gen_rotlv4si3 (op0, op1, operands[2]));
> +   DONE;
> + })
> 
> Why new expander that doesn't expand to new instructions? This should
> be implemented in SSE5 header.

I went back and completely revamped the GNU intrinisics so it uses the proper
types, and used casts in the bmmintrin.h functions for the common intrinsics.

> + (define_insn "sse5_pmacsdqh"
> +   [(set (match_operand:V2DI 0 "register_operand" "=x,x,x")
> + 	(unspec:V2DI [(match_operand:V2DI 1 "nonimmediate_operand" "x,x,m")
> + 		      (match_operand:V2DI 2 "nonimmediate_operand" ",x,x")
> + 		      (match_operand:V2DI 3 "register_operand" "0,0,0")] UNSPEC_PMACSDQH))]
> 
> Ehm? Op2 constraints should be fixed...
> 
> As a general rule, please use macros wherever possible. I have a plan
> to reorganize SSE.md  as soon as all big changes (like SSE5 ;) get in.
> 
> --- gcc/config/i386/cpuid.h	2007-09-06 13:29:00.166796000 -0400
> ***************
> *** 51,56 ****
> --- 51,57 ----
>   /* %ecx */
>   #define bit_LAHF_LM	(1 << 0)
>   #define bit_SSE4a	(1 << 6)
> + #define bit_SSE5	(1 << 11)
> 
> For now, please leave SSE5 from driver-i386.c. Instead of passing
> -msse5 to compile flags from the driver, driver should pass correct
> -march= that implements SSE5. It is better to have "-march=whatever"
> instead of '-march=amdfam10 -msse5"
> 
> Regarding the tests, there are three important test in the testsuite:
> gcc.target/i386/sse-[12,13,14].c. Please update these tests to include
> bmmintrin.h instead of ammintrin.h (adding -msse5 instead of -msse4a
> to compile flags). These two tests will check _all_ new code for
> compilation problems in -O0 and  -O2.

I've defined the bit_SSE5 but took out the driver stuff for now.

This patch is the latest combined patch.  If I don't hear any strong
objections, I will check this in a few hours, and we can fix any problems in
subsequent patches.  I did a full boostrap build based on changes as they were
in the tree on Saturday.  I just rev'ed up to today's changes, fixed the merge
conflicts, and did a quick build, and I will do a full boostrap build/test
after sending this mail.

Thanks for doing the review.

-- 
Michael Meissner, AMD
90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
michael.meissner@amd.com

Attachment: sse5-gcc.patch03
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]