This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PATCH: Add SSE4.1 support


On Wed, Apr 18, 2007 at 10:56:39PM -0700, H. J. Lu wrote:
> None of those Yx constraints are documented. I will add them in
> a separate patch.

I'd just as soon *not* expose these Y constraints to users.
Except for *maybe* the new xmm0/xmm[^0], none of them make
any sense for user assembly.

> +(define_insn "sse4_1_blendpd"
> +  [(set (match_operand:V2DF 0 "register_operand" "=x")
> +        (unspec:V2DF [(match_operand:V2DF 1 "register_operand" "0")
> +        	      (match_operand:V2DF 2 "nonimmediate_operand" "xm")
> +                      (match_operand:SI   3 "const_0_to_255_operand" "n")]
> +                     UNSPEC_BLEND))]

I don't see why we need UNSPEC_BLEND.  This is exactly the
vec_select operation.  You should model it as such, so that
the correct blend instruction is used instead of other 
instances of vec_select patterns.

> +(define_insn "sse4_1_blendvpd"
> +  [(set (match_operand:V2DF 0 "register_operand" "=Yn")
> +        (unspec:V2DF [(match_operand:V2DF 1 "register_operand"  "0")
> +        	      (match_operand:V2DF 2 "nonimmediate_operand" "Ynm")
> +                      (match_operand:V2DF 3 "register_operand" "Y0")]
> +                     UNSPEC_BLENDV))]

This is probably ok, since the "correct" model would be

	(vec_select:V2DF
	  (match_operand:V2DF 1 "register_operand" "0")
	  (match_operand:V2DF 2 "nonimmediate_operand" "Yn")
	  (unspec [(match_operand:V2DF 3 "register_operand" "Y0")]
		  UNSPEC_BLENDV))

You should be able to notice a constant value in the expansion of
__builtin_ia32_blendvp[ds] and emit the same thing as blendp[ds].
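
To make the constant-mask point concrete, here is a minimal scalar sketch
in C of the two blend forms. It is only an illustration: the v2df struct
and all function names are hypothetical, nothing here is GCC code, and it
assumes the usual semantics (bit i of the immediate selects lane i for
blendpd; the sign bit of mask lane i selects lane i for blendvpd).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a V2DF value: two double lanes. */
typedef struct { double lane[2]; } v2df;

/* blendpd-style blend: bit i of imm picks lane i from b (1) or a (0). */
static v2df blendpd_imm(v2df a, v2df b, unsigned imm)
{
    v2df r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = ((imm >> i) & 1) ? b.lane[i] : a.lane[i];
    return r;
}

/* Sign bit of a double, read from its bit pattern (so -0.0 counts). */
static unsigned sign_bit(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return (unsigned)(bits >> 63);
}

/* blendvpd-style blend: the sign bit of mask lane i picks lane i. */
static v2df blendvpd_var(v2df a, v2df b, v2df mask)
{
    v2df r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = sign_bit(mask.lane[i]) ? b.lane[i] : a.lane[i];
    return r;
}

/* If the mask is a compile-time constant, it folds to an immediate. */
static unsigned blendv_mask_to_imm(v2df mask)
{
    unsigned imm = 0;
    for (int i = 0; i < 2; i++)
        imm |= sign_bit(mask.lane[i]) << i;
    return imm;
}
```

With a known mask, blendvpd_var (a, b, mask) and
blendpd_imm (a, b, blendv_mask_to_imm (mask)) give the same lanes, which
is why the builtin expander could emit the immediate form in that case.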

> +(define_insn "sse4_1_extractps"
> +  [(set (match_operand:SI 0 "register_operand" "=rm")
> +	(unspec:SI
> +	  [(vec_select:SF
> +	     (match_operand:V4SF 1 "register_operand" "x")
> +	     (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n")]))]
> +	  UNSPEC_EXTRACTPS))]
> +  "TARGET_SSE4_1"
> +  "extractps\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "type" "sselog")
> +   (set_attr "mode" "V4SF")])

Why SImode?  You can certainly consider the stored value to
still be in SFmode.  And why the unspec at all?
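
As a rough illustration of why the mode is a matter of interpretation, the
C sketch below emulates the extraction in scalar code. The v4sf struct and
function names are hypothetical, and it assumes IEEE 754 single precision
and that only the low two bits of the selector are used.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a V4SF value: four float lanes. */
typedef struct { float lane[4]; } v4sf;

/* Copy the selected lane's 32 bits out unchanged, as an integer. */
static uint32_t extractps_bits(v4sf v, unsigned sel)
{
    uint32_t bits;
    float f = v.lane[sel & 3];        /* low two bits pick the lane */
    memcpy(&bits, &f, sizeof bits);   /* no conversion, just the bits */
    return bits;
}

/* Reinterpret those same 32 bits as a float again. */
static float bits_as_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The 32 bits that come out are the unmodified float lane, so viewing the
result as SFmode loses nothing; the integer view is just a reinterpretation.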

> +(define_insn "sse4_1_extendv4qiv4si2"
> +  [(set (match_operand:V4SI 0 "register_operand" "=x")
> +	(sign_extend:V4SI
> +	   (vec_select:V4QI
> +	      (match_operand:V16QI 1 "nonimmediate_operand" "xm")
> +	      (parallel [(const_int 0)
> +			 (const_int 1)
> +			 (const_int 2)
> +			 (const_int 3)]))))]

Doesn't the vectorizer already request a specific name for 
some of these extension operations?



r~

