This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PATCH: Add SSE4.1 support
On Wed, Apr 18, 2007 at 10:56:39PM -0700, H. J. Lu wrote:
> None of those Yx constraints are documented. I will add them in
> a separate patch.
I'd just as soon *not* expose these Y constraints to users.
Except for *maybe* the new xmm0/xmm[^0], none of them make
any sense for user assembly.
> +(define_insn "sse4_1_blendpd"
> + [(set (match_operand:V2DF 0 "register_operand" "=x")
> + (unspec:V2DF [(match_operand:V2DF 1 "register_operand" "0")
> + (match_operand:V2DF 2 "nonimmediate_operand" "xm")
> + (match_operand:SI 3 "const_0_to_255_operand" "n")]
> + UNSPEC_BLEND))]
I don't see why we need UNSPEC_BLEND. This is exactly the
vec_select operation. You should model it as such, so that
the correct blend instruction is used instead of other
instances of vec_select patterns.
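To illustrate why no unspec should be needed: an immediate blend is a pure lane selection. Here is a rough scalar model in C, assuming the usual BLENDPD behaviour where bit i of the immediate takes lane i from the second source. The helper name blend2_model is made up for this sketch and is not part of the patch.

```c
#include <stddef.h>

/* Hypothetical scalar model of a two-lane immediate blend (not GCC code).
   Bit i of imm set: take lane i from b; bit clear: keep lane i from a. */
static void blend2_model(const double a[2], const double b[2],
                         unsigned imm, double out[2])
{
    for (size_t i = 0; i < 2; i++)
        out[i] = ((imm >> i) & 1u) ? b[i] : a[i];
}
```

With a = {1.0, 2.0}, b = {3.0, 4.0} and imm = 2, lane 0 comes from a and lane 1 from b. Nothing in that is opaque to the optimizers, which is the argument for a plain selection pattern.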
> +(define_insn "sse4_1_blendvpd"
> + [(set (match_operand:V2DF 0 "register_operand" "=Yn")
> + (unspec:V2DF [(match_operand:V2DF 1 "register_operand" "0")
> + (match_operand:V2DF 2 "nonimmediate_operand" "Ynm")
> + (match_operand:V2DF 3 "register_operand" "Y0")]
> + UNSPEC_BLENDV))]
This is probably ok, since the "correct" model would be
(vec_select:V2DF
(match_operand:V2DF 1 "register_operand" "0")
(match_operand:V2DF 2 "nonimmediate_operand" "Yn")
(unspec [(match_operand:V2DF 3 "register_operand" "Y0")]
UNSPEC_BLENDV))
You should be able to notice a constant value in the expansion of
__builtin_ia32_blendvp[ds] and emit the same thing as blendp[ds].
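A sketch of that constant folding, in C rather than in the expander itself: the variable blend looks only at the sign bit of each mask lane, so a constant mask reduces to an immediate. The function name blendv_mask_to_imm is hypothetical, and the real expander would of course work on RTL constants, not on a double array.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not GCC code): turn a constant two-lane mask
   into the immediate an equivalent blendpd would use.  Only the sign
   bit of each mask lane matters, so -0.0 counts as "set". */
static unsigned blendv_mask_to_imm(const double mask[2])
{
    unsigned imm = 0;
    for (size_t i = 0; i < 2; i++)
        if (signbit(mask[i]))
            imm |= 1u << i;
    return imm;
}
```

A mask of {-1.0, 0.0} gives immediate 1, so the builtin could emit the same pattern as the immediate form with that value.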
> +(define_insn "sse4_1_extractps"
> + [(set (match_operand:SI 0 "register_operand" "=rm")
> + (unspec:SI
> + [(vec_select:SF
> + (match_operand:V4SF 1 "register_operand" "x")
> + (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n")]))]
> + UNSPEC_EXTRACTPS))]
> + "TARGET_SSE4_1"
> + "extractps\t{%2, %1, %0|%0, %1, %2}"
> + [(set_attr "type" "sselog")
> + (set_attr "mode" "V4SF")])
Why SImode? You can certainly consider the value stored to
still be in SFmode... Why the unspec at all?
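For what it's worth, the extraction only copies 32 bits unchanged; whether you call the result SFmode or SImode is a question of labelling, not of value. A small C model of that, assuming IEEE 754 single precision (extract_lane_bits is a made-up name for this sketch):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model (not GCC code): return the raw 32 bits of one
   float lane.  The bits are copied verbatim; no conversion happens. */
static uint32_t extract_lane_bits(const float v[4], unsigned lane)
{
    uint32_t bits;
    memcpy(&bits, &v[lane & 3u], sizeof bits);
    return bits;
}
```

Extracting lane 1 of {0.0f, 1.0f, 2.0f, 3.0f} yields 0x3f800000, which is just the encoding of 1.0f.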
> +(define_insn "sse4_1_extendv4qiv4si2"
> + [(set (match_operand:V4SI 0 "register_operand" "=x")
> + (sign_extend:V4SI
> + (vec_select:V4QI
> + (match_operand:V16QI 1 "nonimmediate_operand" "xm")
> + (parallel [(const_int 0)
> + (const_int 1)
> + (const_int 2)
> + (const_int 3)]))))]
Doesn't the vectorizer already request a specific name for
some of these extension operations?
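For reference, the quoted pattern takes the low four bytes of the source and sign-extends each one to 32 bits. A scalar C sketch of just that semantics (sign_extend_low4 is a hypothetical name, and this says nothing about which standard pattern name the vectorizer would ask for):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model (not GCC code) of the quoted RTL: select
   elements 0..3 of a 16-byte vector and sign-extend each to 32 bits. */
static void sign_extend_low4(const int8_t src[16], int32_t dst[4])
{
    for (size_t i = 0; i < 4; i++)
        dst[i] = (int32_t)src[i];
}
```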
r~