This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Add sparc VIS 2.0 builtins, intrinsics, and option to control them.
- From: Richard Henderson <rth at redhat dot com>
- To: David Miller <davem at davemloft dot net>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Mon, 03 Oct 2011 09:49:37 -0700
- Subject: Re: [PATCH] Add sparc VIS 2.0 builtins, intrinsics, and option to control them.
- References: <20110930.035938.868908024383419283.davem@davemloft.net> <4E862EB8.7020301@redhat.com> <20111003.012811.2081151756340496982.davem@davemloft.net>
On 10/02/2011 10:28 PM, David Miller wrote:
>> (set (reg:DI GSR_REG)
>> (unspec:DI [(match_dup 1) (match_dup 2) (reg:DI GSR_REG)]
>> UNSPEC_BMASK))
>
> Actually, can't we just use a (zero_extend:DI (plus:SI ...)) for the
> 32-bit case? It seems to work fine.
Sure.
>>> +(define_insn "bshuffle<V64I:mode>_vis"
>>> + [(set (match_operand:V64I 0 "register_operand" "=e")
>>> + (unspec:V64I [(match_operand:V64I 1 "register_operand" "e")
>>> + (match_operand:V64I 2 "register_operand" "e")]
>>> + UNSPEC_BSHUFFLE))
>>> + (use (reg:SI GSR_REG))]
>>
>> Better to push the use of the GSR_REG into the unspec, and not leave
>> it separate in the parallel.
>
> This is actually just a non-constant vec_merge, and even though the internals
> documentation says that the 'items' operand has to be a const_int, the compiler
> actually doesn't care.
Um, no it isn't.
The VEC_MERGE pattern uses N bits to select N elements from op0 and op1:
op0 = A B C D
op1 = W X Y Z
bmask = 0 1 0 1 = 5
result = A X C Z
Your insn doesn't use single bits for the select. It uses nibbles to
select from the 16 input bytes. It's akin to the VEC_SELECT pattern,
except that VEC_SELECT requires a constant input parallel.
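To make the difference concrete, here is a small C model of the two selection styles. It is only a sketch: the function names are made up for illustration, a set mask bit is assumed to pick the element from op1 (as in the example with mask bits 0 1 0 1), and the nibble order (most significant nibble of the selector controls result byte 0) is an assumption that should be checked against the VIS 2.0 documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Merge-style select (hypothetical model): one mask bit per element.
   The leftmost written bit corresponds to element 0, and a set bit is
   assumed to take the element from op1; otherwise op0 is used.  */
static void merge_select(const char *op0, const char *op1,
                         unsigned mask, int n, char *out)
{
    assert(n > 0 && n <= 32);
    for (int i = 0; i < n; i++) {
        unsigned bit = (mask >> (n - 1 - i)) & 1u;
        out[i] = bit ? op1[i] : op0[i];
    }
}

/* Shuffle-style select (hypothetical model): one 4-bit index per result
   byte.  Each index picks any of the 16 bytes in the concatenation a:b,
   so an element can move position, which merge_select can never do.
   Assumed order: the top nibble of sel controls out[0].  */
static void nibble_shuffle(const uint8_t a[8], const uint8_t b[8],
                           uint32_t sel, uint8_t out[8])
{
    for (int i = 0; i < 8; i++) {
        unsigned idx = (sel >> (28 - 4 * i)) & 0xFu;
        out[i] = idx < 8 ? a[idx] : b[idx - 8];
    }
}
```

A merge needs only N bits because element i can come only from position i of either input. The shuffle needs 4 bits per result byte because any of the 16 input bytes may land in any result position.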
---
You might have a look at the "Vector Shuffle" thread, where we've been
trying to provide builtin-level access to this feature. We've not added
an rtx-level code for this because so far there isn't *that* much in
common between the various cpus. They all seem to differ in niggling
details...
You'll have a somewhat harder time than i386 for this feature, given
that you've got to pack bytes into nibbles. But it can certainly be done.
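As a rough illustration of the packing step, the sketch below turns eight byte indices (one per result byte, each in the range 0..15) into a single 32-bit nibble selector. The helper name is hypothetical, and placing the first index in the most significant nibble is an assumption, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack eight byte indices, each 0..15, into one
   32-bit selector with idx[0] in the most significant nibble.  */
static uint32_t pack_selector(const uint8_t idx[8])
{
    uint32_t sel = 0;
    for (int i = 0; i < 8; i++) {
        assert(idx[i] < 16);        /* each index must fit in a nibble */
        sel = (sel << 4) | idx[i];  /* shift earlier nibbles up, append */
    }
    return sel;
}
```

With a constant selector this can be folded at compile time. A variable selector would need the equivalent shifts and ors emitted as instructions before the mask is written.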
r~