This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Vector Shuffle plans


On Mon, Oct 3, 2011 at 7:07 PM, Richard Henderson <rth@redhat.com> wrote:
> On 10/03/2011 10:42 AM, David Miller wrote:
>>> You might have a look at the "Vector Shuffle" thread, where we've been
>>> trying to provide builtin-level access to this feature. We've not added
>>> an rtx-level code for this because so far there isn't *that* much in
>>> common between the various cpus. They all seem to differ in niggling
>>> details...
>>>
>>> You'll have a somewhat harder time than i386 for this feature, given
>>> that you've got to pack bytes into nibbles. But it can certainly be done.
>>
>> Ok, I'll take a look.
>
> Oh, you should know that, at present, our generic shuffle support assumes
> that shuffles with a constant control (which are also generated by the
> vectorizer) get expanded to builtins. And as builtins we wind up with
> lots of them -- one per type.
>
> I'm going to start fixing that in the coming week.
>
> The vectorizer will be changed to emit VEC_SHUFFLE_EXPR. It will still use
> the target hook to see if the constant shuffle is supported.
>
> The lower-vector pass currently tests the target hook and swaps the
> VEC_SHUFFLE_EXPRs that are validate into builtins. That will be changed
> to simply leave them unchanged if the other target hook returns NULL.
> As the targets are updated to use vshuffle, the builtins get deleted
> to return NULL. After all targets are updated, we can remove this check
> and the target hook itself. ?This should preserve bisection on each of
> the affected targets.
>
> The rtl expander won't have to change.
>
> The target backends will need to accept an immediate for vshuffle op3,
> if anything special ought to be done for constant shuffles. In addition,
> the builtins should be removed, as previously noted.
>
>
> r~
>

I have a few orthogonal vector-shuffling issues to raise.

Currently, if vec_perm_ok returns false, we do not try to use a new
vshuffle routine. Would it make sense to implement that fallback? The
only potential problem I can see is a possible performance degradation,
which leads us to the second issue.
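
To make the performance concern concrete, here is a minimal sketch of what
a fully generic, element-by-element fallback amounts to when no vector
permute instruction is used. The helper name shuffle_bytes_scalar is
hypothetical and is not a GCC internal; taking the indices modulo the
vector length is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: shuffle N bytes one element
   at a time.  Indices are reduced modulo N (an assumption of this
   sketch), so N must be a non-zero power of two.  DST must not
   overlap SRC.  */
static void
shuffle_bytes_scalar (uint8_t *dst, const uint8_t *src,
                      const uint8_t *mask, size_t n)
{
  assert (n != 0 && (n & (n - 1)) == 0);

  for (size_t i = 0; i < n; i++)
    dst[i] = src[mask[i] & (n - 1)];
}
```

Each element costs a mask load, a data load and a store, which is why
such a fallback is likely to be slower than a single permute instruction
on targets that have one.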

When we perform vshuffle, we need to know whether it makes sense to use
pshufb (in the case of x86) or to move the data via standard non-SIMD
registers. Do we have this information in the current cost model? Also,
in certain cases, when the mask is constant, I would assume that moving
the data through memory is faster as well. For example, if the mask is
{4,5,6,7,0,1,2,3...}, then two integer moves should do a better job.
Were there any attempts to perform such an analysis? If not, should we
formalise the cases in which such a substitution would make sense?
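
As a rough sketch of the kind of analysis meant here, the check below
tests whether a constant byte mask only moves whole, aligned 32-bit words,
in which case the shuffle can be done with one integer-sized copy per
word. Both function names are hypothetical, and whether the word moves
actually beat pshufb is target-dependent and has not been measured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical check: return true if the byte MASK of length N only
   moves whole, aligned 32-bit words.  On success WORD_SEL[w] holds the
   source word index for destination word w.  */
static bool
mask_is_word_permutation (const uint8_t *mask, size_t n, uint8_t *word_sel)
{
  if (n % 4 != 0)
    return false;

  for (size_t w = 0; w < n / 4; w++)
    {
      uint8_t first = mask[4 * w];

      /* The word must start on a 4-byte boundary inside the vector.  */
      if (first % 4 != 0 || (size_t) first + 4 > n)
        return false;

      /* The remaining three bytes must follow consecutively.  */
      for (size_t b = 1; b < 4; b++)
        if (mask[4 * w + b] != first + b)
          return false;

      word_sel[w] = first / 4;
    }
  return true;
}

/* Perform the shuffle as one 4-byte copy per word.  DST must not
   overlap SRC.  */
static void
shuffle_by_words (uint8_t *dst, const uint8_t *src,
                  const uint8_t *word_sel, size_t n)
{
  assert (n % 4 == 0);

  for (size_t w = 0; w < n / 4; w++)
    memcpy (dst + 4 * w, src + 4 * word_sel[w], 4);
}
```

For the 8-byte mask {4,5,6,7,0,1,2,3} the check succeeds with the word
selection {1,0}, so the shuffle reduces to two 32-bit moves.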


Thanks,
Artem.

