Vector shuffling

Richard Guenther richard.guenther@gmail.com
Tue Aug 17 09:36:00 GMT 2010


On Mon, Aug 16, 2010 at 8:53 PM, Richard Henderson <rth@redhat.com> wrote:
> On 08/15/2010 03:09 PM, Richard Guenther wrote:
>> On the tree level we generally express target dependent features via
>> builtins.  What tree code are you thinking of?  We have vector lowering
>> for target unsupported stuff to allow optimizing - would that new tree code
>> be target specific then (in that it appears only when target support
>> is available)?
>>
>> I think the hurdle to add a new tree code should be large - otherwise we'll
>> just accumulate a mess.
>
> In this case I think that a tree code would be best.
>
> The problem is that the original shuffle is overloaded for all
> vector types.  Which means that a single builtin function cannot
> be type correct.  Since the user can define arbitrary vector
> types, and __builtin_shuffle is supposed to be generic, we cannot
> possibly pre-define all of the decls required.
>
> (That Artem doesn't introduce such a tree code, only two
> builtins, suggests that his testing is incomplete, because
> this really ought to have failed in verify_types_in_gimple_stmt
> somewhere.)

The C frontend pieces build new type-correct function decls at parsing
time.

> The big question is on which type to define the permutation
> vector.  While it is logical to use an integral vector of the
> same width as the output vector, the variable permutation case
> for both x86 and powerpc would prefer to permute on bytes and
> not the original element types.  Further, the constant permute
> case for x86 would prefer to permute on the original element
> types, and not have to re-interpret a byte permutation back into
> the original element types.  If we always use byte permute, or
> always use element permute, then we'll have duplicate code in
> the backends to compensate.  Better to handle both forms in the
> middle-end, and ask the backend which is preferred.
>
> Which suggests something like
>
>  VEC_PERM_ELT (V1, V2, EMASK)
>  VEC_PERM_BYTE (V1, V2, BMASK)
>
> where EMASK is element-based indices and BMASK is byte-based
> indices.  A target hook would determine if VEC_PERM_ELT or
> VEC_PERM_BYTE is preferred or possible for a given permutation.
>
> Permutations originating from the user via __builtin_shuffle
> would originally be represented as VEC_PERM_ELT, and would be
> lowered to VEC_PERM_BYTE in tree-vect-generic.c as required by
> the aforementioned target hook.

That sounds like a good idea.
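
To make the proposed lowering concrete, here is a minimal sketch in plain
C of how an element-based mask could be expanded into a byte-based one,
together with a scalar reference for the byte permute.  The function
names emask_to_bmask and perm_bytes are hypothetical and not part of any
patch.  The index convention (indices select from the concatenation of
V1 and V2, taken modulo twice the vector size) is an assumption, since
the thread does not pin it down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: expand an element-index mask (EMASK) into a
   byte-index mask (BMASK).  Element i of the result is built from the
   elt_size consecutive bytes of source element emask[i].  */
static void
emask_to_bmask (const uint8_t *emask, size_t nelts, size_t elt_size,
                uint8_t *bmask)
{
  assert (elt_size > 0);
  for (size_t i = 0; i < nelts; i++)
    for (size_t b = 0; b < elt_size; b++)
      bmask[i * elt_size + b] = (uint8_t) (emask[i] * elt_size + b);
}

/* Scalar reference for a byte permute.  ASSUMPTION: indices
   0 .. nbytes-1 select from v1 and nbytes .. 2*nbytes-1 select from
   v2; larger indices wrap modulo 2*nbytes.  */
static void
perm_bytes (const uint8_t *v1, const uint8_t *v2, const uint8_t *bmask,
            size_t nbytes, uint8_t *out)
{
  for (size_t i = 0; i < nbytes; i++)
    {
      size_t idx = bmask[i] % (2 * nbytes);
      out[i] = idx < nbytes ? v1[idx] : v2[idx - nbytes];
    }
}
```

Because the expansion copies bytes in memory order, the result does not
depend on the endianness of the elements.  A target that prefers element
permutes would simply skip the expansion step.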

Richard.


