This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Vector shuffling


On 08/15/2010 03:09 PM, Richard Guenther wrote:
> On the tree level we generally express target dependent features via
> builtins.  What tree code are you thinking of?  We have vector lowering
> for target unsupported stuff to allow optimizing - would that new tree code
> be target specific then (in that it appears only when target support
> is available)?
> 
> I think the hurdle to add a new tree code should be large - otherwise we'll
> just accumulate a mess.

In this case I think that a tree code would be best.

The problem is that the original shuffle is overloaded for all
vector types, which means that a single builtin function cannot
be type correct.  Since the user can define arbitrary vector
types, and __builtin_shuffle is supposed to be generic, we cannot
possibly pre-define all of the decls required.

(That Artem doesn't introduce such a tree code, and adds only
two builtins, suggests that his testing is incomplete, because
this really ought to have failed in verify_types_in_gimple_stmt
somewhere.)

The big question is the type on which to define the permutation
vector.  While it is logical to use an integral vector of the
same width as the output vector, the variable permutation case
for both x86 and powerpc would prefer to permute on bytes and
not the original element types.  Further, the constant permute
case for x86 would prefer to permute on the original element
types, and not have to re-interpret a byte permutation back into
the original element types.  If we always use byte permute, or
always use element permute, then we'll have duplicate code in
the backends to compensate.  Better to handle both forms in the
middle-end, and ask the backend which is preferred.

Which suggests something like

  VEC_PERM_ELT (V1, V2, EMASK)
  VEC_PERM_BYTE (V1, V2, BMASK)

where EMASK is element based indices and BMASK is byte based
indices.  A target hook would determine if VEC_PERM_ELT or
VEC_PERM_BYTE is preferred or possible for a given permutation.
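
To make the two forms concrete, here is a rough reference model in
plain C.  It is only a sketch, not GCC code: the function names are
hypothetical, and it assumes that mask indices select from the
concatenation of V1 and V2 (so 0..N-1 picks from V1 and N..2N-1 from
V2) and that out-of-range indices are simply rejected.  The proposal
above does not pin down either point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of VEC_PERM_ELT: output element I is copied from
   element EMASK[I] of the concatenation V1:V2.  Each vector holds
   NELTS elements of ELT_SIZE bytes.  OUT must not overlap V1 or V2.  */
static void
vec_perm_elt_ref (unsigned char *out, const unsigned char *v1,
                  const unsigned char *v2, const unsigned *emask,
                  size_t nelts, size_t elt_size)
{
  for (size_t i = 0; i < nelts; i++)
    {
      size_t idx = emask[i];
      /* Assumption: indices outside [0, 2*NELTS) are invalid.  */
      assert (idx < 2 * nelts);
      const unsigned char *src
        = idx < nelts ? v1 + idx * elt_size
                      : v2 + (idx - nelts) * elt_size;
      memcpy (out + i * elt_size, src, elt_size);
    }
}

/* Hypothetical model of VEC_PERM_BYTE: output byte I is copied from
   byte BMASK[I] of the concatenation V1:V2.  Each vector is NBYTES
   bytes wide.  OUT must not overlap V1 or V2.  */
static void
vec_perm_byte_ref (unsigned char *out, const unsigned char *v1,
                   const unsigned char *v2, const unsigned char *bmask,
                   size_t nbytes)
{
  for (size_t i = 0; i < nbytes; i++)
    {
      size_t idx = bmask[i];
      /* Assumption: indices outside [0, 2*NBYTES) are invalid.  */
      assert (idx < 2 * nbytes);
      out[i] = idx < nbytes ? v1[idx] : v2[idx - nbytes];
    }
}
```

With four 2-byte elements per vector, the element mask { 0, 4, 1, 5 }
and the byte mask { 0, 1, 8, 9, 2, 3, 10, 11 } describe the same
interleave in this model.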

Permutations originating from the user via __builtin_shuffle
would originally be represented as VEC_PERM_ELT, and would be
lowered to VEC_PERM_BYTE in tree-vect-generic.c as required by
the aforementioned target hook.
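
As a sketch of what that lowering amounts to, each element index
expands into ELT_SIZE consecutive byte indices.  The helper below is
hypothetical and written in plain C rather than against the tree
API.  It assumes indices refer to the concatenation of the two input
vectors, and that byte J of an output element comes from byte J of
the selected input element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: expand an element-based mask into the
   equivalent byte-based mask.  EMASK has NELTS entries, each in
   [0, 2*NELTS); BMASK receives NELTS * ELT_SIZE entries.  */
static void
elt_mask_to_byte_mask (unsigned char *bmask, const unsigned *emask,
                       size_t nelts, size_t elt_size)
{
  /* Every byte index must fit in an unsigned char.  */
  assert (2 * nelts * elt_size <= 256);

  for (size_t i = 0; i < nelts; i++)
    {
      /* Assumption: out-of-range element indices are invalid.  */
      assert (emask[i] < 2 * nelts);
      for (size_t j = 0; j < elt_size; j++)
        bmask[i * elt_size + j]
          = (unsigned char) (emask[i] * elt_size + j);
    }
}
```

For example, with 2-byte elements the element mask { 0, 4, 1, 5 }
becomes the byte mask { 0, 1, 8, 9, 2, 3, 10, 11 }.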

Permutations originating from the vectorizer would be in the
desired form to begin with.  Given that it can already handle
relatively arbitrary MASK_TYPE, it should not be difficult to
modify the existing code to use the new tree codes.

This would allow quite a bit of cleanup in both the x86 and
the powerpc backends in my opinion.  At the moment we have an
ugly proliferation of target-specific permutation builtins
which serve no purpose except to satisfy type correctness.



r~

