This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH, rs6000] Fix variable permute control vectors for little endian


On Wed, Oct 9, 2013 at 7:11 PM, Bill Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
> Hi,
>
> This is a follow-up to the recent patch that fixed constant permute
> control vectors for little endian.  When the control vector is constant,
> we can adjust the constant and use a vperm without increasing code size.
> When the control vector is unknown, however, we have to generate two
> additional instructions to subtract each element of the control vector
> from 31 (equivalently, from -1, since only 5 bits are pertinent).  This
> patch adds the additional code generation.
>
> There are two main paths to the affected permutes:  via the known
> pattern vec_perm<mode>, and via an altivec builtin.  The builtin path
> causes a little difficulty because there's no way to dispatch a builtin
> to two different insns for BE and LE.  I solved this by adding two new
> unspecs for the builtins (UNSPEC_VPERM_X and UNSPEC_VPERM_UNS_X).  The
> insns for the builtins are changed from a define_insn to a
> define_insn_and_split.  We create the _X forms at expand time and later
> split them into the correct sequences for BE and LE, using the "real"
> UNSPEC_VPERM and UNSPEC_VPERM_UNS to generate the vperm instruction.
>
> For the path via the known pattern, I added a new routine in rs6000.c in
> similar fashion to the solution for the constant control vector case.
>
> When the permute control vector is a rotate vector loaded by lvsl or
> lvsr, we can generate the desired control vector more cheaply by simply
> changing to use the opposite instruction.  We are already doing that
> when expanding an unaligned load.  The changes in vector.md avoid
> undoing that effort by circumventing the subtract-from-splat (going
> straight to the UNSPEC_VPERM).
>
> I bootstrapped and tested this for big endian on
> powerpc64-unknown-linux-gnu with no new regressions.  I did the same for
> little endian on powerpc64le-unknown-linux-gnu.  Here the results were
> slightly mixed: the changes fix 32 test failures, but expose an
> unrelated bug in 9 other tests when -mvsx is permitted on LE (not
> currently allowed).  The bug is a missing permute for a vector load in
> the unaligned vector load logic; it will be fixed in a subsequent patch.
>
> Is this okay for trunk?
>
> Thanks,
> Bill
>
>
> 2013-10-09  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
>
>         * config/rs6000/vector.md (vec_realign_load<mode>): Generate vperm
>         directly to circumvent subtract from splat{31} workaround.
>         * config/rs6000/rs6000-protos.h (altivec_expand_vec_perm_le): New
>         prototype.
>         * config/rs6000/rs6000.c (altivec_expand_vec_perm_le): New.
>         * config/rs6000/altivec.md (define_c_enum "unspec"): Add
>         UNSPEC_VPERM_X and UNSPEC_VPERM_UNS_X.
>         (altivec_vperm_<mode>): Convert to define_insn_and_split to
>         separate big and little endian logic.
>         (*altivec_vperm_<mode>_internal): New define_insn.
>         (altivec_vperm_<mode>_uns): Convert to define_insn_and_split to
>         separate big and little endian logic.
>         (*altivec_vperm_<mode>_uns_internal): New define_insn.
>         (vec_permv16qi): Add little endian logic.

Okay.

Thanks, David
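
For readers following the thread, the subtract-from-31 adjustment described
in the quoted message can be modelled in plain C. This is only a sketch, not
the code from the patch: vectors are modelled as 16-byte arrays in hardware
(big-endian) byte order, all function names are hypothetical, and the swap of
the two input operands is an assumption of this model that the message itself
does not spell out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the hardware permute: each selector byte picks one byte from the
   32-byte concatenation x || y; only the low 5 bits of the selector matter. */
static void vperm_hw(const uint8_t x[16], const uint8_t y[16],
                     const uint8_t sel[16], uint8_t out[16])
{
    uint8_t cat[32];
    memcpy(cat, x, 16);
    memcpy(cat + 16, y, 16);
    for (int h = 0; h < 16; h++)
        out[h] = cat[sel[h] & 31];
}

/* Reference for little-endian element numbering: LE element i lives at
   hardware byte 15 - i, and selector values 0..15 pick from a, 16..31 from b. */
static void perm_le_reference(const uint8_t a[16], const uint8_t b[16],
                              const uint8_t c[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        unsigned k = c[15 - i] & 31;
        out[15 - i] = (k < 16) ? a[15 - k] : b[15 - (k - 16)];
    }
}

/* Same result using the hardware permute: subtract every selector byte from
   31 and (assumption of this sketch) swap the two input operands. */
static void perm_le_via_vperm(const uint8_t a[16], const uint8_t b[16],
                              const uint8_t c[16], uint8_t out[16])
{
    uint8_t adj[16];
    for (int h = 0; h < 16; h++)
        adj[h] = (uint8_t)(31 - c[h]);
    vperm_hw(b, a, adj, out);
}

/* Subtracting from 31 and subtracting from -1 agree in the low 5 bits for
   every possible selector byte; returns 1 if that holds for all 256 values. */
static int low5_equivalent(void)
{
    for (int x = 0; x < 256; x++) {
        uint8_t from_31 = (uint8_t)(31 - x) & 31;
        uint8_t from_m1 = (uint8_t)(-1 - x) & 31;
        if (from_31 != from_m1)
            return 0;
    }
    return 1;
}
```

Calling perm_le_reference and perm_le_via_vperm on the same inputs and
comparing the outputs with memcmp is a quick way to check the model; the
low-5-bit equivalence is why subtracting from -1 works as well as
subtracting from 31.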

