This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH, rs6000] Be careful with special permute masks for little endian
- From: Bill Schmidt <wschmidt@linux.vnet.ibm.com>
- To: David Edelsohn <dje.gcc@gmail.com>
- Cc: GCC Patches <gcc-patches@gcc.gnu.org>
- Date: Mon, 21 Oct 2013 11:39:54 -0500
- Subject: Re: [PATCH, rs6000] Be careful with special permute masks for little endian
- Authentication-results: sourceware.org; auth=none
- References: <1382359742.6275.162.camel@gnopaine> <CAGWvnyn1Z2drQV2g=+1fO0bwY3zxT19zrErjJnRg7vMUievTXA@mail.gmail.com>
Whoops, looks like I missed some simpler cases (REG with the wrong mode
instead of SUBREG with the wrong mode). Is this revised version ok,
assuming it passes testing? It should fix a few more test cases.
The changed code from the previous version is in the last hunk.
Thanks,
Bill
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 203792)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -28837,17 +28838,23 @@ altivec_expand_vec_perm_const (rtx operands[4])
{ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },
{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum,
{ 2, 3, 6, 7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },
- { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghb,
+ { OPTION_MASK_ALTIVEC,
+ BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb : CODE_FOR_altivec_vmrglb,
{ 0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23 } },
- { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghh,
+ { OPTION_MASK_ALTIVEC,
+ BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh : CODE_FOR_altivec_vmrglh,
{ 0, 1, 16, 17, 2, 3, 18, 19, 4, 5, 20, 21, 6, 7, 22, 23 } },
- { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghw,
+ { OPTION_MASK_ALTIVEC,
+ BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw : CODE_FOR_altivec_vmrglw,
{ 0, 1, 2, 3, 16, 17, 18, 19, 4, 5, 6, 7, 20, 21, 22, 23 } },
- { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglb,
+ { OPTION_MASK_ALTIVEC,
+ BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb : CODE_FOR_altivec_vmrghb,
{ 8, 24, 9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } },
- { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglh,
+ { OPTION_MASK_ALTIVEC,
+ BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh : CODE_FOR_altivec_vmrghh,
{ 8, 9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } },
- { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglw,
+ { OPTION_MASK_ALTIVEC,
+ BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw : CODE_FOR_altivec_vmrghw,
{ 8, 9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
{ OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew,
{ 0, 1, 2, 3, 16, 17, 18, 19, 8, 9, 10, 11, 24, 25, 26, 27 } },
@@ -28980,6 +28987,26 @@ altivec_expand_vec_perm_const (rtx operands[4])
enum machine_mode omode = insn_data[icode].operand[0].mode;
enum machine_mode imode = insn_data[icode].operand[1].mode;
+ /* For little-endian, don't use vpkuwum and vpkuhum if the
+ underlying vector type is not V4SI and V8HI, respectively.
+ For example, using vpkuwum with a V8HI picks up the even
+ halfwords (BE numbering) when the even halfwords (LE
+ numbering) are what we need. */
+ if (!BYTES_BIG_ENDIAN
+ && icode == CODE_FOR_altivec_vpkuwum
+ && ((GET_CODE (op0) == REG
+ && GET_MODE (op0) != V4SImode)
+ || (GET_CODE (op0) == SUBREG
+ && GET_MODE (XEXP (op0, 0)) != V4SImode)))
+ continue;
+ if (!BYTES_BIG_ENDIAN
+ && icode == CODE_FOR_altivec_vpkuhum
+ && ((GET_CODE (op0) == REG
+ && GET_MODE (op0) != V8HImode)
+ || (GET_CODE (op0) == SUBREG
+ && GET_MODE (XEXP (op0, 0)) != V8HImode)))
+ continue;
+
/* For little-endian, the two input operands must be swapped
(or swapped back) to ensure proper right-to-left numbering
from 0 to 2N-1. */
On Mon, 2013-10-21 at 10:02 -0400, David Edelsohn wrote:
> On Mon, Oct 21, 2013 at 8:49 AM, Bill Schmidt
> <wschmidt@linux.vnet.ibm.com> wrote:
> > Hi,
> >
> > In altivec_expand_vec_perm_const, we look for special masks that match
> > the behavior of specific instructions, so we can use those instructions
> > rather than load a constant control vector and perform a permute. Some
> > of the masks must be treated differently for little endian mode.
> >
> > The masks that represent merge-high and merge-low operations have
> > reversed meanings in little-endian, because of the reversed ordering of
> > the vector elements.
> >
> > The masks that represent vector-pack operations remain correct when the
> > mode of the input operands matches the natural mode of the instruction,
> > but not otherwise. This is because the pack instructions always select
> > the rightmost, low-order bits of the vector element. There are cases
> > where we use this, for example, with a V8HI vector matching a vpkuwum
> > mask in order to select the odd-numbered elements of the vector. In
> > little endian mode, this instruction will get us the even-numbered
> > elements instead. There is no alternative instruction with the desired
> > behavior, so I've just disabled use of those masks for little endian
> > when the mode isn't natural.
> >
> > These changes fix 32 failures in the test suite for little endian mode.
> > Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new
> > failures. Is this ok for trunk?
> >
> > Thanks,
> > Bill
> >
> >
> > 2013-10-21 Bill Schmidt <wschmidt@vnet.ibm.com>
> >
> > * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Reverse
> > meaning of merge-high and merge-low masks for little endian; avoid
> > use of vector-pack masks for little endian for mismatched modes.
>
> Okay.
>
> Thanks, David
>