This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, rs6000] Be careful with special permute masks for little endian
- From: Bill Schmidt <wschmidt at linux dot vnet dot ibm dot com>
- To: David Edelsohn <dje dot gcc at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 21 Oct 2013 15:35:11 -0500
- Subject: Re: [PATCH, rs6000] Be careful with special permute masks for little endian
- Authentication-results: sourceware.org; auth=none
- References: <1382359742 dot 6275 dot 162 dot camel at gnopaine> <CAGWvnyn1Z2drQV2g=+1fO0bwY3zxT19zrErjJnRg7vMUievTXA at mail dot gmail dot com> <1382373594 dot 6275 dot 164 dot camel at gnopaine>
Please hold off reviewing this. I see at least one testcase that will
have to be modified (expected code generation pattern will be different
for LE vs. BE). I'll resubmit the whole thing later today.
Thanks,
Bill
On Mon, 2013-10-21 at 11:39 -0500, Bill Schmidt wrote:
> Whoops, looks like I missed some simpler cases (REG with the wrong mode
> instead of SUBREG with the wrong mode). Is this revised version ok,
> assuming it passes testing? It should fix a few more test cases.
>
> The changed code from the previous version is in the last hunk.
>
> Thanks,
> Bill
>
>
> Index: gcc/config/rs6000/rs6000.c
> ===================================================================
> --- gcc/config/rs6000/rs6000.c (revision 203792)
> +++ gcc/config/rs6000/rs6000.c (working copy)
> @@ -28837,17 +28838,23 @@ altivec_expand_vec_perm_const (rtx operands[4])
> { 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },
> { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum,
> { 2, 3, 6, 7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },
> - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghb,
> + { OPTION_MASK_ALTIVEC,
> + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb : CODE_FOR_altivec_vmrglb,
> { 0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23 } },
> - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghh,
> + { OPTION_MASK_ALTIVEC,
> + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh : CODE_FOR_altivec_vmrglh,
> { 0, 1, 16, 17, 2, 3, 18, 19, 4, 5, 20, 21, 6, 7, 22, 23 } },
> - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghw,
> + { OPTION_MASK_ALTIVEC,
> + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw : CODE_FOR_altivec_vmrglw,
> { 0, 1, 2, 3, 16, 17, 18, 19, 4, 5, 6, 7, 20, 21, 22, 23 } },
> - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglb,
> + { OPTION_MASK_ALTIVEC,
> + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb : CODE_FOR_altivec_vmrghb,
> { 8, 24, 9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } },
> - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglh,
> + { OPTION_MASK_ALTIVEC,
> + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh : CODE_FOR_altivec_vmrghh,
> { 8, 9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } },
> - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglw,
> + { OPTION_MASK_ALTIVEC,
> + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw : CODE_FOR_altivec_vmrghw,
> { 8, 9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
> { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew,
> { 0, 1, 2, 3, 16, 17, 18, 19, 8, 9, 10, 11, 24, 25, 26, 27 } },
> @@ -28980,6 +28987,26 @@ altivec_expand_vec_perm_const (rtx operands[4])
> enum machine_mode omode = insn_data[icode].operand[0].mode;
> enum machine_mode imode = insn_data[icode].operand[1].mode;
>
> + /* For little-endian, don't use vpkuwum and vpkuhum if the
> + underlying vector type is not V4SI and V8HI, respectively.
> + For example, using vpkuwum with a V8HI picks up the even
> + halfwords (BE numbering) when the even halfwords (LE
> + numbering) are what we need. */
> + if (!BYTES_BIG_ENDIAN
> + && icode == CODE_FOR_altivec_vpkuwum
> + && ((GET_CODE (op0) == REG
> + && GET_MODE (op0) != V4SImode)
> + || (GET_CODE (op0) == SUBREG
> + && GET_MODE (XEXP (op0, 0)) != V4SImode)))
> + continue;
> + if (!BYTES_BIG_ENDIAN
> + && icode == CODE_FOR_altivec_vpkuhum
> + && ((GET_CODE (op0) == REG
> + && GET_MODE (op0) != V8HImode)
> + || (GET_CODE (op0) == SUBREG
> + && GET_MODE (XEXP (op0, 0)) != V8HImode)))
> + continue;
> +
> /* For little-endian, the two input operands must be swapped
> (or swapped back) to ensure proper right-to-left numbering
> from 0 to 2N-1. */
>
>
> On Mon, 2013-10-21 at 10:02 -0400, David Edelsohn wrote:
> > On Mon, Oct 21, 2013 at 8:49 AM, Bill Schmidt
> > <wschmidt@linux.vnet.ibm.com> wrote:
> > > Hi,
> > >
> > > In altivec_expand_vec_perm_const, we look for special masks that match
> > > the behavior of specific instructions, so we can use those instructions
> > > rather than load a constant control vector and perform a permute. Some
> > > of the masks must be treated differently for little endian mode.
> > >
> > > The masks that represent merge-high and merge-low operations have
> > > reversed meanings in little-endian, because of the reversed ordering of
> > > the vector elements.
> > >
> > > The masks that represent vector-pack operations remain correct when the
> > > mode of the input operands matches the natural mode of the instruction,
> > > but not otherwise. This is because the pack instructions always select
> > > the rightmost, low-order bits of the vector element. There are cases
> > > where we use this, for example, with a V8SI vector matching a vpkuwum
> > > mask in order to select the odd-numbered elements of the vector. In
> > > little endian mode, this instruction will get us the even-numbered
> > > elements instead. There is no alternative instruction with the desired
> > > behavior, so I've just disabled use of those masks for little endian
> > > when the mode isn't natural.
> > >
> > > These changes fix 32 failures in the test suite for little endian mode.
> > > Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new
> > > failures. Is this ok for trunk?
> > >
> > > Thanks,
> > > Bill
> > >
> > >
> > > 2013-10-21 Bill Schmidt <wschmidt@vnet.ibm.com>
> > >
> > > * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Reverse
> > > meaning of merge-high and merge-low masks for little endian; avoid
> > > use of vector-pack masks for little endian for mismatched modes.
> >
> > Okay.
> >
> > Thanks, David
> >