This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH], PR target/81593, Optimize PowerPC vector sets coming from a vector extracts
- From: Michael Meissner <meissner at linux dot vnet dot ibm dot com>
- To: Segher Boessenkool <segher at kernel dot crashing dot org>
- Cc: Michael Meissner <meissner at linux dot vnet dot ibm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, David Edelsohn <dje dot gcc at gmail dot com>, Bill Schmidt <wschmidt at linux dot vnet dot ibm dot com>
- Date: Mon, 31 Jul 2017 13:38:00 -0400
- Subject: Re: [PATCH], PR target/81593, Optimize PowerPC vector sets coming from a vector extracts
- Authentication-results: sourceware.org; auth=none
- References: <20170727232113.GA8723@ibm-tiger.the-meissners.org> <20170728210848.GC13471@gate.crashing.org>
On Fri, Jul 28, 2017 at 04:08:50PM -0500, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Jul 27, 2017 at 07:21:14PM -0400, Michael Meissner wrote:
> > This patches optimizes the PowerPC vector set operation for 64-bit doubles and
> > longs where the elements in the vector set may have been extracted from another
> > vector (PR target/81593):
>
> > * config/rs6000/rs6000.c (rs6000_emit_xxpermdi): New function to
> > emit XXPERMDI accessing either double word in either vector
> > register inputs.
>
> "emit" is not a good name for this: that is generally used for something
> that does emit_insn, i.e. put an insn in the instruction stream. This
> function returns a string a define_insn can return. For the rl* insns
> I called the similar functions rs6000_insn_for_*, maybe something like
> that is better here?
Yeah, I should have used rs6000_output_xxpermdi or similar (or output_xxpermdi,
etc.), which is what the other functions used.
> > +/* Emit a XXPERMDI instruction that can extract from either double word of the
> > + two arguments. ELEMENT1 and ELEMENT2 are either NULL or they are 0/1 giving
> > + which double word to be used for the operand. */
> > +
> > +const char *
> > +rs6000_emit_xxpermdi (rtx operands[], rtx element1, rtx element2)
> > +{
> > + int op1_dword = (!element1) ? 0 : INTVAL (element1);
> > + int op2_dword = (!element2) ? 0 : INTVAL (element2);
> > +
> > + gcc_assert (IN_RANGE (op1_dword | op2_dword, 0, 1));
> > +
> > + if (BYTES_BIG_ENDIAN)
> > + {
> > + operands[3] = GEN_INT (2*op1_dword + op2_dword);
> > + return "xxpermdi %x0,%x1,%x2,%3";
> > + }
> > + else
> > + {
> > + if (element1)
> > + op1_dword = 1 - op1_dword;
> > +
> > + if (element2)
> > + op2_dword = 1 - op2_dword;
> > +
> > + operands[3] = GEN_INT (op1_dword + 2*op2_dword);
> > + return "xxpermdi %x0,%x2,%x1,%3";
> > + }
> > +}
>
> I think calling this with the rtx elementN args makes this only more
> complicated (the function comment doesn't say what they are or what
> NULL means, btw).
Ok, let me think on it.
>
> > (define_insn "vsx_concat_<mode>"
> > - [(set (match_operand:VSX_D 0 "gpc_reg_operand" "=<VSa>,we")
> > + [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa,we")
> > (vec_concat:VSX_D
> > - (match_operand:<VS_scalar> 1 "gpc_reg_operand" "<VS_64reg>,b")
> > - (match_operand:<VS_scalar> 2 "gpc_reg_operand" "<VS_64reg>,b")))]
> > + (match_operand:<VS_scalar> 1 "gpc_reg_operand" "wa,b")
> > + (match_operand:<VS_scalar> 2 "gpc_reg_operand" "wa,b")))]
> > "VECTOR_MEM_VSX_P (<MODE>mode)"
> > {
> > if (which_alternative == 0)
> > - return (BYTES_BIG_ENDIAN
> > - ? "xxpermdi %x0,%x1,%x2,0"
> > - : "xxpermdi %x0,%x2,%x1,0");
> > + return rs6000_emit_xxpermdi (operands, NULL_RTX, NULL_RTX);
> >
> > else if (which_alternative == 1)
> > - return (BYTES_BIG_ENDIAN
> > + return (VECTOR_ELT_ORDER_BIG
> > ? "mtvsrdd %x0,%1,%2"
> > : "mtvsrdd %x0,%2,%1");
>
> This one could be
>
> {
> if (!BYTES_BIG_ENDIAN)
> std::swap (operands[1], operands[2]);
>
> switch (which_alternative)
> {
> case 0:
> return "xxpermdi %x0,%x1,%x2,0";
> case 1:
> return "mtvsrdd %x0,%1,%2";
> default:
> gcc_unreachable ();
> }
> }
> (Could/should we use xxmrghd instead? Do all supported assemblers know
> that extended mnemonic, is it actually more readable?)
For me no, xxpermdi is clearer. But if you want xxmrghd, I can do it.
> > --- gcc/testsuite/gcc.target/powerpc/vsx-extract-7.c (svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc/vsx-extract-7.c) (revision 0)
> > +++ gcc/testsuite/gcc.target/powerpc/vsx-extract-7.c (.../gcc/testsuite/gcc.target/powerpc/vsx-extract-7.c) (revision 250640)
> > @@ -0,0 +1,15 @@
> > +/* { dg-do compile { target { powerpc*-*-* } } } */
> > +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> > +/* { dg-require-effective-target powerpc_vsx_ok } */
> > +/* { dg-options "-O2 -mvsx" } */
> > +
> > +vector double
> > +test_vpasted (vector double high, vector double low)
> > +{
> > + vector double res;
> > + res[1] = high[1];
> > + res[0] = low[0];
> > + return res;
> > +}
> > +
> > +/* { dg-final { scan-assembler-times {\mxxpermdi\M} 1 } } */
>
> In this and the other testcase, should you test no other insns at all
> are generated?
It is kind of hard to test a negative, without trying to guess what possible
instructions could be generated.
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797