This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH, 2 of 4], Enhance PowerPC vec_extract support for power8/power9 machines


On Thu, Jul 28, 2016 at 04:57:53AM -0500, Segher Boessenkool wrote:
> On Wed, Jul 27, 2016 at 05:16:28PM -0400, Michael Meissner wrote:
> > 	* config/rs6000/vsx.md (UNSPEC_VSX_VSLO): New unspecs.
> > 	(UNSPEC_VSX_EXTRACT): Likewise.
> 
> "New unspec".

Thanks.

> > 	(VEC_EXTRACT_OPTIMIZE_P): New macro to say whether we can optmize
> > 	vec_extract on 64-bit ISA 2.07 systems and newer.
> 
> "optimize".

Thanks.

> > --- gcc/config/rs6000/rs6000-protos.h	(revision 238775)
> > +++ gcc/config/rs6000/rs6000-protos.h	(working copy)
> > @@ -62,6 +62,7 @@ extern void rs6000_expand_vector_init (r
> >  extern void paired_expand_vector_init (rtx, rtx);
> >  extern void rs6000_expand_vector_set (rtx, rtx, int);
> >  extern void rs6000_expand_vector_extract (rtx, rtx, rtx);
> > +extern void rs6000_split_vec_extract_var (rtx, rtx, rtx, rtx, rtx);
> >  extern bool altivec_expand_vec_perm_const (rtx op[4]);
> >  extern void altivec_expand_vec_perm_le (rtx op[4]);
> >  extern bool rs6000_expand_vec_perm_const (rtx op[4]);
> 
> This isn't in the changelog.

Yes it is.

2016-07-27  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000-protos.h (rs6000_split_vec_extract_var):
	New declaration.

> > +      /* For little endian, adjust element ordering.  For V2DI/V2DF, we can use
> > +	 an XOR, otherwise we need to subtract.  The shift amount is so VSLO
> > +	 will shift the element into the upper position (adding 3 to convert a
> > +	 byte shift into a bit shift). */
> 
> Two spaces after dot.

Thanks.

> > +;; Variable V2DI/V2DF extract
> > +(define_insn_and_split "vsx_extract_<mode>_var"
> > +  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v")
> > +	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v")
> > +			     (match_operand:DI 2 "gpc_reg_operand" "r")]
> > +			    UNSPEC_VSX_EXTRACT))
> > +   (clobber (match_scratch:DI 3 "=r"))
> > +   (clobber (match_scratch:V2DI 4 "=&v"))]
> > +  "VECTOR_MEM_VSX_P (<MODE>mode) && VEC_EXTRACT_OPTIMIZE_P"
> > +  "#"
> > +  "&& reload_completed"
> > +  [(const_int 0)]
> > +{
> > +  rs6000_split_vec_extract_var (operands[0], operands[1], operands[2],
> > +				operands[3], operands[4]);
> > +  DONE;
> > +})
> 
> Why reload_completed?  Can it not run earlier?

This patch can perhaps run earlier, but the next patch that adds optimizing
memory references cannot be.

In order to change memory addresses, I need to know exactly which register set
(GPR, traditional floating point, and traditional Altivec register), and what
address modes they support.  Keeping it until after reload will also allow for
some flexibility if the vector was not in register, it can just access the
memory value, instead of forcing it to be in a register.

> > +/* Macro to say whether we can optimize vector extracts.  */
> > +#define VEC_EXTRACT_OPTIMIZE_P	(TARGET_DIRECT_MOVE			\
> > +				 && TARGET_POWERPC64			\
> > +				 && TARGET_UPPER_REGS_DI)
> 
> I'm not a big fan of this name.  "Optimize" will quickly become dated,
> everyone will take the current new hot thing for granted, and then when
> you want to optimise even more (say, for ISA 6.0 or whatever) the name
> is really out of place.
> 
> But I don't know a much better name either.

I changed it to TARGET_DIRECT_MOVE_64BIT, which hopefully is clearer of what
exactly we need.  In particular, the calculation of the bit shift is done in
the GPR and direct move creates teh vector used by VSLO to do a byte shift.

> > --- gcc/testsuite/gcc.target/powerpc/vec-extract-1.c	(revision 0)
> > +++ gcc/testsuite/gcc.target/powerpc/vec-extract-1.c	(revision 0)
> > @@ -0,0 +1,27 @@
> > +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
> 
> Maybe you can add a "run" test as well?

I added the run tests on July 21st that explicitly checks for just about every
combination that is being optimized by these paches to make sure it generates
the correct code.
 
> Looks good otherwise, okay for trunk with those nits fixed.

Here is the revised patch that I will check in after the tests are rerun:

[gcc, patch #2]
2016-07-28  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000-protos.h (rs6000_split_vec_extract_var):
	New declaration.
	* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
	Add support for vec_extract of vector double or vector long having
	a variable element number on 64-bit ISA 2.07 systems or newer.
	* config/rs6000/rs6000.c (rs6000_expand_vector_extract):
	Likewise.
	(rs6000_split_vec_extract_var): New function to split a
	vec_extract built-in function with variable element number.
	(rtx_is_swappable_p): Variable vec_extracts and shifts are not
	swappable.
	* config/rs6000/vsx.md (UNSPEC_VSX_VSLO): New unspec.
	(UNSPEC_VSX_EXTRACT): Likewise.
	(vsx_extract_<mode>, VSX_D iterator): Fix constraints to allow
	direct move instructions to be generated on 64-bit ISA 2.07
	systems and newer, and to take advantage of the ISA 3.0 MFVSRLD
	instruction.
	(vsx_vslo_<mode>): New insn to do VSLO on V2DFmode and V2DImode
	arguments for vec_extract variable element.
	(vsx_extract_<mode>_var, VSX_D iterator): New insn to support
	vec_extract with variable element on V2DFmode and V2DImode
	vectors.
	* config/rs6000/rs6000.h (TARGET_VEXTRACTUB): Remove
	-mupper-regs-df requirement, since it isn't needed.
	(TARGET_DIRECT_MOVE_64BIT): New macro to say whether we can
	do direct moves on 64-bit systems, which allows optimization of
	vec_extract on 64-bit ISA 2.07 systems and newer.

[gcc/testsuite, patch #2]
2016-07-28  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/vec-extract-1.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Attachment: gcc-stage7.extract005b
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]