This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH #2] Add PowerPC ISA 3.0 word splat and byte immediate splat support


On Wed, May 18, 2016 at 07:15:10AM -0500, Segher Boessenkool wrote:
> On Tue, May 17, 2016 at 06:45:49PM -0400, Michael Meissner wrote:
> > As I mentioned in the last message, my previous patch had some problems that
> > showed up on big endian systems, using RELOAD (one of the tests that failed was
> > the vshuf-v32qi.c test in the testsuite).  Little endian and IRA did compiled
> > the test fine.  This patch fixes the problem.  I went over the alternatives in
> > the vsx_mov<mode>_{32bit,64bit} patterns, and I removed the '*' constraints,
> > and checked all of the other constraints.
> 
> So those * is the only change?

In this last patch, I started with the constraints from the current
vsx_mov<mode> before the patch.  As documented in the ChangeLog, I made the
following changes:

    1)	I collapsed the <VSr>,?<VSa> cases to just <VSa> (i.e. eliminating the
	idea in the move of a favored register constraint compared).

    2)	Add XXSPLTIB support.

    3)	Prefer using VSPLTIWS for creating 0/-1 over XXLXOR/XXLORC.

> 
> > The patches bootstrap and pass regression tests on both little endian power8
> > and big endian power7 systems.  Are these patches ok to install in the trunk?
> 
> This patch is okay for trunk.
> 
> > After a burn-in period, will they be ok to back port to the GCC 6.2 branch?
> 
> Yes.  Please make sure you test it there as well (BE/LE, p7/p8).

Yes.

> A few nits still:
> 
> > +(define_predicate "xxspltib_constant_split"
> > +  (match_code "const_vector,vec_duplicate,const_int")
> > +{
> > +  int value = 256;
> > +  int num_insns	= -1;
> 
> Still has a tab character.

Fixed.

> > +(define_predicate "xxspltib_constant_nosplit"
> > +  (match_code "const_vector,vec_duplicate,const_int")
> > +{
> > +  int value = 256;
> > +  int num_insns	= -1;
> 
> And here.

Fixed.

> > @@ -1024,6 +1068,10 @@ (define_predicate "splat_input_operand"
> >  	mode = V2DFmode;
> >        else if (mode == DImode)
> >  	mode = V2DImode;
> > +      else if (mode == SImode && TARGET_P9_VECTOR)
> > +	mode = V4SImode;	
> > +      else if (mode == SFmode && TARGET_P9_VECTOR)
> > +	mode = V4SFmode;	
> 
> Trailing tabs (twice).

Fixed.

> > +;;		VSX store  VSX load   VSX move  VSX->GPR   GPR->VSX    LQ (GPR)
> > +;;              STQ (GPR)  GPR load   GPR store GPR move   XXSPLTIB    VSPLTISW
> > +;;		VSX 0/-1   GPR 0/-1   VMX const GPR const  LVX (VMX)   STVX (VMX)
> > +(define_insn "*vsx_mov<mode>_64bit"
> > +  [(set (match_operand:VSX_M 0 "nonimmediate_operand"
> > +               "=ZwO,      <VSa>,     <VSa>,     r,         we,        ?wQ,
> > +		?&r,       ??r,       ??Y,       ??r,       wo,        v,
> > +		?<VSa>,    *r,        v,         ??r,       wZ,        v")
> > +
> > +	(match_operand:VSX_M 1 "input_operand" 
> > +               "<VSa>,     ZwO,       <VSa>,     we,        r,         r,
> > +		wQ,        Y,         r,         r,         wE,        jwM,
> > +		?jwM,      jwM,       W,         W,         v,         wZ"))]
> > +
> > +  "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (<MODE>mode)
> > +   && (register_operand (operands[0], <MODE>mode) 
> > +       || register_operand (operands[1], <MODE>mode))"
> > +{
> > +  return rs6000_output_move_128bit (operands);
> > +}
> > +  [(set_attr "type"
> > +               "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
> > +		store,     load,      store,     *,         vecsimple, vecsimple,
> > +		vecsimple, *,         *,         *,         vecstore,  vecload")
> > +
> > +   (set_attr "length"
> > +               "4,         4,         4,         8,         4,         8,
> > +		8,         8,         8,         8,         4,         4,
> > +		4,         8,         20,        20,        4,         4")])
> 
> Some of these lines are indented with spaces instead of tabs, please fix.
> Looks great otherwise, thanks!

Fixed.

> > +;; V4SI splat (ISA 3.0)
> > +;; When SI's are allowed in VSX registers, add XXSPLTW support
> > +(define_expand "vsx_splat_<mode>"
> > +  [(set (match_operand:VSX_W 0 "vsx_register_operand" "")
> > +	(vec_duplicate:VSX_W
> > +	 (match_operand:<VS_scalar> 1 "splat_input_operand" "")))]
> 
> You can leave off the default arg "" nowadays.  I know we're not consistent
> in that.

I tend to be conservative, since I do have to backport patches to older
branches.

Here is the patch that I checked in subversion id 236394:

[gcc]
2016-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/70915
	* config/rs6000/constraints.md (wE constraint): New constraint
	for a vector constant that can be loaded with XXSPLTIB.
	(wM constraint): New constraint for a vector constant of a 1's.
	(wS constraint): New constraint for a vector constant that can be
	loaded with XXSPLTIB and a vector sign extend instruction.
	* config/rs6000/predicates.md (xxspltib_constant_split): New
	predicates for wE/wS constraints.
	(xxspltib_constant_nosplit): Likewise.
	(easy_vector_constant): Add support for constants that can be
	loaded via XXSPLTIB.
	(all_ones_constant): New predicate for vector constant with all
	1's set.
	(splat_input_operand): Add support for ISA 3.0 word splat
	operations.
	* config/rs6000/rs6000.c (xxspltib_constant_p): New function to
	return if a constant can be loaded with the ISA 3.0 XXSPLTIB
	instruction and possibly with a sign extension.
	(output_vec_const_move): Add support for XXSPLTIB. If we are
	loading up 0/-1 into Altivec registers, prefer using VSPLTISW
	instead of XXLXOR/XXLORC.
	(rs6000_expand_vector_init): Add support for ISA 3.0 word splat
	operations.
	(rs6000_legitimize_reload_address): Likewise.
	(rs6000_output_move_128bit): Use output_vec_const_move to emit
	constants.
	* config/rs6000/vsx.md (VSX_M): Add TImode (if -mvsx-timode) and
	combine VSX_M and VSX_M2 into one iterator.
	(VSX_M2): Likewise.
	(VSINT_84): New iterators for loading constants with XXSPLTIB.
	(VSINT_842): Likewise.
	(UNSPEC_VSX_SIGN_EXTEND): New UNSPEC.
	(xxspltib_v16qi): New insns to load up constants with the ISA 3.0
	XXSPLTIB instruction.
	(xxspltib_<mode>_nosplit): Likewise.
	(xxspltib_<mode>_split): New insn to load up constants with
	XXSPLTIB and a sign extend instruction.
	(vsx_mov<mode>): Replace single move that handled all vector types
	with separate 32-bit and 64-bit moves.  Combine the movti_<bit>
	moves (when -mvsx-timode is in effect) into the main vector
	moves.  Eliminate separate moves for <VSr> <VSa>, where the
	preferred register class (<VSr>) is listed first, and the
	secondary register class (<VSa>) is listed second with a '?' to
	discourage use.  Prefer loading 0/-1 in any VSX register for ISA
	3.0, and Altivec registers for ISA 2.06/2.07 (PR target/70915) so
	that if the register was involved in a slow operation, the
	clear/set operation does not wait for the slow operation to
	finish.  Adjust the length attributes for 32-bit mode.  Use
	rs6000_output_move_128bit and drop the use of the string
	instructions for 32-bit movti when -mvsx-timode is in effect.  Use
	spacing so that the alternatives and attributes don't generate
	long lines, and put things in columns, so that it is easier to
	match up the operands and attributes with the insn alternatives.
	(vsx_mov<mode>_64bit): Likewise.
	(vsx_mov<mode>_32bit): Likewise.
	(vsx_movti_64bit): Fold movti into normal vector moves.
	(vsx_movti_32bit): Likewise.
	(vsx_splat_<mode>, V4SI/V4SF modes): Add support for ISA 3.0 word
	spat instructions.
	(vsx_splat_v4si_internal): Likewise.
	(vsx_splat_v4sf_internal): Likewise.
	(vector fusion peepholes): Use VSX_M instead of VSX_M2.
	(vsx_sign_extend_qi_<mode>): New ISA 3.0 instructions to sign
	extend vector elements.
	(vsx_sign_extend_hi_<mode>): Likewise.
	(vsx_sign_extend_si_v2di): Likewise.
	* config/rs6000/rs6000-protos.h (xxspltib_constant_p): Add
	declaration.
	* doc/md.texi (PowerPC constraints): Document the wE, wM, and wS
	constraints.  Add trailing period to wL documentation.

[gcc/testsuite]
2016-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p9-splat-1.c: New tests for ISA 3.0 word
	splat operations and the XXSPLTIB instruction.
	* gcc.target/powerpc/p9-splat-2.c: Likewise.
	* gcc.target/powerpc/p9-splat-3.c: Likewise.
	* gcc.target/powerpc/pr47755.c: Allow vspltisw in addition to
	xxlxor to clear a register.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Attachment: gcc-stage7.splat004b
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]