[PATCH 2/3] rs6000: Add MMA built-in function definitions

will schmidt will_schmidt@vnet.ibm.com
Mon Jun 15 22:43:54 GMT 2020


On Mon, 2020-06-15 at 14:58 -0500, Peter Bergner via Gcc-patches wrote:
> This patch adds the actual MMA built-ins.  The MMA accumulators are INOUT
> operands for most MMA instructions, but they are also very expensive to
> move around.  For this reason, we have implemented a built-in API where
> the accumulators are passed using pass-by-reference/pointers, so the user
> won't use one accumulator as input and another as output, which would
> entail a lot of copies.  However, using pointers gives us poor code
> generation when we expand the built-ins at normal expand time.  We
> therefore expand the MMA built-ins early into gimple, converting the
> pass-by-reference calls to an internal built-in that uses pass-by-value
> calling convention, where we can enforce the input and output accumulators
> are the same.  This gives us much better code generation.
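
To illustrate the lowering described above, here is a minimal portable C
sketch.  It is only a model under stated assumptions: acc_model_t, ger_pp
and ger_pp_internal are hypothetical names invented for this example and
are not part of the patch.  The real built-ins operate on the
__vector_quad type and need an MMA-capable target, so this uses a plain
4x4 float array as a stand-in for an accumulator and a rank-1 update as a
stand-in for an accumulating "ger" instruction.

```c
/* Hypothetical stand-in for a 512-bit accumulator: a 4x4 block of floats.
   Not the real __vector_quad type.  */
typedef struct { float v[4][4]; } acc_model_t;

/* Model of the internal pass-by-value form.  The accumulator comes in as
   a value and the updated value is returned, so the input and the output
   are by construction the same object.  */
static acc_model_t
ger_pp_internal (acc_model_t acc, const float a[4], const float b[4])
{
  for (int i = 0; i < 4; i++)
    for (int j = 0; j < 4; j++)
      acc.v[i][j] += a[i] * b[j];	/* accumulate the outer product */
  return acc;
}

/* Model of the user-facing pass-by-reference form.  The body shows what
   the early lowering amounts to: load through the pointer, call the
   pass-by-value form, store the result back through the same pointer.  */
static void
ger_pp (acc_model_t *acc, const float a[4], const float b[4])
{
  acc_model_t tmp = *acc;
  tmp = ger_pp_internal (tmp, a, b);
  *acc = tmp;
}
```

With this shape a caller can only name one accumulator per call, which is
the property the pointer-based API is meant to guarantee.
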
> 
> The associated test cases for these built-ins are in patch3.
> 
> This patch plus patch1 passed bootstrap and regtesting with no regressions
> on both powerpc64le-linux and powerpc64-linux.  Ok for trunk?
> 
> Peter
> 
> 2020-06-15  Peter Bergner  <bergner@linux.ibm.com>
> 
> gcc/
> 	* config/rs6000/predicates.md (mma_input_operand): New predicate.
> 	* config/rs6000/rs6000-builtin.def (BU_MMA_1, BU_MMA_V2, BU_MMA_3,
> 	BU_MMA_5, BU_MMA_6, BU_VSX_1): Add support macros for defining MMA
> 	built-in functions.
> 	(ASSEMBLE_ACC, ASSEMBLE_PAIR, DISASSEMBLE_ACC, DISASSEMBLE_PAIR,
> 	PMXVBF16GER2, PMXVBF16GER2NN, PMXVBF16GER2NP, PMXVBF16GER2PN,
> 	PMXVBF16GER2PP, PMXVF16GER2, PMXVF16GER2NN, PMXVF16GER2NP,
> 	PMXVF16GER2PN, PMXVF16GER2PP, PMXVF32GER, PMXVF32GERNN,
> 	PMXVF32GERNP, PMXVF32GERPN, PMXVF32GERPP, PMXVF64GER, PMXVF64GERNN,
> 	PMXVF64GERNP, PMXVF64GERPN, PMXVF64GERPP, PMXVI16GER2, PMXVI16GER2PP,
> 	PMXVI16GER2S, PMXVI16GER2SPP, PMXVI4GER8, PMXVI4GER8PP, PMXVI8GER4,
> 	PMXVI8GER4PP, PMXVI8GER4SPP, XVBF16GER2, XVBF16GER2NN, XVBF16GER2NP,
> 	XVBF16GER2PN, XVBF16GER2PP, XVCVBF16SP, XVCVSPBF16, XVF16GER2,
> 	XVF16GER2NN, XVF16GER2NP, XVF16GER2PN, XVF16GER2PP, XVF32GER,
> 	XVF32GERNN, XVF32GERNP, XVF32GERPN, XVF32GERPP, XVF64GER, XVF64GERNN,
> 	XVF64GERNP, XVF64GERPN, XVF64GERPP, XVI16GER2, XVI16GER2PP, XVI16GER2S,
> 	XVI16GER2SPP, XVI4GER8, XVI4GER8PP, XVI8GER4, XVI8GER4PP, XVI8GER4SPP,
> 	XXMFACC, XXMTACC, XXSETACCZ): Add MMA built-ins.

Checked these; all of them are present in the patch below.

> 	* config/rs6000/rs6000.c (rs6000_emit_move): Allow zero constants.
> 	(print_operand) <case 'A'>: New output modifier.
> 	(rs6000_split_multireg_move): Add support for inserting accumulator
> 	priming and depriming instructions.  Add support for splitting an
> 	assemble accumulator pattern.
> 	* config/rs6000/rs6000-call.c (mma_init_builtins, mma_expand_builtin,
> 	rs6000_gimple_fold_mma_builtin): New functions.
> 	(RS6000_BUILTIN_M): New macro.
> 	(def_builtin): Handle RS6000_BTC_QUAD and RS6000_BTC_PAIR attributes.
> 	(bdesc_mma): Add new MMA built-in support.
> 	(htm_expand_builtin): Use RS6000_BTC_OPND_MASK.
> 	(rs6000_invalid_builtin): Add handling of RS6000_BTM_FUTURE and
> 	RS6000_BTM_MMA.
> 	(rs6000_builtin_valid_without_lhs): Handle RS6000_BTC_VOID attribute.
> 	(rs6000_gimple_fold_builtin): Call rs6000_builtin_is_supported_p
> 	and rs6000_gimple_fold_mma_builtin.
> 	(rs6000_expand_builtin): Call mma_expand_builtin.
> 	Use RS6000_BTC_OPND_MASK.
> 	(rs6000_init_builtins): Adjust comment.  Call mma_init_builtins.
> 	(htm_init_builtins): Use RS6000_BTC_OPND_MASK.
> 	(builtin_function_type): Handle VSX_BUILTIN_XVCVSPBF16 and
> 	VSX_BUILTIN_XVCVBF16SP.
> 	* config/rs6000/rs6000.h (RS6000_BTC_QUINARY, RS6000_BTC_SENARY,
> 	RS6000_BTC_OPND_MASK, RS6000_BTC_QUAD, RS6000_BTC_PAIR,
> 	RS6000_BTC_QUADPAIR, RS6000_BTC_GIMPLE): New defines.
> 	(RS6000_BTC_PREDICATE, RS6000_BTC_ABS, RS6000_BTC_DST,
> 	RS6000_BTC_TYPE_MASK, RS6000_BTC_ATTR_MASK): Adjust values.
> 	* config/rs6000/mma.md (MAX_MMA_OPERANDS): New define_constant.
> 	(UNSPEC_MMA_ASSEMBLE_ACC, UNSPEC_MMA_PMXVBF16GER2,
> 	UNSPEC_MMA_PMXVBF16GER2NN, UNSPEC_MMA_PMXVBF16GER2NP,
> 	UNSPEC_MMA_PMXVBF16GER2PN, UNSPEC_MMA_PMXVBF16GER2PP,
> 	UNSPEC_MMA_PMXVF16GER2, UNSPEC_MMA_PMXVF16GER2NN,
> 	UNSPEC_MMA_PMXVF16GER2NP, UNSPEC_MMA_PMXVF16GER2PN,
> 	UNSPEC_MMA_PMXVF16GER2PP, UNSPEC_MMA_PMXVF32GER,
> 	UNSPEC_MMA_PMXVF32GERNN, UNSPEC_MMA_PMXVF32GERNP,
> 	UNSPEC_MMA_PMXVF32GERPN, UNSPEC_MMA_PMXVF32GERPP,
> 	UNSPEC_MMA_PMXVF64GER, UNSPEC_MMA_PMXVF64GERNN,
> 	UNSPEC_MMA_PMXVF64GERNP, UNSPEC_MMA_PMXVF64GERPN,
> 	UNSPEC_MMA_PMXVF64GERPP, UNSPEC_MMA_PMXVI16GER2,
> 	UNSPEC_MMA_PMXVI16GER2PP, UNSPEC_MMA_PMXVI16GER2S,
> 	UNSPEC_MMA_PMXVI16GER2SPP, UNSPEC_MMA_PMXVI4GER8,
> 	UNSPEC_MMA_PMXVI4GER8PP, UNSPEC_MMA_PMXVI8GER4,
> 	UNSPEC_MMA_PMXVI8GER4PP, UNSPEC_MMA_PMXVI8GER4SPP,
> 	UNSPEC_MMA_XVBF16GER2, UNSPEC_MMA_XVBF16GER2NN,
> 	UNSPEC_MMA_XVBF16GER2NP, UNSPEC_MMA_XVBF16GER2PN,
> 	UNSPEC_MMA_XVBF16GER2PP, UNSPEC_MMA_XVF16GER2, UNSPEC_MMA_XVF16GER2NN,
> 	UNSPEC_MMA_XVF16GER2NP, UNSPEC_MMA_XVF16GER2PN, UNSPEC_MMA_XVF16GER2PP,
> 	UNSPEC_MMA_XVF32GER, UNSPEC_MMA_XVF32GERNN, UNSPEC_MMA_XVF32GERNP,
> 	UNSPEC_MMA_XVF32GERPN, UNSPEC_MMA_XVF32GERPP, UNSPEC_MMA_XVF64GER,
> 	UNSPEC_MMA_XVF64GERNN, UNSPEC_MMA_XVF64GERNP, UNSPEC_MMA_XVF64GERPN,
> 	UNSPEC_MMA_XVF64GERPP, UNSPEC_MMA_XVI16GER2, UNSPEC_MMA_XVI16GER2PP,
> 	UNSPEC_MMA_XVI16GER2S, UNSPEC_MMA_XVI16GER2SPP, UNSPEC_MMA_XVI4GER8,
> 	UNSPEC_MMA_XVI4GER8PP, UNSPEC_MMA_XVI8GER4, UNSPEC_MMA_XVI8GER4PP,
> 	UNSPEC_MMA_XVI8GER4SPP, UNSPEC_MMA_XXMFACC, UNSPEC_MMA_XXMTACC): New.

OK.

> 	(MMA_ACC, MMA_VV, MMA_AVV, MMA_PV, MMA_APV, MMA_VVI4I4I8,
> 	MMA_AVVI4I4I8, MMA_VVI4I4I2, MMA_AVVI4I4I2, MMA_VVI4I4,
> 	MMA_AVVI4I4, MMA_PVI4I2, MMA_APVI4I2, MMA_VVI4I4I4,
> 	MMA_AVVI4I4I4): New define_int_iterator.
> 	(acc, vv, avv, pv, apv, vvi4i4i8, avvi4i4i8, vvi4i4i2,
> 	avvi4i4i2, vvi4i4, avvi4i4, pvi4i2, apvi4i2, vvi4i4i4,
> 	avvi4i4i4): New define_int_attr.
> 	(*movpxi): Add zero constant alternative.
> 	(mma_assemble_pair, mma_assemble_acc): New define_expand.
> 	(*mma_assemble_acc): New define_insn_and_split.
> 	(mma_<acc>, mma_xxsetaccz, mma_<vv>, mma_<avv>, mma_<pv>, mma_<apv>,
> 	mma_<vvi4i4i8>, mma_<avvi4i4i8>, mma_<vvi4i4i2>, mma_<avvi4i4i2>,
> 	mma_<vvi4i4>, mma_<avvi4i4>, mma_<pvi4i2>, mma_<apvi4i2>,
> 	mma_<vvi4i4i4>, mma_<avvi4i4i4>): New define_insn.
> 	* config/rs6000/rs6000.md ('type' attribute): Add mma type.

(mma): New 'type' attribute.

> 	* config/rs6000/vsx.md (UNSPEC_VSX_XVCVBF16SP): New.
> 	(UNSPEC_VSX_XVCVSPBF16): Likewise.
> 	(XVCVBF16): New define_int_iterator.
> 	(xvcvbf16): New define_int_attr.
> 	(vsx_<xvcvbf16>): New define_insn.
> 	* doc/extend.texi: Document the mma built-ins.
> 



I've read through the rest of this patch.  Nothing else jumps out at me.

Thanks,
-Will

<snip>




