This patch implements XXSPLTIDP support for SF and DF scalar constants and V2DF
vector constants. The XXSPLTIDP instruction takes a 32-bit immediate in SFmode
format, converts it to DFmode, and splats the result into both elements of a
V2DF vector. Because the immediate is in SFmode format, only constants that
are exactly representable as SFmode values can be loaded with XXSPLTIDP.
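
As a rough illustration of the "fits as an SFmode value" condition, the sketch
below checks whether a double survives a round trip through float and, if so,
extracts the 32-bit pattern that would serve as the immediate. The names
fits_sf_immediate and splat_immediate are hypothetical and are not taken from
the patch; the real xxspltidp_constant_p may apply further restrictions that
are not modeled here. The sketch assumes float is IEEE binary32, and for
simplicity it rejects NaNs and infinities.

```c
#include <float.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the patch's xxspltidp_constant_p): return true
   if D is exactly representable as a float, and store the float's bit
   pattern in *IMM.  Assumes float is IEEE binary32.  */
static bool
fits_sf_immediate (double d, uint32_t *imm)
{
  float f;

  /* NaN compares unequal to itself; this sketch does not handle NaNs.  */
  if (d != d)
    return false;

  /* Reject values outside the finite float range (this also rejects
     infinities, a simplification made by this sketch).  */
  if (d > FLT_MAX || d < -FLT_MAX)
    return false;

  /* The value fits only if narrowing and widening again is lossless.  */
  f = (float) d;
  if ((double) f != d)
    return false;

  memcpy (imm, &f, sizeof (f));
  return true;
}

/* Hypothetical model of the splat: reinterpret IMM as a float, widen it
   to double, and copy it into both elements of OUT.  */
static void
splat_immediate (uint32_t imm, double out[2])
{
  float f;

  memcpy (&f, &imm, sizeof (f));
  out[0] = (double) f;
  out[1] = (double) f;
}
```

Under this model, 1.0 and -2.0 qualify because they are exact in single
precision, while 0.1 does not, since its double value has no exact float
equivalent.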

I added a new constraint (eF) to match constants that can be loaded with the
XXSPLTIDP instruction.

I have added a temporary switch (-mxxspltidp) to control whether or not the
XXSPLTIDP instruction is generated.

I added 3 new tests that load SF/DF scalar and V2DF vector constants.
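
For readers who want a feel for the kind of source these cases cover, here is a
minimal sketch. It is not the contents of the new vec-splat-constant-*.c tests,
which are not reproduced in this message; the function names are hypothetical,
and the v2df typedef uses the generic GCC vector extension so that the code
compiles on any target. Whether XXSPLTIDP is actually emitted for each function
depends on the target options and on the patch being applied.

```c
/* Illustrative only: not the contents of the new tests.  */

/* Two-element double vector via the generic GCC vector extension.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* SFmode scalar constant.  */
float
scalar_sf (void)
{
  return 2.0f;
}

/* DFmode scalar constant; 1.0 is exactly representable as an SFmode
   value.  */
double
scalar_df (void)
{
  return 1.0;
}

/* V2DF vector constant with the same SFmode-representable value in
   both elements.  */
v2df
vector_v2df (void)
{
  return (v2df) { 1.0, 1.0 };
}
```

Each function returns a constant that is exact in single precision, which is
the property the eF constraint is described as matching.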

I have tested this with bootstrap compilers on power10 systems, and there were
no regressions. I have also built GCC with these patches on little endian
power9 and big endian power8 systems, and there were no regressions.

In addition, I have built and run the full SPEC 2017 rate suite, comparing the
results with the patches enabled and not enabled. Roughly 66,000 XXSPLTIDP
instructions were generated in the rate build for SPEC 2017. On a stand-alone
system running single threaded, blender_r shows a 1.9% increase in
performance, and the rest of the benchmarks are performance neutral. However,
I would expect that in a real world scenario, switching to XXSPLTIDP will
increase performance because it removes the memory loads that would otherwise
fetch these constants.

2021-09-02  Michael Meissner  <meissner@linux.ibm.com>

gcc/
* config/rs6000/constraints.md (eF): New constraint.
* config/rs6000/predicates.md (easy_fp_constant): If we can load
the scalar constant with XXSPLTIDP, the floating point constant is
easy.
(xxspltidp_operand): New predicate.
(easy_vector_constant): If we can generate XXSPLTIDP, mark the
vector constant as easy.
* config/rs6000/rs6000-protos.h (xxspltidp_constant_p): New
declaration.
(prefixed_permute_p): Likewise.
* config/rs6000/rs6000.c (xxspltidp_constant_p): New function.
(output_vec_const_move): Add support for XXSPLTIDP.
(prefixed_permute_p): New function.
* config/rs6000/rs6000.md (prefixed attribute): Add support for
permute prefixed instructions.
(movsf_hardfloat): Add XXSPLTIDP support.
(mov<mode>_hardfloat32, FMOVE64 iterator): Likewise.
(mov<mode>_hardfloat64, FMOVE64 iterator): Likewise.
* config/rs6000/rs6000.opt (-mxxspltidp): New switch.
* config/rs6000/vsx.md (vsx_move<mode>_64bit): Add XXSPLTIDP
support.
(vsx_move<mode>_32bit): Likewise.
(vsx_splat_v2df_xxspltidp): New insn.
(XXSPLTIDP): New mode iterator.
(xxspltidp_<mode>_internal): New insn and splits.
(xxspltidp_<mode>_inst): Replace xxspltidp_v2df_inst with an
iterated form that also handles SFmode and DFmode.

gcc/testsuite/
* gcc.target/powerpc/vec-splat-constant-sf.c: New test.
* gcc.target/powerpc/vec-splat-constant-df.c: New test.
* gcc.target/powerpc/vec-splat-constant-v2df.c: New test.