This patch changes the MMA instructions to use either FPRs
(-mcpu=power10) or DMRs (-mcpu=future). The existing MMA instruction names
are kept in this patch.
A macro (__PPC_DMR__) is defined if the MMA instructions use the DMRs.
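
As an illustration of how user code might key off the new macro, here is a
minimal sketch. The helper name mma_accumulator_regs is hypothetical and is
not part of this patch; it only assumes that __PPC_DMR__ is defined as
described above and that __MMA__ is defined whenever MMA is enabled.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the patch: report which register file
   the compiler says backs the MMA accumulators for this translation unit.  */
const char *
mma_accumulator_regs (void)
{
#if defined (__PPC_DMR__)
  /* Dense math: MMA instructions operate on DMRs (e.g. -mcpu=future).  */
  return "DMR";
#elif defined (__MMA__)
  /* MMA without dense math: accumulators overlap the FPRs
     (e.g. -mcpu=power10).  */
  return "FPR";
#else
  /* MMA is not enabled for this compilation.  */
  return "none";
#endif
}
```

Since the instruction names do not change, code using the MMA built-ins
should only need a check like this where it depends on the accumulators
overlapping the FPRs.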

The patches have been tested on the platforms listed below. Before doing the
builds, I applied the patches for PR target/107299 that I submitted on
November 2nd, so that GCC would build on systems using IEEE 128-bit long
double:

  * https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604834.html

Bootstrap builds and regression test runs showed no regressions on:

1) Power10 LE using --with-cpu=power10 --with-long-double-format=ieee;
2) Power10 LE using --with-cpu=power10 --with-long-double-format=ibm;
3) Power9 LE using --with-cpu=power9 --with-long-double-format=ibm; and
4) Power8 BE using --with-cpu=power8 (both 32-bit & 64-bit tested).
Can I check this patch into the GCC 13 master branch?

2022-12-02  Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* config/rs6000/mma.md (mma_<acc>): New define_expand to handle
	mma_<acc> for dense math and non dense math.
	(mma_<acc> insn): Restrict to non dense math.
	(mma_xxsetaccz): Convert to define_expand to handle non dense math and
	dense math.
	(mma_xxsetaccz_p10): Rename from mma_xxsetaccz and restrict usage to non
	dense math.
	(mma_xxsetaccz_dm): Dense math version of mma_xxsetaccz.
	(mma_<vv>): Add support for dense math.
	(mma_<avv>): Likewise.
	(mma_<pv>): Likewise.
	(mma_<apv>): Likewise.
	(mma_<vvi4i4i8>): Likewise.
	(mma_<avvi4i4i8>): Likewise.
	(mma_<vvi4i4i2>): Likewise.
	(mma_<avvi4i4i2>): Likewise.
	(mma_<vvi4i4>): Likewise.
	(mma_<avvi4i4>): Likewise.
	(mma_<pvi4i2>): Likewise.
	(mma_<apvi4i2>): Likewise.
	(mma_<vvi4i4i4>): Likewise.
	(mma_<avvi4i4i4>): Likewise.
	* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
	__PPC_DMR__ if we have dense math instructions.
	* config/rs6000/rs6000.cc (print_operand): Make %A handle only DMRs if
	dense math and only FPRs if not dense math.
	(rs6000_split_multireg_move): Do not generate accumulator prime or
	de-prime instructions if dense math.