From 672c4b0e0deb004c11849481e81bd36434c9e45f Mon Sep 17 00:00:00 2001 From: Michael Meissner Date: Tue, 19 Mar 2024 01:11:52 -0400 Subject: [PATCH] Update ChangeLog.* --- gcc/ChangeLog.dmf | 308 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 307 insertions(+), 1 deletion(-) diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf index 1599736218a7..5a28e3e994b6 100644 --- a/gcc/ChangeLog.dmf +++ b/gcc/ChangeLog.dmf @@ -1,6 +1,312 @@ +==================== Branch work163-dmf, patch #106 ==================== + +PowerPC: Add support for 1,024 bit DMR registers. + +This patch is a preliminary patch to add the full 1,024 bit dense math registers +(DMRs) for -mcpu=future. The MMA 512-bit accumulators map onto the top of the +DMR register. + +This patch only adds the new 1,024 bit register support. It does not add +support for any instructions that need 1,024 bit registers instead of 512 bit +registers. + +I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit +registers. The 'wD' constraint added in previous patches is used for these +registers. I added support to do load and store of DMRs via the VSX registers, +since there are no load/store dense math instructions. I added the new keyword +'__dmr' to create 1,024 bit types that can be loaded into DMRs. At present, I +don't have aliases for __dmr512 and __dmr1024 that we've discussed internally. + +The patches have been tested on both little and big endian systems. Can I check +it into the master branch? + +2024-03-19 Michael Meissner + +gcc/ + + * config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec. + (UNSPEC_DM_INSERT512_LOWER): Likewise. + (UNSPEC_DM_EXTRACT512): Likewise. + (UNSPEC_DMR_RELOAD_FROM_MEMORY): Likewise. + (UNSPEC_DMR_RELOAD_TO_MEMORY): Likewise. + (movtdo): New define_expand and define_insn_and_split to implement 1,024 + bit DMR registers. + (movtdo_insert512_upper): New insn. + (movtdo_insert512_lower): Likewise. + (movtdo_extract512): Likewise. 
+ (reload_dmr_from_memory): Likewise. + (reload_dmr_to_memory): Likewise. + * config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMR + support. + (rs6000_init_builtins): Add support for __dmr keyword. + * config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add support + for TDOmode. + (rs6000_function_arg): Likewise. + * config/rs6000/rs6000-modes.def (TDOmode): New mode. + * config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add + support for TDOmode. + (rs6000_hard_regno_mode_ok_uncached): Likewise. + (rs6000_hard_regno_mode_ok): Likewise. + (rs6000_modes_tieable_p): Likewise. + (rs6000_debug_reg_global): Likewise. + (rs6000_setup_reg_addr_masks): Likewise. + (rs6000_init_hard_regno_mode_ok): Add support for TDOmode. Setup reload + hooks for DMR mode. + (reg_offset_addressing_ok_p): Add support for TDOmode. + (rs6000_emit_move): Likewise. + (rs6000_secondary_reload_simple_move): Likewise. + (rs6000_preferred_reload_class): Likewise. + (rs6000_secondary_reload_class): Likewise. + (rs6000_mangle_type): Add mangling for __dmr type. + (rs6000_dmr_register_move_cost): Add support for TDOmode. + (rs6000_split_multireg_move): Likewise. + (rs6000_invalid_conversion): Likewise. + * config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode. + (enum rs6000_builtin_type_index): Add DMR type nodes. + (dmr_type_node): Likewise. + (ptr_dmr_type_node): Likewise. + +gcc/testsuite/ + + * gcc.target/powerpc/dm-1024bit.c: New test. + +==================== Branch work163-dmf, patch #105 ==================== + +Add dense math test for new instruction names. + +2024-03-19 Michael Meissner + +gcc/testsuite/ + + * gcc.target/powerpc/dm-double-test.c: New test. + * lib/target-supports.exp (check_effective_target_ppc_dmr_ok): New + target test. + +==================== Branch work163-dmf, patch #104 ==================== + +PowerPC: Switch to dense math names for all MMA operations. 
+ +This patch changes the assembler instruction names for MMA instructions from +the original name used in power10 to the new name when used with the dense math +system. I.e. xvf64gerpp becomes dmxvf64gerpp. The assembler will emit the +same bits for either spelling. + +For the non-prefixed MMA instructions, we add a 'dm' prefix in front of the +instruction. However, the prefixed instructions have a 'pm' prefix, and we add +the 'dm' prefix afterwards. To prevent having two sets of parallel int +attributes, we remove the "pm" prefix from the instruction string in the +attributes, and add it later, both in the insn name and in the output template. + +2024-03-19 Michael Meissner + +gcc/ + + * config/rs6000/mma.md (vvi4i4i8): Change the instruction to not have a + "pm" prefix. + (avvi4i4i8): Likewise. + (vvi4i4i2): Likewise. + (avvi4i4i2): Likewise. + (vvi4i4): Likewise. + (avvi4i4): Likewise. + (pvi4i2): Likewise. + (apvi4i2): Likewise. + (vvi4i4i4): Likewise. + (avvi4i4i4): Likewise. + (mma_xxsetaccz): Add support for running on DMF systems, generating the + dense math instruction and using the dense math accumulators. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_pm): Add support for running on DMF systems, generating + the dense math instruction and using the dense math accumulators. + Rename the insn with a 'pm' prefix and add either 'pm' or 'pmdm' + prefixes based on whether we have the original MMA specification or if + we have dense math support. + (mma_pm): Likewise. + (mma_pm): Likewise. + (mma_pm): Likewise. + (mma_pm): Likewise. + (mma_pm): Likewise. + (mma_pm): Likewise. + (mma_pm): Likewise. + +==================== Branch work163-dmf, patch #103 ==================== + +Add support for dense math registers. + +The MMA subsystem added the notion of accumulator registers as an optional +feature of ISA 3.1 (power10). 
In ISA 3.1, these accumulators overlapped with +the VSX registers 0..31, but logically the accumulator registers were separate +from the FPR registers. In ISA 3.1, it was anticipated that in future systems, +the accumulator registers may not overlap with the FPR registers. This patch +adds the support for dense math registers as separate registers. + +This particular patch does not change the MMA support to use the accumulators +within the dense math registers. This patch just adds the basic support for +having separate DMRs. The next patch will switch the MMA support to use the +accumulators if -mcpu=future is used. + +For testing purposes, I added an undocumented option '-mdense-math' to enable +or disable the dense math support. + +This patch adds a new constraint (wD). If MMA is selected but dense math is +not selected (i.e. -mcpu=power10), the wD constraint will allow access to +accumulators that overlap with VSX registers 0..31. If both MMA and dense math +are selected (i.e. -mcpu=future), the wD constraint will only allow dense math +registers. + +This patch modifies the existing %A output modifier. If MMA is selected but +dense math is not selected, then %A output modifier converts the VSX register +number to the accumulator number, by dividing it by 4. If both MMA and dense +math are selected, then %A will map the separate DMR registers into 0..7. + +The intention is that user code using extended asm can be modified to run on +both MMA without dense math and MMA with dense math: + + 1) If possible, don't use extended asm, but instead use the MMA built-in + functions; + + 2) If you do need to write extended asm, change the d constraints + targeting accumulators to use wD; + + 3) Only use the built-in zero, assemble and disassemble functions to + move data between vector quad types and dense math accumulators. + I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz directly in the + extended asm code. 
The reason is these instructions assume there is a + 1-to-1 correspondence between 4 adjacent FPR registers and an + accumulator that overlaps with those registers. With accumulators + now being separate registers, there no longer is a 1-to-1 + correspondence. + +It is possible that the mangling for DMRs and the GDB register numbers may +produce other changes in the future. + +2024-03-19 Michael Meissner + + * config/rs6000/mma.md (movxo): Add comments about dense math registers. + (movxo_nodm): Rename from movxo and restrict the usage to machines + without dense math registers. + (movxo_dm): New insn for movxo support for machines with dense math + registers. + (mma_): Restrict usage to machines without dense math registers. + (mma_xxsetaccz): Make a define_expand, and add support for dense math + registers. + (mma_xxsetaccz_nodm): Rename from mma_xxsetaccz, and restrict to + machines without dense math registers. + (mma_dmsetaccz): New insn. + * config/rs6000/predicates.md (dmr_operand): New predicate. + (accumulator_operand): Add support for dense math registers. + * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): Do + not de-prime accumulator when disassembling a vector quad. + * config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMR_REG_TYPE. + (enum rs6000_reload_reg_type): Add RELOAD_REG_DMR. + (LAST_RELOAD_REG_CLASS): Add support for DMR registers and the wD + constraint. + (reload_reg_map): Likewise. + (rs6000_reg_names): Likewise. + (alt_reg_names): Likewise. + (rs6000_hard_regno_nregs_internal): Likewise. + (rs6000_hard_regno_mode_ok_uncached): Likewise. + (rs6000_debug_reg_global): Likewise. + (rs6000_setup_reg_addr_masks): Likewise. + (rs6000_init_hard_regno_mode_ok): Likewise. + (rs6000_secondary_reload_memory): Add support for DMR registers. + (rs6000_secondary_reload_simple_move): Likewise. + (rs6000_preferred_reload_class): Likewise. + (rs6000_secondary_reload_class): Likewise. 
+ (print_operand): Make %A handle both FPRs and DMRs. + (rs6000_dmr_register_move_cost): New helper function. + (rs6000_register_move_cost): Add support for DMR registers. + (rs6000_memory_move_cost): Likewise. + (rs6000_compute_pressure_classes): Likewise. + (rs6000_debugger_regno): Likewise. + (rs6000_split_multireg_move): Add support for DMRs. + * config/rs6000/rs6000.h (TARGET_DENSE_MATH): New macro. + (TARGET_MMA_DENSE_MATH): Likewise. + (TARGET_MMA_NO_DENSE_MATH): Likewise. + (UNITS_PER_DMR_WORD): Likewise. + (FIRST_PSEUDO_REGISTER): Update for DMRs. + (FIXED_REGISTERS): Add DMRs. + (CALL_REALLY_USED_REGISTERS): Likewise. + (REG_ALLOC_ORDER): Likewise. + (DMR_REGNO_P): New macro. + (enum reg_class): Add DM_REGS. + (REG_CLASS_NAMES): Likewise. + (REG_CLASS_CONTENTS): Likewise. + (enum r6000_reg_class_enum): Add RS6000_CONSTRAINT_wD. + (REGISTER_NAMES): Add DMR registers. + (ADDITIONAL_REGISTER_NAMES): Likewise. + +==================== Branch work163-dmf, patch #102 ==================== + +Add wD constraint. + +This patch adds a new constraint ('wD') that matches the accumulator registers +that overlap with VSX registers 0..31 on power10. Future patches will add the +support for a separate accumulator register class that will be used when the +support for dense math registers is added. + +2024-03-19 Michael Meissner + + * config/rs6000/constraints.md (wD): New constraint. + * config/rs6000/mma.md (mma_disassemble_acc): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_): Likewise. + (mma_ + +gcc/ + + * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Enable using + load vector pair and store vector pair instructions for memory copy + operations. 
+ (POWERPC_MASKS): Make the bit for enabling using load vector pair and + store vector pair operations set and reset when the PowerPC processor is + changed. + ==================== Branch work163-dmf, baseline ==================== +Add ChangeLog.dmf and update REVISION. + +2024-03-18 Michael Meissner + +gcc/ + + * ChangeLog.dmf: New file for branch. + * REVISION: Update. + 2024-03-18 Michael Meissner Clone branch - -- 2.43.5