==================== Branch work163-dmf, patch #106 ====================

PowerPC: Add support for 1,024 bit DMR registers.

This is a preliminary patch to add the full 1,024 bit dense math registers
(DMRs) for -mcpu=future.  The MMA 512-bit accumulators map onto the top of the
DMR register.

This patch only adds the new 1,024 bit register support.  It does not add
support for any instructions that need 1,024 bit registers instead of 512 bit
registers.

I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit
registers.  The 'wD' constraint added in previous patches is used for these
registers.  I added support to do load and store of DMRs via the VSX registers,
since there are no load/store dense math instructions.  I added the new keyword
'__dmr' to create 1,024 bit types that can be loaded into DMRs.  At present, I
don't have aliases for __dmr512 and __dmr1024 that we've discussed internally.

The patches have been tested on both little and big endian systems.  Can I check
it into the master branch?

2024-03-19   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec.
	(UNSPEC_DM_INSERT512_LOWER): Likewise.
	(UNSPEC_DM_EXTRACT512): Likewise.
	(UNSPEC_DMR_RELOAD_FROM_MEMORY): Likewise.
	(UNSPEC_DMR_RELOAD_TO_MEMORY): Likewise.
	(movtdo): New define_expand and define_insn_and_split to implement 1,024
	bit DMR registers.
	(movtdo_insert512_upper): New insn.
	(movtdo_insert512_lower): Likewise.
	(movtdo_extract512): Likewise.
	(reload_dmr_from_memory): Likewise.
	(reload_dmr_to_memory): Likewise.
	* config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMR
	support.
	(rs6000_init_builtins): Add support for __dmr keyword.
	* config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add support
	for TDOmode.
	(rs6000_function_arg): Likewise.
	* config/rs6000/rs6000-modes.def (TDOmode): New mode.
	* config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add
	support for TDOmode.
	(rs6000_hard_regno_mode_ok_uncached): Likewise.
	(rs6000_hard_regno_mode_ok): Likewise.
	(rs6000_modes_tieable_p): Likewise.
	(rs6000_debug_reg_global): Likewise.
	(rs6000_setup_reg_addr_masks): Likewise.
	(rs6000_init_hard_regno_mode_ok): Add support for TDOmode.  Setup reload
	hooks for DMR mode.
	(reg_offset_addressing_ok_p): Add support for TDOmode.
	(rs6000_emit_move): Likewise.
	(rs6000_secondary_reload_simple_move): Likewise.
	(rs6000_preferred_reload_class): Likewise.
	(rs6000_secondary_reload_class): Likewise.
	(rs6000_mangle_type): Add mangling for __dmr type.
	(rs6000_dmr_register_move_cost): Add support for TDOmode.
	(rs6000_split_multireg_move): Likewise.
	(rs6000_invalid_conversion): Likewise.
	* config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode.
	(enum rs6000_builtin_type_index): Add DMR type nodes.
	(dmr_type_node): Likewise.
	(ptr_dmr_type_node): Likewise.

gcc/testsuite/

	* gcc.target/powerpc/dm-1024bit.c: New test.

==================== Branch work163-dmf, patch #105 ====================

Add dense math test for new instruction names.

2024-03-19   Michael Meissner  <meissner@linux.ibm.com>

gcc/testsuite/

	* gcc.target/powerpc/dm-double-test.c: New test.
	* lib/target-supports.exp (check_effective_target_ppc_dmr_ok): New
	target test.

==================== Branch work163-dmf, patch #104 ====================

PowerPC: Switch to dense math names for all MMA operations.

This patch changes the assembler instruction names for MMA instructions from
the original name used in power10 to the new name when used with the dense math
system.  I.e. xvf64gerpp becomes dmxvf64gerpp.  The assembler will emit the
same bits for either spelling.

For the non-prefixed MMA instructions, we add a 'dm' prefix in front of the
instruction.  However, the prefixed instructions have a 'pm' prefix, and we add
the 'dm' prefix afterwards.  To prevent having two sets of parallel int
attributes, we remove the "pm" prefix from the instruction string in the
attributes, and add it later, both in the insn name and in the output template.
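
The naming rule above can be sketched in plain C.  This is only an illustrative
model, not code from mma.md: the helper name mma_mnemonic and its parameters
are hypothetical, and the base string is assumed to be the mnemonic with no
"pm" prefix, as stored in the int attributes after this patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: build the assembler mnemonic for an MMA operation.
   BASE is the mnemonic without any prefix (e.g. "xvf64gerpp").  PREFIXED is
   nonzero for the prefixed ("pm") forms, and DENSE_MATH is nonzero when the
   dense math spelling is wanted.  The result is "pm" (if any), then "dm"
   (if any), then BASE.  Returns the value of snprintf.  */
int
mma_mnemonic (char *buf, size_t size, const char *base,
	      int prefixed, int dense_math)
{
  assert (buf != NULL && base != NULL);
  return snprintf (buf, size, "%s%s%s",
		   prefixed ? "pm" : "",
		   dense_math ? "dm" : "",
		   base);
}
```

With this model, "xvf64gerpp" stays unchanged for the original MMA
specification and becomes "dmxvf64gerpp" for dense math, while the prefixed
form goes from "pmxvf64gerpp" to "pmdmxvf64gerpp".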

2024-03-19   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* config/rs6000/mma.md (vvi4i4i8): Change the instruction to not have a
	"pm" prefix.
	(avvi4i4i8): Likewise.
	(vvi4i4i2): Likewise.
	(avvi4i4i2): Likewise.
	(vvi4i4): Likewise.
	(avvi4i4): Likewise.
	(pvi4i2): Likewise.
	(apvi4i2): Likewise.
	(vvi4i4i4): Likewise.
	(avvi4i4i4): Likewise.
	(mma_xxsetaccz): Add support for running on DMF systems, generating the
	dense math instruction and using the dense math accumulators.
	(mma_<vv>): Likewise.
	(mma_<pv>): Likewise.
	(mma_<avv>): Likewise.
	(mma_<apv>): Likewise.
	(mma_pm<vvi4i4i8>): Add support for running on DMF systems, generating
	the dense math instruction and using the dense math accumulators.
	Rename the insn with a 'pm' prefix and add either 'pm' or 'pmdm'
	prefixes based on whether we have the original MMA specification or if
	we have dense math support.
	(mma_pm<avvi4i4i8>): Likewise.
	(mma_pm<vvi4i4i2>): Likewise.
	(mma_pm<avvi4i4i2>): Likewise.
	(mma_pm<vvi4i4>): Likewise.
	(mma_pm<avvi4i4>): Likewise.
	(mma_pm<pvi4i2>): Likewise.
	(mma_pm<apvi4i2>): Likewise.
	(mma_pm<vvi4i4i4>): Likewise.
	(mma_pm<avvi4i4i4>): Likewise.

==================== Branch work163-dmf, patch #103 ====================

Add support for dense math registers.

The MMA subsystem added the notion of accumulator registers as an optional
feature of ISA 3.1 (power10).  In ISA 3.1, these accumulators overlapped with
the VSX registers 0..31, but logically the accumulator registers were separate
from the FPR registers.  In ISA 3.1, it was anticipated that in future systems,
the accumulator registers might not overlap with the FPR registers.  This patch
adds the support for dense math registers as separate registers.

This particular patch does not change the MMA support to use the accumulators
within the dense math registers.  This patch just adds the basic support for
having separate DMRs.  The next patch will switch the MMA support to use the
accumulators if -mcpu=future is used.

For testing purposes, I added an undocumented option '-mdense-math' to enable
or disable the dense math support.

This patch adds a new constraint (wD).  If MMA is selected but dense math is
not selected (i.e. -mcpu=power10), the wD constraint will allow access to
accumulators that overlap with VSX registers 0..31.  If both MMA and dense math
are selected (i.e. -mcpu=future), the wD constraint will only allow dense math
registers.

This patch modifies the existing %A output modifier.  If MMA is selected but
dense math is not selected, then the %A output modifier converts the VSX
register number to the accumulator number by dividing it by 4.  If both MMA and
dense math are selected, then %A will map the separate DMR registers into 0..7.
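
The two cases of the %A mapping can be modelled in a few lines of C.  This is a
sketch only, not the print_operand code: the function name and both register
number constants are hypothetical placeholders, and the rule that a VSX
register must be the first of its group of 4 is an assumption of the model.

```c
#include <assert.h>

/* Hypothetical hard register numbers, for illustration only.  The real
   values are defined in rs6000.h and are not given in this ChangeLog.  */
enum
{
  MODEL_FIRST_FPR_REGNO = 32,	/* VSX register 0 in this model.  */
  MODEL_FIRST_DMR_REGNO = 111	/* DMR 0 in this model.  */
};

/* Return the accumulator number (0..7) that %A would print for REGNO, or -1
   if REGNO cannot name an accumulator in this model.  */
int
model_accumulator_number (int regno, int dense_math)
{
  assert (regno >= 0);

  if (dense_math)
    {
      /* Separate DMRs map directly into 0..7.  */
      int n = regno - MODEL_FIRST_DMR_REGNO;
      return (n >= 0 && n < 8) ? n : -1;
    }

  /* Without dense math, an accumulator overlaps 4 adjacent VSX registers in
     0..31, so divide the VSX register number by 4.  */
  int vsx = regno - MODEL_FIRST_FPR_REGNO;
  if (vsx < 0 || vsx > 31 || vsx % 4 != 0)
    return -1;
  return vsx / 4;
}
```

In this model VSX registers 0, 4, ..., 28 give accumulators 0..7 without dense
math, and the 8 DMRs give 0..7 with dense math.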

The intention is that user code using extended asm can be modified to run on
both MMA without dense math and MMA with dense math:

    1)	If possible, don't use extended asm, but instead use the MMA built-in
	functions;

    2)	If you do need to write extended asm, change the d constraints
	targeting accumulators to use wD;

    3)	Only use the built-in zero, assemble and disassemble functions to
	move data between vector quad types and dense math accumulators.
	I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz instructions
	directly in the extended asm code.  The reason is these instructions
	assume there is a 1-to-1 correspondence between 4 adjacent FPR
	registers and an accumulator that overlaps with those registers.  With
	accumulators now being separate registers, there no longer is a 1-to-1
	correspondence.
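
As a small illustration of points 1 and 3, the sketch below zeroes an
accumulator and copies it to memory using only the existing MMA built-in
functions, so no xxsetaccz or xxmfacc appears in user code.  The function name
is hypothetical, and the fallback branch exists only so that the example also
compiles when MMA is not enabled; it is not part of the patch.

```c
#include <assert.h>
#include <string.h>

/* Store the contents of a freshly zeroed 512-bit accumulator (8 doubles)
   into OUT.  When the compiler enables MMA (__MMA__ is defined), use the
   built-in zero and disassemble functions and let the compiler pick the
   instructions and registers.  Otherwise fall back to memset.  */
void
zeroed_accumulator_to_memory (double out[8])
{
  assert (out != NULL);
#if defined (__MMA__)
  __vector_quad acc;
  __builtin_mma_xxsetaccz (&acc);
  __builtin_mma_disassemble_acc (out, &acc);
#else
  /* Fallback for targets without MMA, for illustration only.  */
  memset (out, 0, 8 * sizeof (double));
#endif
}
```

Because the code never names an accumulator or an FPR, the same source should
work whether the accumulators overlap the VSX registers or are separate DMRs.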

It is possible that the mangling for DMRs and the GDB register numbers may
change in the future.

2024-03-19   Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/mma.md (movxo): Add comments about dense math registers.
	(movxo_nodm): Rename from movxo and restrict the usage to machines
	without dense math registers.
	(movxo_dm): New insn for movxo support for machines with dense math
	registers.
	(mma_<acc>): Restrict usage to machines without dense math registers.
	(mma_xxsetaccz): Make a define_expand, and add support for dense math
	registers.
	(mma_xxsetaccz_nodm): Rename from mma_xxsetaccz, and restrict to
	machines without dense math registers.
	(mma_dmsetaccz): New insn.
	* config/rs6000/predicates.md (dmr_operand): New predicate.
	(accumulator_operand): Add support for dense math registers.
	* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): Do
	not de-prime accumulator when disassembling a vector quad.
	* config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMR_REG_TYPE.
	(enum rs6000_reload_reg_type): Add RELOAD_REG_DMR.
	(LAST_RELOAD_REG_CLASS): Add support for DMR registers and the wD
	constraint.
	(reload_reg_map): Likewise.
	(rs6000_reg_names): Likewise.
	(alt_reg_names): Likewise.
	(rs6000_hard_regno_nregs_internal): Likewise.
	(rs6000_hard_regno_mode_ok_uncached): Likewise.
	(rs6000_debug_reg_global): Likewise.
	(rs6000_setup_reg_addr_masks): Likewise.
	(rs6000_init_hard_regno_mode_ok): Likewise.
	(rs6000_secondary_reload_memory): Add support for DMR registers.
	(rs6000_secondary_reload_simple_move): Likewise.
	(rs6000_preferred_reload_class): Likewise.
	(rs6000_secondary_reload_class): Likewise.
	(print_operand): Make %A handle both FPRs and DMRs.
	(rs6000_dmr_register_move_cost): New helper function.
	(rs6000_register_move_cost): Add support for DMR registers.
	(rs6000_memory_move_cost): Likewise.
	(rs6000_compute_pressure_classes): Likewise.
	(rs6000_debugger_regno): Likewise.
	(rs6000_split_multireg_move): Add support for DMRs.
	* config/rs6000/rs6000.h (TARGET_DENSE_MATH): New macro.
	(TARGET_MMA_DENSE_MATH): Likewise.
	(TARGET_MMA_NO_DENSE_MATH): Likewise.
	(UNITS_PER_DMR_WORD): Likewise.
	(FIRST_PSEUDO_REGISTER): Update for DMRs.
	(FIXED_REGISTERS): Add DMRs.
	(CALL_REALLY_USED_REGISTERS): Likewise.
	(REG_ALLOC_ORDER): Likewise.
	(DMR_REGNO_P): New macro.
	(enum reg_class): Add DM_REGS.
	(REG_CLASS_NAMES): Likewise.
	(REG_CLASS_CONTENTS): Likewise.
	(enum r6000_reg_class_enum): Add RS6000_CONSTRAINT_wD.
	(REGISTER_NAMES): Add DMR registers.
	(ADDITIONAL_REGISTER_NAMES): Likewise.

==================== Branch work163-dmf, patch #102 ====================

Add wD constraint.

This patch adds a new constraint ('wD') that matches the accumulator registers
that overlap with VSX registers 0..31 on power10.  Future patches will add the
support for a separate accumulator register class that will be used when the
support for dense math registers is added.

2024-03-19   Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/constraints.md (wD): New constraint.
	* config/rs6000/mma.md (mma_disassemble_acc): Use the wD constraint.
	(mma_<vv>): Likewise.
	(mma_<avv>): Likewise.
	(mma_<pv>): Likewise.
	(mma_<apv>): Likewise.
	(mma_<vvi4i4i8>): Likewise.
	(mma_<avvi4i4i8>): Likewise.
	(mma_<vvi4i4i2>): Likewise.
	(mma_<avvi4i4i2>): Likewise.
	(mma_<vvi4i4>): Likewise.
	(mma_<avvi4i4>): Likewise.
	(mma_<pvi4i2>): Likewise.
	(mma_<apvi4i2>): Likewise.
	(mma_<vvi4i4i4>): Likewise.
	(mma_<avvi4i4i4>): Likewise.
	* config/rs6000/predicates.md (accumulator_operand): New predicate.
	* config/rs6000/rs6000.cc (rs6000_debug_reg_global): Print the register
	class for the 'wD' constraint.
	(rs6000_init_hard_regno_mode_ok): Set the 'wD' register constraint
	class.
	* config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add element for
	the 'wD' constraint.
	* doc/md.texi (PowerPC constraints): Document the 'wD' constraint.

==================== Branch work163-dmf, patch #101 ====================

Use vector pair load/store for memcpy with -mcpu=future

In the development for the power10 processor, GCC did not enable using the load
vector pair and store vector pair instructions when optimizing things like
memory copy.  This patch enables using those instructions if -mcpu=future is
used.
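
For reference, the kind of source this affects is a fixed-size copy that the
compiler may expand inline.  The sketch below is a hypothetical example, not
the test case from the patch; whether load vector pair and store vector pair
instructions are actually emitted depends on the compiler version and options,
and has not been verified here.

```c
#include <assert.h>
#include <string.h>

/* A 64-byte block, i.e. two 256-bit vector pairs worth of data.  */
struct block64
{
  unsigned char bytes[64];
};

/* Copy one block.  The size is a compile-time constant, so the compiler is
   free to expand the memcpy inline instead of calling the library.  */
void
copy_block64 (struct block64 *dst, const struct block64 *src)
{
  assert (dst != NULL && src != NULL);
  memcpy (dst, src, sizeof *dst);
}
```

Compiling such a function with and without -mcpu=future and comparing the
assembler output is one way to see the effect of this patch.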

2024-03-18  Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Enable using
	load vector pair and store vector pair instructions for memory copy
	operations.
	(POWERPC_MASKS): Make the bit for enabling using load vector pair and
	store vector pair operations set and reset when the PowerPC processor is
	changed.

==================== Branch work163-dmf, baseline ====================

Add ChangeLog.dmf and update REVISION.

2024-03-18  Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* ChangeLog.dmf: New file for branch.
	* REVISION: Update.

2024-03-18   Michael Meissner  <meissner@linux.ibm.com>

	Clone branch