gcc.gnu.org Git - gcc.git/blame - gcc/ChangeLog.dmf
1==================== Branch work163-dmf, patch #106 ====================
2
3PowerPC: Add support for 1,024 bit DMR registers.
4
5This patch is a preliminary patch to add the full 1,024 bit dense math registers
6(DMRs) for -mcpu=future. The MMA 512-bit accumulators map onto the top of the
7DMR register.
8
9This patch only adds the new 1,024 bit register support. It does not add
10support for any instructions that need 1,024 bit registers instead of 512 bit
11registers.
12
13I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit
14registers. The 'wD' constraint added in previous patches is used for these
15registers. I added support to do load and store of DMRs via the VSX registers,
16since there are no load/store dense math instructions. I added the new keyword
17'__dmr' to create 1,024 bit types that can be loaded into DMRs. At present, I
18don't have aliases for __dmr512 and __dmr1024 that we've discussed internally.
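
The size relationships described above (one 1,024 bit DMR, two 512 bit
halves, loads and stores staged through 128 bit VSX registers) can be sketched
as a plain C software model. This is only an illustration, not the code in
mma.md: the names dmr_model, insert512, dmr_load and dmr_store are
hypothetical, and placing the first 64 bytes of memory in the upper half is an
assumption made for the sketch.

```c
#include <stdint.h>
#include <string.h>

enum { VSX_BYTES = 16, ACC_BYTES = 64, DMR_BYTES = 128 };

/* Hypothetical software model of one 1,024 bit DMR as two 512 bit halves.  */
struct dmr_model {
  uint8_t upper[ACC_BYTES];
  uint8_t lower[ACC_BYTES];
};

/* Model of a 512 bit insert: four 128 bit VSX-sized chunks fill one half.  */
static void insert512(uint8_t half[ACC_BYTES], const uint8_t *vsx)
{
  for (int i = 0; i < ACC_BYTES / VSX_BYTES; i++)
    memcpy(half + i * VSX_BYTES, vsx + i * VSX_BYTES, VSX_BYTES);
}

/* Model of a DMR load: memory -> VSX staging area -> DMR half, twice.
   Assumption: the first 64 bytes of memory go to the upper half.  */
static void dmr_load(struct dmr_model *d, const uint8_t mem[DMR_BYTES])
{
  uint8_t vsx[ACC_BYTES];        /* stands in for four VSX registers */

  memcpy(vsx, mem, ACC_BYTES);
  insert512(d->upper, vsx);
  memcpy(vsx, mem + ACC_BYTES, ACC_BYTES);
  insert512(d->lower, vsx);
}

/* Model of a DMR store: each 512 bit half is extracted and written back.  */
static void dmr_store(uint8_t mem[DMR_BYTES], const struct dmr_model *d)
{
  memcpy(mem, d->upper, ACC_BYTES);
  memcpy(mem + ACC_BYTES, d->lower, ACC_BYTES);
}
```

In this model a load followed by a store round-trips all 128 bytes, which is
the property the real reload patterns need to preserve.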
19
20The patches have been tested on both little and big endian systems. Can I check
21it into the master branch?
22
232024-03-19 Michael Meissner <meissner@linux.ibm.com>
24
25gcc/
26
27 * config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec.
28 (UNSPEC_DM_INSERT512_LOWER): Likewise.
29 (UNSPEC_DM_EXTRACT512): Likewise.
30 (UNSPEC_DMR_RELOAD_FROM_MEMORY): Likewise.
31 (UNSPEC_DMR_RELOAD_TO_MEMORY): Likewise.
32 (movtdo): New define_expand and define_insn_and_split to implement 1,024
33 bit DMR registers.
34 (movtdo_insert512_upper): New insn.
35 (movtdo_insert512_lower): Likewise.
36 (movtdo_extract512): Likewise.
37 (reload_dmr_from_memory): Likewise.
38 (reload_dmr_to_memory): Likewise.
39 * config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMR
40 support.
41 (rs6000_init_builtins): Add support for __dmr keyword.
42 * config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add support
43 for TDOmode.
44 (rs6000_function_arg): Likewise.
45 * config/rs6000/rs6000-modes.def (TDOmode): New mode.
46 * config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add
47 support for TDOmode.
48 (rs6000_hard_regno_mode_ok_uncached): Likewise.
49 (rs6000_hard_regno_mode_ok): Likewise.
50 (rs6000_modes_tieable_p): Likewise.
51 (rs6000_debug_reg_global): Likewise.
52 (rs6000_setup_reg_addr_masks): Likewise.
53 (rs6000_init_hard_regno_mode_ok): Add support for TDOmode. Setup reload
54 hooks for DMR mode.
55 (reg_offset_addressing_ok_p): Add support for TDOmode.
56 (rs6000_emit_move): Likewise.
57 (rs6000_secondary_reload_simple_move): Likewise.
58 (rs6000_preferred_reload_class): Likewise.
59 (rs6000_secondary_reload_class): Likewise.
60 (rs6000_mangle_type): Add mangling for __dmr type.
61 (rs6000_dmr_register_move_cost): Add support for TDOmode.
62 (rs6000_split_multireg_move): Likewise.
63 (rs6000_invalid_conversion): Likewise.
64 * config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode.
65 (enum rs6000_builtin_type_index): Add DMR type nodes.
66 (dmr_type_node): Likewise.
67 (ptr_dmr_type_node): Likewise.
68
69gcc/testsuite/
70
71 * gcc.target/powerpc/dm-1024bit.c: New test.
72
73==================== Branch work163-dmf, patch #105 ====================
74
75Add dense math test for new instruction names.
76
772024-03-19 Michael Meissner <meissner@linux.ibm.com>
78
79gcc/testsuite/
80
81 * gcc.target/powerpc/dm-double-test.c: New test.
82 * lib/target-supports.exp (check_effective_target_ppc_dmr_ok): New
83 target test.
84
85==================== Branch work163-dmf, patch #104 ====================
86
87PowerPC: Switch to dense math names for all MMA operations.
88
89This patch changes the assembler instruction names for MMA instructions from
90the original name used in power10 to the new name when used with the dense math
91system. I.e. xvf64gerpp becomes dmxvf64gerpp. The assembler will emit the
92same bits for either spelling.
93
94For the non-prefixed MMA instructions, we add a 'dm' prefix in front of the
95instruction. However, the prefixed instructions have a 'pm' prefix, and we add
96the 'dm' prefix afterwards. To prevent having two sets of parallel int
97attributes, we remove the "pm" prefix from the instruction string in the
98attributes, and add it later, both in the insn name and in the output template.
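
The renaming rule above can be illustrated with a small C helper. This is a
sketch of the naming convention only, not the machine description code; the
function name dense_math_name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the dense math spelling of an MMA mnemonic.
   Non-prefixed names get "dm" in front; prefixed names keep their "pm"
   and get "dm" inserted right after it.
   Returns 0 on success, -1 if the output buffer is too small.  */
static int dense_math_name(const char *mma_name, char *out, size_t out_size)
{
  int prefixed = strncmp(mma_name, "pm", 2) == 0;
  const char *base = prefixed ? mma_name + 2 : mma_name;
  int n = snprintf(out, out_size, "%sdm%s", prefixed ? "pm" : "", base);

  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

With this helper, xvf64gerpp maps to dmxvf64gerpp and pmxvf64gerpp maps to
pmdmxvf64gerpp, matching the 'pm' versus 'pmdm' prefixes described in the
ChangeLog entries below.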
99
1002024-03-19 Michael Meissner <meissner@linux.ibm.com>
101
102gcc/
103
104 * config/rs6000/mma.md (vvi4i4i8): Change the instruction to not have a
105 "pm" prefix.
106 (avvi4i4i8): Likewise.
107 (vvi4i4i2): Likewise.
108 (avvi4i4i2): Likewise.
109 (vvi4i4): Likewise.
110 (avvi4i4): Likewise.
111 (pvi4i2): Likewise.
112 (apvi4i2): Likewise.
113 (vvi4i4i4): Likewise.
114 (avvi4i4i4): Likewise.
115 (mma_xxsetaccz): Add support for running on DMF systems, generating the
116 dense math instruction and using the dense math accumulators.
117 (mma_<vv>): Likewise.
118 (mma_<pv>): Likewise.
119 (mma_<avv>): Likewise.
120 (mma_<apv>): Likewise.
121 (mma_pm<vvi4i4i8>): Add support for running on DMF systems, generating
122 the dense math instruction and using the dense math accumulators.
123 Rename the insn with a 'pm' prefix and add either 'pm' or 'pmdm'
124 prefixes based on whether we have the original MMA specification or if
125 we have dense math support.
126 (mma_pm<avvi4i4i8>): Likewise.
127 (mma_pm<vvi4i4i2>): Likewise.
128 (mma_pm<avvi4i4i2>): Likewise.
129 (mma_pm<vvi4i4>): Likewise.
130 (mma_pm<avvi4i4>): Likewise.
131 (mma_pm<pvi4i2>): Likewise.
132 (mma_pm<apvi4i2>): Likewise.
133 (mma_pm<vvi4i4i4>): Likewise.
134 (mma_pm<avvi4i4i4>): Likewise.
135
136==================== Branch work163-dmf, patch #103 ====================
137
138Add support for dense math registers.
139
140The MMA subsystem added the notion of accumulator registers as an optional
141feature of ISA 3.1 (power10). In ISA 3.1, these accumulators overlapped with
142the VSX registers 0..31, but logically the accumulator registers were separate
143from the FPR registers. In ISA 3.1, it was anticipated that in future systems,
144the accumulator registers may not overlap with the FPR registers. This patch
145adds the support for dense math registers as separate registers.
146
147This particular patch does not change the MMA support to use the accumulators
148within the dense math registers. This patch just adds the basic support for
149having separate DMRs. The next patch will switch the MMA support to use the
150accumulators if -mcpu=future is used.
151
152For testing purposes, I added an undocumented option '-mdense-math' to enable
153or disable the dense math support.
154
155This patch adds a new constraint (wD). If MMA is selected but dense math is
156not selected (i.e. -mcpu=power10), the wD constraint will allow access to
157accumulators that overlap with VSX registers 0..31. If both MMA and dense math
158are selected (i.e. -mcpu=future), the wD constraint will only allow dense math
159registers.
160
161This patch modifies the existing %A output modifier. If MMA is selected but
162dense math is not selected, then %A output modifier converts the VSX register
163number to the accumulator number, by dividing it by 4. If both MMA and dense
164math are selected, then %A will map the separate DMR registers into 0..7.
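
The two %A mappings can be modeled in C as follows. This is a sketch, not the
print_operand code: the register numbers (MODEL_FIRST_VSX, MODEL_FIRST_DMR)
are made up for illustration, and rejecting VSX registers that are not a
multiple of 4 is an assumption of the model.

```c
#include <stdbool.h>

/* Hypothetical register numbering, for illustration only.  */
enum { MODEL_FIRST_VSX = 0, MODEL_FIRST_DMR = 100, MODEL_NUM_ACC = 8 };

/* Model of the %A output modifier.
   Returns the accumulator number 0..7, or -1 for an invalid register.  */
static int model_percent_A(int regno, bool dense_math)
{
  if (dense_math)
    {
      /* Separate DMRs: map the DMR register number directly into 0..7.  */
      int n = regno - MODEL_FIRST_DMR;
      return (n >= 0 && n < MODEL_NUM_ACC) ? n : -1;
    }

  /* MMA without dense math: accumulator N overlaps VSX registers
     4*N .. 4*N+3, so divide the VSX register number by 4.  */
  int v = regno - MODEL_FIRST_VSX;
  if (v < 0 || v >= 32 || v % 4 != 0)
    return -1;
  return v / 4;
}
```

In this model, VSX register 28 without dense math and the last DMR with dense
math both print as accumulator 7, so the same asm template text can serve both
configurations.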
165
166The intention is that user code using extended asm can be modified to run on
167both MMA without dense math and MMA with dense math:
168
169 1) If possible, don't use extended asm, but instead use the MMA built-in
170 functions;
171
172 2) If you do need to write extended asm, change the d constraints
173 targeting accumulators to use wD;
174
175 3) Only use the built-in zero, assemble and disassemble functions to
176 move data between vector quad types and dense math accumulators.
177 I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz directly in the
178 extended asm code. The reason is these instructions assume there is a
179 1-to-1 correspondence between 4 adjacent FPR registers and an
180 accumulator that overlaps with those registers. With accumulators
181 now being separate registers, there no longer is a 1-to-1
182 correspondence.
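
Items 1 and 3 can be illustrated with the existing MMA built-in functions
__builtin_mma_xxsetaccz and __builtin_mma_disassemble_acc. The sketch below
lets the compiler choose the accumulator and the move instructions, so the
source does not depend on whether accumulators overlap the FPRs. The function
name zero_and_read_accumulator is hypothetical, and the memset fallback exists
only so the example also builds on compilers without MMA support.

```c
#include <string.h>

/* Zero an accumulator and copy its 512 bits (16 floats) out to memory,
   using only built-in functions instead of xxsetaccz/xxmfacc in asm.  */
static void zero_and_read_accumulator(float out[16])
{
#if defined(__MMA__)
  __vector_quad acc;

  __builtin_mma_xxsetaccz(&acc);             /* zero the accumulator */
  __builtin_mma_disassemble_acc(out, &acc);  /* accumulator -> memory */
#else
  /* Fallback for targets without MMA: same observable result.  */
  memset(out, 0, 16 * sizeof(float));
#endif
}
```

Because no register or instruction is named in the source, the same function
should compile unchanged for -mcpu=power10 and for a dense math target.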
183
184The mangling for DMRs and the GDB register numbers may change in the
185future.
186
1872024-03-19 Michael Meissner <meissner@linux.ibm.com>
188
189 * config/rs6000/mma.md (movxo): Add comments about dense math registers.
190 (movxo_nodm): Rename from movxo and restrict the usage to machines
191 without dense math registers.
192 (movxo_dm): New insn for movxo support for machines with dense math
193 registers.
194 (mma_<acc>): Restrict usage to machines without dense math registers.
195 (mma_xxsetaccz): Make a define_expand, and add support for dense math
196 registers.
197 (mma_xxsetaccz_nodm): Rename from mma_xxsetaccz, and restrict to
198 machines without dense math registers.
199 (mma_dmsetaccz): New insn.
200 * config/rs6000/predicates.md (dmr_operand): New predicate.
201 (accumulator_operand): Add support for dense math registers.
202 * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): Do
203 not de-prime accumulator when disassembling a vector quad.
204 * config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMR_REG_TYPE.
205 (enum rs6000_reload_reg_type): Add RELOAD_REG_DMR.
206 (LAST_RELOAD_REG_CLASS): Add support for DMR registers and the wD
207 constraint.
208 (reload_reg_map): Likewise.
209 (rs6000_reg_names): Likewise.
210 (alt_reg_names): Likewise.
211 (rs6000_hard_regno_nregs_internal): Likewise.
212 (rs6000_hard_regno_mode_ok_uncached): Likewise.
213 (rs6000_debug_reg_global): Likewise.
214 (rs6000_setup_reg_addr_masks): Likewise.
215 (rs6000_init_hard_regno_mode_ok): Likewise.
216 (rs6000_secondary_reload_memory): Add support for DMR registers.
217 (rs6000_secondary_reload_simple_move): Likewise.
218 (rs6000_preferred_reload_class): Likewise.
219 (rs6000_secondary_reload_class): Likewise.
220 (print_operand): Make %A handle both FPRs and DMRs.
221 (rs6000_dmr_register_move_cost): New helper function.
222 (rs6000_register_move_cost): Add support for DMR registers.
223 (rs6000_memory_move_cost): Likewise.
224 (rs6000_compute_pressure_classes): Likewise.
225 (rs6000_debugger_regno): Likewise.
226 (rs6000_split_multireg_move): Add support for DMRs.
227 * config/rs6000/rs6000.h (TARGET_DENSE_MATH): New macro.
228 (TARGET_MMA_DENSE_MATH): Likewise.
229 (TARGET_MMA_NO_DENSE_MATH): Likewise.
230 (UNITS_PER_DMR_WORD): Likewise.
231 (FIRST_PSEUDO_REGISTER): Update for DMRs.
232 (FIXED_REGISTERS): Add DMRs.
233 (CALL_REALLY_USED_REGISTERS): Likewise.
234 (REG_ALLOC_ORDER): Likewise.
235 (DMR_REGNO_P): New macro.
236 (enum reg_class): Add DM_REGS.
237 (REG_CLASS_NAMES): Likewise.
238 (REG_CLASS_CONTENTS): Likewise.
239 (enum r6000_reg_class_enum): Add RS6000_CONSTRAINT_wD.
240 (REGISTER_NAMES): Add DMR registers.
241 (ADDITIONAL_REGISTER_NAMES): Likewise.
242
243==================== Branch work163-dmf, patch #102 ====================
244
245Add wD constraint.
246
247This patch adds a new constraint ('wD') that matches the accumulator registers
248that overlap with VSX registers 0..31 on power10. Future patches will add the
249support for a separate accumulator register class that will be used when the
250support for dense math registers is added.
251
2522024-03-19 Michael Meissner <meissner@linux.ibm.com>
253
254 * config/rs6000/constraints.md (wD): New constraint.
255 * config/rs6000/mma.md (mma_disassemble_acc): Likewise.
256 (mma_<vv>): Likewise.
257 (mma_<avv>): Likewise.
258 (mma_<pv>): Likewise.
259 (mma_<apv>): Likewise.
260 (mma_<vvi4i4i8>): Likewise.
261 (mma_<avvi4i4i8>): Likewise.
262 (mma_<vvi4i4i2>): Likewise.
263 (mma_<avvi4i4i2>): Likewise.
264 (mma_<vvi4i4>): Likewise.
265 (mma_<avvi4i4>): Likewise.
266 (mma_<pvi4i2>): Likewise.
267 (mma_<apvi4i2>): Likewise.
268 (mma_<vvi4i4i4>): Likewise.
269 (mma_<avvi4i4i4>): Likewise.
270 * config/rs6000/predicates.md (accumulator_operand): New predicate.
271 * config/rs6000/rs6000.cc (rs6000_debug_reg_global): Print the register
272 class for the 'wD' constraint.
273 (rs6000_init_hard_regno_mode_ok): Set the 'wD' register constraint
274 class.
275 * config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add element for
276 the 'wD' constraint.
277 * doc/md.texi (PowerPC constraints): Document the 'wD' constraint.
278
279==================== Branch work163-dmf, patch #101 ====================
280
281Use vector pair load/store for memcpy with -mcpu=future
282
283In the development for the power10 processor, GCC did not enable using the load
284vector pair and store vector pair instructions when optimizing things like
285memory copy. This patch enables using those instructions if -mcpu=future is
286used.
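
As an illustration, a fixed-size copy like the one below is the kind of code
this change targets. It is a sketch only: the struct and function names are
hypothetical, and whether the compiler actually expands the memcpy inline with
load vector pair and store vector pair instructions depends on the options and
the compiler version in use.

```c
#include <string.h>

/* Hypothetical 64-byte block; two 32-byte vector pairs would cover it.  */
struct block64 {
  unsigned char bytes[64];
};

/* A constant-size memcpy that the compiler may expand inline.  With
   -mcpu=future the expansion is allowed to use vector pair load/store
   instructions instead of single-vector or scalar moves.  */
void copy_block64(struct block64 *dst, const struct block64 *src)
{
  memcpy(dst, src, sizeof *dst);
}
```

Comparing the assembly generated for -mcpu=power10 and -mcpu=future is one way
to check which instructions were chosen.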
287
2882024-03-18 Michael Meissner <meissner@linux.ibm.com>
289
290gcc/
291
292 * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Enable using
293 load vector pair and store vector pair instructions for memory copy
294 operations.
295 (POWERPC_MASKS): Make the bit for enabling using load vector pair and
296 store vector pair operations set and reset when the PowerPC processor is
297 changed.
298
299==================== Branch work163-dmf, baseline ====================
300
301Add ChangeLog.dmf and update REVISION.
302
3032024-03-18 Michael Meissner <meissner@linux.ibm.com>
304
305gcc/
306
307 * ChangeLog.dmf: New file for branch.
308 * REVISION: Update.
309
3102024-03-18 Michael Meissner <meissner@linux.ibm.com>
311
312 Clone branch