typedef __UINT64_TYPE__ uint64_t; uint64_t fun64 (const __flash uint64_t *p) { return *p; } runs into an ICE: $ avr-gcc lra-bug.c -S -Os -da -mlra during RTL pass: shorten dump file: lra-bug.c.354r.shorten lra-bug.c: In function 'fun64': lra-bug.c:6:1: internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572 6 | } | ^ The respective line in avr.cc reads: gcc_assert (REG_Z == REGNO (addr)); because the only addressing modes for AS1 __flash are REG and POST_INC of REG_Z (reg:HI 30). However, the insn fed into the function as produced by LRA is like found in lra-bug.c.317r.reload: (insn 48 47 49 2 (set (reg:QI 25 r25 [+7 ]) (mem:QI (reg/f:HI 28 r28 [60]) [1 *p_2(D)+7 S1 A8 AS1])) "lra-bug.c":6:1 86 {movqi_insn_split} (nil)) This insn clearly violates avr.cc's REGNO_MODE_CODE_OK_FOR_BASE_P which only allows REG_Z (regno 30) as register for non-generic address-spaces like AS1. And avr.cc'c MODE_CODE_BASE_REG_CLASS has: if (!ADDR_SPACE_GENERIC_P (as)) { return POINTER_Z_REGS; } but reg:HI 28 in insn 48 is not an element of POINTER_Z_REGS. Target: avr Configured with: ../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared --with-long-double=64 --enable-languages=c,c++
What I do not understand is when I also set -mlog=legitimate_address_p then I only get logs that have strict=0 and not a single one with strict=1, like: avr_addr_space_legitimate_address_p[fun64:split5(357)]: ret=true, mode=QI strict=0 reload_completed=1 reload_in_progress=0 (reg_renumber): (reg/f:HI 28 r28 [60]) This is for pass .split5 that runy way after reload, and strict=0 doesn't make much sense to me.
See the m68k bug - LRA/IRA _never_ use strict = 1
But postreload still does.
(In reply to Richard Biener from comment #2) > See the m68k bug - LRA/IRA _never_ use strict = 1 You mean PR116236? Its fix says: > This matters on targets like m68k that support index extension > and that require different classes for bases and indices. I still see the ICE on avr with gcc version 15.0.0 20240818 (experimental) (GCC) but PR116236 has been fixes 2024-08-15.
Created attachment 58964 [details] Proof of concept patch The problem seems to be a bad interaction between spilling and register elimination. I think it would only occur on targets that use a real target register as FRAME_POINTER_REGNUM, rather than having a separate FRAME_POINTER_REGNUM and HARD_FRAME_POINTER_REGNUM. IRA sees: (insn 47 46 48 2 (set (reg:QI 24 r24 [+6 ]) (mem:QI (reg/f:HI 58) [1 *p_2(D)+6 S1 A8 AS1])) "pr116321.c":6:1 86 {movqi_insn_split} (expr_list:REG_DEAD (reg/f:HI 58) (nil))) (insn 48 47 49 2 (set (reg:QI 25 r25 [+7 ]) (mem:QI (reg/f:HI 60) [1 *p_2(D)+7 S1 A8 AS1])) "pr116321.c":6:1 86 {movqi_insn_split} (expr_list:REG_DEAD (reg/f:HI 60) (nil))) where both instructions need an r30 base. As it happens, IRA assigns r30 to r60 . However, when it comes to reloading earlier insns, we have to use r30 for the base even though r60 is live, so we need to spill r60: ********** Assignment #1: ********** Assigning to 91 (cl=POINTER_Z_REGS, orig=58, freq=2000, tfirst=91, tfreq=2000)... Trying 30: spill 60(freq=2000) Now best 30(cost=0, bad_spills=0, insn_pseudos=0) Trying 31: spill 60(freq=2000) Spill r60(hr=30, freq=2000) for r91 Assign 30 to reload r91 (freq=2000) Assigning to 85 (cl=LD_REGS, orig=50, freq=3000, tfirst=85, tfreq=3000)... Assign 30 to reload r85 (freq=3000) Assigning to 86 (cl=LD_REGS, orig=52, freq=3000, tfirst=86, tfreq=3000)... Assign 30 to reload r86 (freq=3000) Assigning to 87 (cl=LD_REGS, orig=54, freq=3000, tfirst=87, tfreq=3000)... Assign 30 to reload r87 (freq=3000) Assigning to 88 (cl=LD_REGS, orig=56, freq=3000, tfirst=88, tfreq=3000)... Assign 30 to reload r88 (freq=3000) Assigning to 89 (cl=LD_REGS, orig=58, freq=3000, tfirst=89, tfreq=3000)... Assign 26 to reload r89 (freq=3000) Assigning to 90 (cl=LD_REGS, orig=77, freq=3000, tfirst=90, tfreq=3000)... Assign 24 to reload r90 (freq=3000) Reassigning non-reload pseudos Assign 28 to r60 (freq=2000) where r28 is the eliminated FRAME_POINTER_REGNUM. So far so good (AFAICT). But when trying to reload insn 48 after the spill, we use get_reg_class to work out what class the base register (r60) has. This is structured as: if (! HARD_REGISTER_NUM_P (hard_regno = regno)) hard_regno = lra_get_regno_hard_regno (regno); if (hard_regno >= 0) { hard_regno = lra_get_elimination_hard_regno (hard_regno); return REGNO_REG_CLASS (hard_regno); } So we replace r60 with r28 thanks to the assignment, but then apply eliminations to r28 to get r32 (the stack pointer). Things go downhill from there. This patch is a naive proof-of-concept fix, on the basis that eliminations should only apply to registers that aren't subject to allocation. But I imagine something somewhere else will break.
The trunk branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>: https://gcc.gnu.org/g:9db997e5ac4a206b9428eb2447fcdc90e37725f4 commit r15-3212-g9db997e5ac4a206b9428eb2447fcdc90e37725f4 Author: Richard Sandiford <richard.sandiford@arm.com> Date: Tue Aug 27 09:48:28 2024 +0100 lra: Don't apply eliminations to allocated registers [PR116321] The sequence of events in this PR is that: - the function has many addresses in which only a single hard base register is acceptable. Let's call the hard register H. - IRA allocates that register to one of the pseudo base registers. Let's call the pseudo register P. - Some of the other addresses that require H occur when P is still live. - LRA therefore has to spill P. - When it reallocates P, LRA chooses to use FRAME_POINTER_REGNUM, which has been eliminated to the stack pointer. (This is ok, since the frame register is free.) - Spilling P causes LRA to reprocess the instruction that uses P. - When reprocessing the address that has P as its base, LRA first applies the new allocation, to get FRAME_POINTER_REGNUM, and then applies the elimination, to get the stack pointer. The last step seems wrong: the elimination should only apply to pre-existing uses of FRAME_POINTER_REGNUM, not to uses that result from allocating pseudos. Applying both means that we get the wrong register number, and therefore the wrong class. The PR is about an existing testcase that fails with LRA on m86k. gcc/ PR middle-end/116321 * lra-constraints.cc (get_hard_regno): Only apply eliminations to existing hard registers. (get_reg_class): Likewise.
(In reply to GCC Commits from comment #6) > The PR is about an existing testcase that fails with LRA on m86k. > > gcc/ > PR middle-end/116321 > * lra-constraints.cc (get_hard_regno): Only apply eliminations > to existing hard registers. > (get_reg_class): Likewise. Hi Richard, as far as I can tell the bug is still somewhere in LRA. With your patch and -Os like in the initial report, the test compiles fine. But with -O0, it still breaks: $ avr-gcc -S lra-bug.c -O0 -mlra lra-bug.c:6:1: internal compiler error: in get_reload_reg, at lra-constraints.cc:755 6 | } | ^ gcc version 15.0.0 20240829 (experimental) (GCC) ...ok, the ICE is actually from a different place now (LRA instead of backend), so the remaining ICE is PR116326 I guess?
The master branch has been updated by Georg-Johann Lay <gjl@gcc.gnu.org>: https://gcc.gnu.org/g:e8378231bb88582274e641e57766da613fe067bd commit r15-3512-ge8378231bb88582274e641e57766da613fe067bd Author: Georg-Johann Lay <avr@gjlay.de> Date: Fri Sep 6 13:47:12 2024 +0200 AVR: lra/116321 - Add test case. PR rtl-optimization/116321 gcc/testsuite/ * gcc.target/avr/torture/lra-pr116321.c: New test.
Fixed by r15-3212. The remaining ICE is a different PR116326.