Bug 116321 - [lra][avr] internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572
Summary: [lra][avr] internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 15.0
: P3 normal
Target Milestone: 15.0
Assignee: Not yet assigned to anyone
URL:
Keywords: ice-on-valid-code, ra
Depends on:
Blocks: avr+ra 113932 113934
  Show dependency treegraph
 
Reported: 2024-08-10 09:09 UTC by Georg-Johann Lay
Modified: 2024-09-06 11:53 UTC (History)
2 users (show)

See Also:
Host:
Target: avr
Build:
Known to work:
Known to fail:
Last reconfirmed: 2024-08-10 00:00:00


Attachments
Proof of concept patch (526 bytes, patch)
2024-08-20 22:09 UTC, Richard Sandiford
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Georg-Johann Lay 2024-08-10 09:09:47 UTC
typedef __UINT64_TYPE__ uint64_t;

uint64_t fun64 (const __flash uint64_t *p)
{
  return *p;
}

runs into an ICE:

$ avr-gcc lra-bug.c -S -Os -da -mlra
during RTL pass: shorten
dump file: lra-bug.c.354r.shorten
lra-bug.c: In function 'fun64':
lra-bug.c:6:1: internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572
    6 | }
      | ^

The respective line in avr.cc reads:

      gcc_assert (REG_Z == REGNO (addr));

because the only addressing modes for AS1 __flash are REG and POST_INC of REG_Z (reg:HI 30).  However, the insn fed into the function as produced by LRA is like found in lra-bug.c.317r.reload:

(insn 48 47 49 2 (set (reg:QI 25 r25 [+7 ])
        (mem:QI (reg/f:HI 28 r28 [60]) [1 *p_2(D)+7 S1 A8 AS1])) "lra-bug.c":6:1 86 {movqi_insn_split}
     (nil))

This insn clearly violates avr.cc's REGNO_MODE_CODE_OK_FOR_BASE_P which only allows REG_Z (regno 30) as register for non-generic address-spaces like AS1. And avr.cc'c MODE_CODE_BASE_REG_CLASS has:

  if (!ADDR_SPACE_GENERIC_P (as))
    {
      return POINTER_Z_REGS;
    }

but reg:HI 28 in insn 48 is not an element of POINTER_Z_REGS.


Target: avr
Configured with: ../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared --with-long-double=64 --enable-languages=c,c++
Comment 1 Georg-Johann Lay 2024-08-10 16:59:25 UTC
What I do not understand is when I also set -mlog=legitimate_address_p then I only get logs that have strict=0 and not a single one with strict=1, like:

avr_addr_space_legitimate_address_p[fun64:split5(357)]: ret=true, mode=QI strict=0 reload_completed=1 reload_in_progress=0 (reg_renumber):
(reg/f:HI 28 r28 [60])

This is for pass .split5 that runy way after reload, and strict=0 doesn't make much sense to me.
Comment 2 Richard Biener 2024-08-19 14:02:35 UTC
See the m68k bug - LRA/IRA _never_ use strict = 1
Comment 3 Andreas Schwab 2024-08-19 14:20:53 UTC
But postreload still does.
Comment 4 Georg-Johann Lay 2024-08-19 14:29:45 UTC
(In reply to Richard Biener from comment #2)
> See the m68k bug - LRA/IRA _never_ use strict = 1
You mean PR116236? Its fix says:

> This matters on targets like m68k that support index extension
> and that require different classes for bases and indices.
I still see the ICE on avr with

gcc version 15.0.0 20240818 (experimental) (GCC)

but PR116236 has been fixes 2024-08-15.
Comment 5 Richard Sandiford 2024-08-20 22:09:44 UTC
Created attachment 58964 [details]
Proof of concept patch

The problem seems to be a bad interaction between spilling and register elimination.  I think it would only occur on targets that use a real target register as FRAME_POINTER_REGNUM, rather than having a separate FRAME_POINTER_REGNUM and HARD_FRAME_POINTER_REGNUM.

IRA sees:

(insn 47 46 48 2 (set (reg:QI 24 r24 [+6 ])
        (mem:QI (reg/f:HI 58) [1 *p_2(D)+6 S1 A8 AS1])) "pr116321.c":6:1 86 {movqi_insn_split}
     (expr_list:REG_DEAD (reg/f:HI 58)
        (nil)))
(insn 48 47 49 2 (set (reg:QI 25 r25 [+7 ])
        (mem:QI (reg/f:HI 60) [1 *p_2(D)+7 S1 A8 AS1])) "pr116321.c":6:1 86 {movqi_insn_split}
     (expr_list:REG_DEAD (reg/f:HI 60)
        (nil)))

where both instructions need an r30 base.  As it happens, IRA assigns r30 to r60 .  However, when it comes to reloading earlier insns, we have to use r30 for the base even though r60 is live, so we need to spill r60:

********** Assignment #1: **********

         Assigning to 91 (cl=POINTER_Z_REGS, orig=58, freq=2000, tfirst=91, tfreq=2000)...
         Trying 30: spill 60(freq=2000)  Now best 30(cost=0, bad_spills=0, insn_pseudos=0)

         Trying 31: spill 60(freq=2000)
      Spill r60(hr=30, freq=2000) for r91
           Assign 30 to reload r91 (freq=2000)
         Assigning to 85 (cl=LD_REGS, orig=50, freq=3000, tfirst=85, tfreq=3000)...
           Assign 30 to reload r85 (freq=3000)
         Assigning to 86 (cl=LD_REGS, orig=52, freq=3000, tfirst=86, tfreq=3000)...
           Assign 30 to reload r86 (freq=3000)
         Assigning to 87 (cl=LD_REGS, orig=54, freq=3000, tfirst=87, tfreq=3000)...
           Assign 30 to reload r87 (freq=3000)
         Assigning to 88 (cl=LD_REGS, orig=56, freq=3000, tfirst=88, tfreq=3000)...
           Assign 30 to reload r88 (freq=3000)
         Assigning to 89 (cl=LD_REGS, orig=58, freq=3000, tfirst=89, tfreq=3000)...
           Assign 26 to reload r89 (freq=3000)
         Assigning to 90 (cl=LD_REGS, orig=77, freq=3000, tfirst=90, tfreq=3000)...
           Assign 24 to reload r90 (freq=3000)
  Reassigning non-reload pseudos
           Assign 28 to r60 (freq=2000)

where r28 is the eliminated FRAME_POINTER_REGNUM.  So far so good (AFAICT).  But when trying to reload insn 48 after the spill, we use get_reg_class to work out what class the base register (r60) has.  This is structured as:

  if (! HARD_REGISTER_NUM_P (hard_regno = regno))
    hard_regno = lra_get_regno_hard_regno (regno);
  if (hard_regno >= 0)
    {
      hard_regno = lra_get_elimination_hard_regno (hard_regno);
      return REGNO_REG_CLASS (hard_regno);
    }

So we replace r60 with r28 thanks to the assignment, but then apply eliminations to r28 to get r32 (the stack pointer).  Things go downhill from there.

This patch is a naive proof-of-concept fix, on the basis that eliminations should only apply to registers that aren't subject to allocation.  But I imagine something somewhere else will break.
Comment 6 GCC Commits 2024-08-27 08:48:41 UTC
The trunk branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:9db997e5ac4a206b9428eb2447fcdc90e37725f4

commit r15-3212-g9db997e5ac4a206b9428eb2447fcdc90e37725f4
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Tue Aug 27 09:48:28 2024 +0100

    lra: Don't apply eliminations to allocated registers [PR116321]
    
    The sequence of events in this PR is that:
    
    - the function has many addresses in which only a single hard base
      register is acceptable.  Let's call the hard register H.
    
    - IRA allocates that register to one of the pseudo base registers.
      Let's call the pseudo register P.
    
    - Some of the other addresses that require H occur when P is still live.
    
    - LRA therefore has to spill P.
    
    - When it reallocates P, LRA chooses to use FRAME_POINTER_REGNUM,
      which has been eliminated to the stack pointer.  (This is ok,
      since the frame register is free.)
    
    - Spilling P causes LRA to reprocess the instruction that uses P.
    
    - When reprocessing the address that has P as its base, LRA first
      applies the new allocation, to get FRAME_POINTER_REGNUM,
      and then applies the elimination, to get the stack pointer.
    
    The last step seems wrong: the elimination should only apply to
    pre-existing uses of FRAME_POINTER_REGNUM, not to uses that result
    from allocating pseudos.  Applying both means that we get the wrong
    register number, and therefore the wrong class.
    
    The PR is about an existing testcase that fails with LRA on m86k.
    
    gcc/
            PR middle-end/116321
            * lra-constraints.cc (get_hard_regno): Only apply eliminations
            to existing hard registers.
            (get_reg_class): Likewise.
Comment 7 Georg-Johann Lay 2024-08-29 16:43:08 UTC
(In reply to GCC Commits from comment #6)
>     The PR is about an existing testcase that fails with LRA on m86k.
>     
>     gcc/
>             PR middle-end/116321
>             * lra-constraints.cc (get_hard_regno): Only apply eliminations
>             to existing hard registers.
>             (get_reg_class): Likewise.
Hi Richard, as far as I can tell the bug is still somewhere in LRA.

With your patch and -Os like in the initial report, the test compiles fine.  But with -O0, it still breaks:

$ avr-gcc -S lra-bug.c -O0 -mlra

lra-bug.c:6:1: internal compiler error: in get_reload_reg, at lra-constraints.cc:755
    6 | }
      | ^

gcc version 15.0.0 20240829 (experimental) (GCC)

...ok, the ICE is actually from a different place now (LRA instead of backend), so the remaining ICE is PR116326 I guess?
Comment 8 GCC Commits 2024-09-06 11:50:04 UTC
The master branch has been updated by Georg-Johann Lay <gjl@gcc.gnu.org>:

https://gcc.gnu.org/g:e8378231bb88582274e641e57766da613fe067bd

commit r15-3512-ge8378231bb88582274e641e57766da613fe067bd
Author: Georg-Johann Lay <avr@gjlay.de>
Date:   Fri Sep 6 13:47:12 2024 +0200

    AVR: lra/116321 - Add test case.
    
            PR rtl-optimization/116321
    gcc/testsuite/
            * gcc.target/avr/torture/lra-pr116321.c: New test.
Comment 9 Georg-Johann Lay 2024-09-06 11:53:11 UTC
Fixed by r15-3212.  The remaining ICE is a different PR116326.