Bug 106751 - [10 Regression] internal compiler error: in purge_dead_edges with inline-asm goto
Summary: [10 Regression] internal compiler error: in purge_dead_edges with inline-asm ...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 12.2.0
: P2 normal
Target Milestone: 10.5
Assignee: Jakub Jelinek
URL:
Keywords: ice-on-valid-code, inline-asm
Depends on:
Blocks:
 
Reported: 2022-08-26 10:09 UTC by Christophe Leroy
Modified: 2023-05-04 07:19 UTC (History)
5 users (show)

See Also:
Host:
Target: powerpc64-linux, aarch64-linux-gnu arm-linux-gnu-eabi
Build:
Known to work:
Known to fail: 4.5.4, 5.4.0
Last reconfirmed: 2022-08-26 00:00:00


Attachments
Preprocessed source (154.64 KB, application/x-xz)
2022-08-26 10:10 UTC, Christophe Leroy
Details
gcc13-pr106751.patch (654 bytes, patch)
2022-12-15 16:53 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Christophe Leroy 2022-08-26 10:09:53 UTC
powerpc64-linux-gcc: warning: '-pipe' ignored because '-save-temps' specified
Using built-in specs.
COLLECT_AS_OPTIONS='-maltivec' '-mpower4' '-many'
COLLECT_GCC=/opt/gcc-12.2.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc
Target: powerpc64-linux
Configured with: /home/arnd/git/gcc/configure --target=powerpc64-linux --enable-targets=all --prefix=/home/arnd/cross/x86_64/gcc-12.2.0-nolibc/powerpc64-linux --enable-languages=c --without-headers --disable-bootstrap --disable-nls --disable-threads --disable-shared --disable-libmudflap --disable-libssp --disable-libgomp --disable-decimal-float --disable-libquadmath --disable-libatomic --disable-libcc1 --disable-libmpx --enable-checking=release
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 12.2.0 (GCC) 
COLLECT_GCC_OPTIONS='-nostdinc' '-I' './arch/powerpc/include' '-I' './arch/powerpc/include/generated' '-I' './include' '-I' './arch/powerpc/include/uapi' '-I' './arch/powerpc/include/generated/uapi' '-I' './include/uapi' '-I' './include/generated/uapi' '-include' './include/linux/compiler-version.h' '-include' './include/linux/kconfig.h' '-include' './include/linux/compiler_types.h' '-D' '__KERNEL__' '-I' './arch/powerpc' '-D' 'HAVE_AS_ATHIGH=1' '-fmacro-prefix-map=./=' '-Wall' '-Wundef' '-Werror=strict-prototypes' '-Wno-trigraphs' '-fno-strict-aliasing' '-fno-common' '-fshort-wchar' '-fno-PIE' '-Werror=implicit-function-declaration' '-Werror=implicit-int' '-Werror=return-type' '-Wno-format-security' '-std=gnu11' '-mlittle-endian' '-m64' '-msoft-float' '-pipe' '-mtraceback=no' '-mabi=elfv2' '-mcmodel=medium' '-mno-pointers-to-nested-functions' '-mcpu=power8' '-mtune=power9' '-mno-altivec' '-mno-vsx' '-fno-asynchronous-unwind-tables' '-mno-strict-align' '-mlittle-endian' '-mstack-protector-guard=tls' '-mstack-protector-guard-reg=r13' '-fno-delete-null-pointer-checks' '-Wno-frame-address' '-Wformat-truncation=0' '-Wformat-overflow=0' '-Wno-address-of-packed-member' '-O2' '-fno-allow-store-data-races' '-Wframe-larger-than=2048' '-fstack-protector-strong' '-Wimplicit-fallthrough=5' '-Wno-main' '-Wno-unused-but-set-variable' '-Wunused-const-variable=0' '-fno-stack-clash-protection' '-pg' '-mprofile-kernel' '-Wdeclaration-after-statement' '-Wvla' '-Wno-pointer-sign' '-Wcast-function-type' '-Wno-stringop-truncation' '-Wstringop-overflow=0' '-Wno-restrict' '-Wno-maybe-uninitialized' '-Walloc-size-larger-than=18446744073709551615EiB' '-fno-strict-overflow' '-fstack-check=no' '-fconserve-stack' '-Werror=date-time' '-Werror=incompatible-pointer-types' '-Werror=designated-init' '-Wno-packed-not-aligned' '-mstack-protector-guard-offset=3200' '-D' 'KBUILD_MODFILE="kernel/locking/rtmutex_api"' '-D' 'KBUILD_BASENAME="rtmutex_api"' '-D' 'KBUILD_MODNAME="rtmutex_api"' '-D' '__KBUILD_MODNAME=kmod_rtmutex_api' '-c' '-o' 'kernel/locking/rtmutex_api.o' '-save-temps' '-v' '-dumpdir' 'kernel/locking/'
 /opt/gcc-12.2.0-nolibc/powerpc64-linux/bin/../libexec/gcc/powerpc64-linux/12.2.0/cc1 -E -quiet -nostdinc -v -I ./arch/powerpc/include -I ./arch/powerpc/include/generated -I ./include -I ./arch/powerpc/include/uapi -I ./arch/powerpc/include/generated/uapi -I ./include/uapi -I ./include/generated/uapi -I ./arch/powerpc -imultilib le -iprefix /opt/gcc-12.2.0-nolibc/powerpc64-linux/bin/../lib/gcc/powerpc64-linux/12.2.0/ -D __KERNEL__ -D HAVE_AS_ATHIGH=1 -D KBUILD_MODFILE="kernel/locking/rtmutex_api" -D KBUILD_BASENAME="rtmutex_api" -D KBUILD_MODNAME="rtmutex_api" -D __KBUILD_MODNAME=kmod_rtmutex_api -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -MMD kernel/locking/.rtmutex_api.o.d kernel/locking/rtmutex_api.c -mlittle-endian -m64 -msoft-float -mtraceback=no -mabi=elfv2 -mcmodel=medium -mno-pointers-to-nested-functions -mcpu=power8 -mtune=power9 -mno-altivec -mno-vsx -mno-strict-align -mlittle-endian -mstack-protector-guard=tls -mstack-protector-guard-reg=r13 -mprofile-kernel -mstack-protector-guard-offset=3200 -std=gnu11 -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Wno-format-security -Wno-frame-address -Wformat-truncation=0 -Wformat-overflow=0 -Wno-address-of-packed-member -Wframe-larger-than=2048 -Wimplicit-fallthrough=5 -Wno-main -Wno-unused-but-set-variable -Wunused-const-variable=0 -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wcast-function-type -Wno-stringop-truncation -Wstringop-overflow=0 -Wno-restrict -Wno-maybe-uninitialized -Walloc-size-larger-than=18446744073709551615EiB -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -Wno-packed-not-aligned -fmacro-prefix-map=./= -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -fno-asynchronous-unwind-tables -fno-delete-null-pointer-checks -fno-allow-store-data-races -fstack-protector-strong -fno-stack-clash-protection -fno-strict-overflow -fstack-check=no -fconserve-stack -O2 -fpch-preprocess -o kernel/locking/rtmutex_api.i
#include "..." search starts here:
#include <...> search starts here:
 ./arch/powerpc/include
 ./arch/powerpc/include/generated
 ./include
 ./arch/powerpc/include/uapi
 ./arch/powerpc/include/generated/uapi
 ./include/uapi
 ./include/generated/uapi
 ./arch/powerpc
End of search list.
COLLECT_GCC_OPTIONS='-nostdinc' '-I' './arch/powerpc/include' '-I' './arch/powerpc/include/generated' '-I' './include' '-I' './arch/powerpc/include/uapi' '-I' './arch/powerpc/include/generated/uapi' '-I' './include/uapi' '-I' './include/generated/uapi' '-include' './include/linux/compiler-version.h' '-include' './include/linux/kconfig.h' '-include' './include/linux/compiler_types.h' '-D' '__KERNEL__' '-I' './arch/powerpc' '-D' 'HAVE_AS_ATHIGH=1' '-fmacro-prefix-map=./=' '-Wall' '-Wundef' '-Werror=strict-prototypes' '-Wno-trigraphs' '-fno-strict-aliasing' '-fno-common' '-fshort-wchar' '-fno-PIE' '-Werror=implicit-function-declaration' '-Werror=implicit-int' '-Werror=return-type' '-Wno-format-security' '-std=gnu11' '-mlittle-endian' '-m64' '-msoft-float' '-pipe' '-mtraceback=no' '-mabi=elfv2' '-mcmodel=medium' '-mno-pointers-to-nested-functions' '-mcpu=power8' '-mtune=power9' '-mno-altivec' '-mno-vsx' '-fno-asynchronous-unwind-tables' '-mno-strict-align' '-mlittle-endian' '-mstack-protector-guard=tls' '-mstack-protector-guard-reg=r13' '-fno-delete-null-pointer-checks' '-Wno-frame-address' '-Wformat-truncation=0' '-Wformat-overflow=0' '-Wno-address-of-packed-member' '-O2' '-fno-allow-store-data-races' '-Wframe-larger-than=2048' '-fstack-protector-strong' '-Wimplicit-fallthrough=5' '-Wno-main' '-Wno-unused-but-set-variable' '-Wunused-const-variable=0' '-fno-stack-clash-protection' '-pg' '-mprofile-kernel' '-Wdeclaration-after-statement' '-Wvla' '-Wno-pointer-sign' '-Wcast-function-type' '-Wno-stringop-truncation' '-Wstringop-overflow=0' '-Wno-restrict' '-Wno-maybe-uninitialized' '-Walloc-size-larger-than=18446744073709551615EiB' '-fno-strict-overflow' '-fstack-check=no' '-fconserve-stack' '-Werror=date-time' '-Werror=incompatible-pointer-types' '-Werror=designated-init' '-Wno-packed-not-aligned' '-mstack-protector-guard-offset=3200' '-D' 'KBUILD_MODFILE="kernel/locking/rtmutex_api"' '-D' 'KBUILD_BASENAME="rtmutex_api"' '-D' 'KBUILD_MODNAME="rtmutex_api"' '-D' '__KBUILD_MODNAME=kmod_rtmutex_api' '-c' '-o' 'kernel/locking/rtmutex_api.o' '-save-temps' '-v' '-dumpdir' 'kernel/locking/'
 /opt/gcc-12.2.0-nolibc/powerpc64-linux/bin/../libexec/gcc/powerpc64-linux/12.2.0/cc1 -fpreprocessed kernel/locking/rtmutex_api.i -quiet -dumpdir kernel/locking/ -dumpbase rtmutex_api.c -dumpbase-ext .c -mlittle-endian -m64 -msoft-float -mtraceback=no -mabi=elfv2 -mcmodel=medium -mno-pointers-to-nested-functions -mcpu=power8 -mtune=power9 -mno-altivec -mno-vsx -mno-strict-align -mlittle-endian -mstack-protector-guard=tls -mstack-protector-guard-reg=r13 -mprofile-kernel -mstack-protector-guard-offset=3200 -O2 -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Wno-format-security -Wno-frame-address -Wformat-truncation=0 -Wformat-overflow=0 -Wno-address-of-packed-member -Wframe-larger-than=2048 -Wimplicit-fallthrough=5 -Wno-main -Wno-unused-but-set-variable -Wunused-const-variable=0 -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wcast-function-type -Wno-stringop-truncation -Wstringop-overflow=0 -Wno-restrict -Wno-maybe-uninitialized -Walloc-size-larger-than=18446744073709551615EiB -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -Wno-packed-not-aligned -std=gnu11 -version -p -fmacro-prefix-map=./= -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -fno-asynchronous-unwind-tables -fno-delete-null-pointer-checks -fno-allow-store-data-races -fstack-protector-strong -fno-stack-clash-protection -fno-strict-overflow -fstack-check=no -fconserve-stack -o kernel/locking/rtmutex_api.s
GNU C11 (GCC) version 12.2.0 (powerpc64-linux)
	compiled by GNU C version 8.3.0, GMP version 6.1.0, MPFR version 3.1.6, MPC version 1.0.3, isl version isl-0.18-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C11 (GCC) version 12.2.0 (powerpc64-linux)
	compiled by GNU C version 8.3.0, GMP version 6.1.0, MPFR version 3.1.6, MPC version 1.0.3, isl version isl-0.18-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 85577641c47121e3caf5e8a8694ad06b
during RTL pass: combine
In file included from kernel/locking/rtmutex_api.c:9:
kernel/locking/rtmutex.c: In function '__rt_mutex_slowlock.constprop':
kernel/locking/rtmutex.c:1612:1: internal compiler error: in purge_dead_edges, at cfgrtl.cc:3347
 1612 | }
      | ^
Please submit a full bug report, with preprocessed source (by using -freport-bug).
See <https://gcc.gnu.org/bugs/> for instructions.
Comment 1 Christophe Leroy 2022-08-26 10:10:58 UTC
Created attachment 53512 [details]
Preprocessed source
Comment 2 Christophe Leroy 2022-08-26 12:20:30 UTC
Same problem with at least gcc 11.3, 10.3, 9.5, 8.5 and 5.5
Comment 3 Martin Liška 2022-08-26 12:38:10 UTC
Reduced to:

$ cat rtmutex_api.i
struct task_struct *get_current();
struct task_struct {
  int __state;
} __attribute____rt_mutex_slowlock_locked() {
  asm goto("" : : : : __label_warn_on);
  __builtin_unreachable();
__label_warn_on:
  while (1)
    get_current()->__state = 0;
}

$ ppc64-linux-gnu-gcc rtmutex_api.i -c -O2
during RTL pass: combine
rtmutex_api.i: In function ‘__attribute____rt_mutex_slowlock_locked’:
rtmutex_api.i:10:1: internal compiler error: in purge_dead_edges, at cfgrtl.cc:3347
   10 | }
      | ^
0x5e35b6 purge_dead_edges(basic_block_def*)
	/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-trunk-ppc64/build/gcc/cfgrtl.cc:3347
0x773947 purge_all_dead_edges()
	/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-trunk-ppc64/build/gcc/cfgrtl.cc:3368
0x13dc33f combine_instructions
	/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-trunk-ppc64/build/gcc/combine.cc:1434
0x13dc33f rest_of_handle_combine
	/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-trunk-ppc64/build/gcc/combine.cc:14978
0x13dc33f execute
	/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-trunk-ppc64/build/gcc/combine.cc:15023
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
Comment 4 Andrew Pinski 2022-08-26 18:08:45 UTC
On x86_64 we get:

(jump_insn 5 2 8 2 (parallel [
            (asm_operands/v ("# %0") ("") 0 []
                 []
                 [
                    (label_ref:DI 8)
                ] /app/example.cpp:7)
            (clobber (reg:CC 17 flags))
        ]) "/app/example.cpp":7:3 -1
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil))
 -> 8)
(code_label 8 5 9 3 2 ("__label_warn_on") [1 uses])
(note 9 8 10 3 [bb 3] NOTE_INSN_BASIC_BLOCK)


But on PowerPC we get:

(jump_insn 5 2 12 2 (parallel [
            (asm_operands/v ("# %0") ("") 0 []
                 []
                 [
                    (label_ref:SI 8)
                ] /app/example.cpp:7)
            (clobber (reg:SI 98 ca))
        ]) "/app/example.cpp":7:3 -1
     (expr_list:REG_UNUSED (reg:SI 98 ca)
        (nil))
 -> 8)
(insn 12 5 8 2 (set (reg:SI 118)
        (const_int 0 [0])) "/app/example.cpp":11:28 545 {*movsi_internal1}
     (nil))
(code_label 8 12 9 3 2 ("__label_warn_on") [1 uses])

Notice the move before the the label, that is wrong.
Comment 5 Andrew Pinski 2022-08-26 18:14:06 UTC
loop2_invariant moves the instruction.
You can also reproduce the failure on aarch64 with the following:
```
struct task_struct;
struct task_struct *get_current();
struct task_struct {
  int __state;
};
void f() {
  asm goto("# %0" : : : : __label_warn_on);
  __builtin_unreachable();
__label_warn_on:
  while (1)
    get_current()->__state = 1;
}
```
Which changes 0 for 1(on aarch64, zero can be used for a store which is why it works for 0).
It works on x86_64 because the move accepts the 1.

Note on the trunk, the ICE is the following:
<source>: In function 'void f()':
<source>:12:1: error: in basic block 2:
   12 | }
      | ^
<source>:12:1: error: flow control insn inside a basic block
(jump_insn 5 2 12 2 (asm_operands/v ("# %0") ("") 0 []
         []
         [
            (label_ref:DI 8)
        ] <source>:7) "<source>":7:3 -1
     (nil)
 -> 8)
during RTL pass: loop2_invariant
dump file: /app/output.cpp.272r.loop2_invariant
<source>:12:1: internal compiler error: in rtl_verify_bb_insns, at cfgrtl.cc:2797
Please submit a full bug report, with preprocessed source (by using -freport-bug).
See <https://gcc.gnu.org/bugs/> for instructions.
Compiler returned: 1
Comment 6 Andrew Pinski 2022-08-26 18:21:07 UTC
Fails all the way back to 4.5.4 with my reduced testcase on arm-linux-gnu-eabi . I suspect it has always failed and if it passed with a larger testcase then it was just a latent bug being exposed.
Comment 7 Segher Boessenkool 2022-09-13 15:50:54 UTC
For me (powerpc64-linux) it fails with

===
106751.c:10:1: error: flow control insn inside a basic block
(jump_insn 6 3 13 2 (parallel [
            (asm_operands/v ("") ("") 0 []
                 []
                 [
                    (label_ref:DI 9)
                ] 106751.c:5)
            (clobber (reg:SI 98 ca))
        ]) "106751.c":5:3 -1
     (expr_list:REG_UNUSED (reg:SI 98 ca)
        (nil))
 -> 9)
during RTL pass: loop2_invariant
===

That pass did
===
Set in insn 13 is invariant (0), cost 4, depends on 
Decided to move invariant 0 -- gain 4
Invariant 0 moved without introducing a new temporary register
changing bb of uid 13
  from 3 to 2
===

It moved the insn after a jump_insn, not a good idea:

===
(jump_insn 6 3 13 2 (parallel [
            (asm_operands/v ("") ("") 0 []
                 []
                 [
                    (label_ref:DI 9)
                ] 106751.c:5)
            (clobber (reg:SI 98 ca))
        ]) "106751.c":5:3 -1
     (expr_list:REG_UNUSED (reg:SI 98 ca)
        (nil))
 -> 9)
(insn 13 6 9 2 (set (reg:SI 119)
        (const_int 0 [0])) "106751.c":9:28 562 {*movsi_internal1}
     (nil))
;;  succ:       3 [always]  count:10631108 (estimated locally)
===

Maybe it does not see this is an unconditional jump insn?
Comment 8 Jakub Jelinek 2022-12-15 13:10:31 UTC
Reduced testcase that ICEs on x86_64-linux too (-O2):
int *foo (void);

void
bar (void)
{
  asm goto ("" : : : : lab);
  __builtin_unreachable ();
lab:
  while (1)
    {
      int o;
      asm ("" : "=r" (o) : "g" (1));
      *foo () = o;
    }
}
Started with r0-128630-ga3afdbb80906a5553a64f9ba7686a57d2f43f536
Comment 9 Jakub Jelinek 2022-12-15 15:51:12 UTC
I'm afraid I don't know much about the RTL lim, but
1840	      reorder_insns (inv->insn, inv->insn, BB_END (preheader));
with no check whether it is actually ok to append the statement at the end of the preheader bb (at least no check that I can find) sounds wrong to me and I'm just confused why it doesn't break stuff much more often.
From what I can see, loop-invariant.cc pass happens in between loop2_init and loop2_done passes where the former calls loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
which should ensure that loops have preheaders, but doesn't actually ask for
LOOPS_HAVE_FALLTHRU_PREHEADERS.
If the latter was on, then create_preheader would force splitting of the edge:
          /* If we want fallthru preheaders, also create forwarder block when
             preheader ends with a jump or has predecessors from loop.  */
          else if ((flags & CP_FALLTHRU_PREHEADERS)
                   && (JUMP_P (BB_END (single_entry->src))
                       || has_preds_from_loop (single_entry->src, loop)))
            need_forwarder_block = true;
So, should loop-invariant.cc split_edge the preheader edge if preheader block ends with JUMP_P, or shall we somehow arrange for LOOPS_HAVE_FALLTHRU_PREHEADERS?
Comment 10 Jakub Jelinek 2022-12-15 16:53:17 UTC
Created attachment 54103 [details]
gcc13-pr106751.patch

This seems to work on the testcase.
Comment 11 Richard Biener 2022-12-15 18:05:55 UTC
(In reply to Jakub Jelinek from comment #10)
> Created attachment 54103 [details]
> gcc13-pr106751.patch
> 
> This seems to work on the testcase.

looks reasonable.
Comment 12 GCC Commits 2022-12-16 09:20:36 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:ddcaa60983b50378bde1b7e327086fe0ce101795

commit r13-4739-gddcaa60983b50378bde1b7e327086fe0ce101795
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Dec 16 10:19:22 2022 +0100

    loop-invariant: Split preheader edge if the preheader bb ends with jump [PR106751]
    
    The RTL loop passes only request simple preheaders, but don't require
    fallthru preheaders, while move_invariant_reg apparently assumes the
    latter, that it can just append instruction(s) to the end of the preheader
    basic block.
    
    The following patch fixes that by splitting the preheader edge if
    the preheader bb ends with a JUMP_INSN (asm goto in this case).
    Without that we get control flow in the middle of a bb.
    
    2022-12-16  Jakub Jelinek  <jakub@redhat.com>
    
            PR rtl-optimization/106751
            * loop-invariant.cc (move_invariant_reg): If preheader bb ends
            with a JUMP_INSN, split the preheader edge and emit invariants
            into the new preheader basic block.
    
            * gcc.c-torture/compile/pr106751.c: New test.
Comment 13 Jakub Jelinek 2022-12-16 09:23:14 UTC
Fixed on the trunk so far.
Comment 14 GCC Commits 2023-02-10 17:44:36 UTC
The releases/gcc-12 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:4db6e1bf2f1647521dcd709bc3673f565fc327a5

commit r12-9128-g4db6e1bf2f1647521dcd709bc3673f565fc327a5
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Dec 16 10:19:22 2022 +0100

    loop-invariant: Split preheader edge if the preheader bb ends with jump [PR106751]
    
    The RTL loop passes only request simple preheaders, but don't require
    fallthru preheaders, while move_invariant_reg apparently assumes the
    latter, that it can just append instruction(s) to the end of the preheader
    basic block.
    
    The following patch fixes that by splitting the preheader edge if
    the preheader bb ends with a JUMP_INSN (asm goto in this case).
    Without that we get control flow in the middle of a bb.
    
    2022-12-16  Jakub Jelinek  <jakub@redhat.com>
    
            PR rtl-optimization/106751
            * loop-invariant.cc (move_invariant_reg): If preheader bb ends
            with a JUMP_INSN, split the preheader edge and emit invariants
            into the new preheader basic block.
    
            * gcc.c-torture/compile/pr106751.c: New test.
    
    (cherry picked from commit ddcaa60983b50378bde1b7e327086fe0ce101795)
Comment 15 Jakub Jelinek 2023-02-10 17:56:55 UTC
Fixed for gcc 12.3 too.
Comment 16 GCC Commits 2023-05-02 20:12:50 UTC
The releases/gcc-11 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:945edfa2d24e310bf3549a16f2ea4c881a9a138a

commit r11-10690-g945edfa2d24e310bf3549a16f2ea4c881a9a138a
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Dec 16 10:19:22 2022 +0100

    loop-invariant: Split preheader edge if the preheader bb ends with jump [PR106751]
    
    The RTL loop passes only request simple preheaders, but don't require
    fallthru preheaders, while move_invariant_reg apparently assumes the
    latter, that it can just append instruction(s) to the end of the preheader
    basic block.
    
    The following patch fixes that by splitting the preheader edge if
    the preheader bb ends with a JUMP_INSN (asm goto in this case).
    Without that we get control flow in the middle of a bb.
    
    2022-12-16  Jakub Jelinek  <jakub@redhat.com>
    
            PR rtl-optimization/106751
            * loop-invariant.c (move_invariant_reg): If preheader bb ends
            with a JUMP_INSN, split the preheader edge and emit invariants
            into the new preheader basic block.
    
            * gcc.c-torture/compile/pr106751.c: New test.
    
    (cherry picked from commit ddcaa60983b50378bde1b7e327086fe0ce101795)
Comment 17 Jakub Jelinek 2023-05-03 09:26:44 UTC
Fixed for 11.4 as well.
Comment 18 GCC Commits 2023-05-03 15:20:01 UTC
The releases/gcc-10 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:c88f917d2aa1c2ac065fbc0a10298f5551f392b6

commit r10-11348-gc88f917d2aa1c2ac065fbc0a10298f5551f392b6
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Dec 16 10:19:22 2022 +0100

    loop-invariant: Split preheader edge if the preheader bb ends with jump [PR106751]
    
    The RTL loop passes only request simple preheaders, but don't require
    fallthru preheaders, while move_invariant_reg apparently assumes the
    latter, that it can just append instruction(s) to the end of the preheader
    basic block.
    
    The following patch fixes that by splitting the preheader edge if
    the preheader bb ends with a JUMP_INSN (asm goto in this case).
    Without that we get control flow in the middle of a bb.
    
    2022-12-16  Jakub Jelinek  <jakub@redhat.com>
    
            PR rtl-optimization/106751
            * loop-invariant.c (move_invariant_reg): If preheader bb ends
            with a JUMP_INSN, split the preheader edge and emit invariants
            into the new preheader basic block.
    
            * gcc.c-torture/compile/pr106751.c: New test.
    
    (cherry picked from commit ddcaa60983b50378bde1b7e327086fe0ce101795)
Comment 19 Jakub Jelinek 2023-05-04 07:19:25 UTC
Fixed for 10.5 too.