Steps to reproduce:
1. Install the kas tool.
2. Clone https://github.com/Igalia/meta-chromium
3. Check out the repositories: kas checkout kas/chromium.yml:kas/commercial.yml
4. Start the build for raspberrypi4-64: KAS_MACHINE=raspberrypi4-64 kas build kas/chromium.yml:kas/commercial.yml

Compilation progresses, but then fails while building Chromium:

FAILED: obj/third_party/distributed_point_functions/distributed_point_functions/evaluate_prg_hwy.o
aarch64-poky-linux-g++ -mcpu=cortex-a72 -march=armv8-a+crc -fstack-protector-strong -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security -Werror=format-security --sysroot=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot -MMD -MF obj/third_party/distributed_point_functions/distributed_point_functions/evaluate_prg_hwy.o.d -DUSE_UDEV -DUSE_AURA=1 -DUSE_GLIB=1 -DUSE_OZONE=1 -DOFFICIAL_BUILD -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -DNO_UNWIND_TABLES -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0 -DGLIB_VERSION_MAX_ALLOWED=GLIB_VERSION_2_56 -DGLIB_VERSION_MIN_REQUIRED=GLIB_VERSION_2_56 -DBASE_USE_PERFETTO_CLIENT_LIBRARY=1 -DGOOGLE_PROTOBUF_NO_RTTI -DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER -DGOOGLE_PROTOBUF_INTERNAL_DONATE_STEAL_INLINE=0 -DHAVE_PTHREAD -I../chromium-114.0.5696.0 -Igen -I../chromium-114.0.5696.0/third_party/distributed_point_functions -I../chromium-114.0.5696.0/third_party/distributed_point_functions/code -Igen/third_party/distributed_point_functions -I../chromium-114.0.5696.0/third_party/perfetto/include -Igen/third_party/perfetto/build_config -Igen/third_party/perfetto -I../chromium-114.0.5696.0/third_party/protobuf/src -Igen/protoc_out -I../chromium-114.0.5696.0/third_party/abseil-cpp -I../chromium-114.0.5696.0/third_party/highway/src -I../chromium-114.0.5696.0/third_party/boringssl/src/include -fno-ident -fno-strict-aliasing --param=ssp-buffer-size=4 -fstack-protector -fno-unwind-tables -fno-asynchronous-unwind-tables
-fPIC -pipe -pthread -mbranch-protection=standard -O2 -fdata-sections -ffunction-sections -fno-omit-frame-pointer -gdwarf-4 -g1 -fvisibility=hidden -Wno-unused-local-typedefs -Wno-maybe-uninitialized -Wno-deprecated-declarations -Wno-comments -Wno-packed-not-aligned -Wno-missing-field-initializers -Wno-unused-parameter -Wno-psabi -I/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot/usr/include/glib-2.0 -I/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot/usr/lib/glib-2.0/include -std=gnu++2a -fno-exceptions -fno-rtti -fvisibility-inlines-hidden -Wno-narrowing -Wno-class-memaccess -feliminate-unused-debug-types -fmacro-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/chromium-114.0.5696.0=/usr/src/debug/chromium-dev/114.0.5696.0-r0 -fdebug-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/chromium-114.0.5696.0=/usr/src/debug/chromium-dev/114.0.5696.0-r0 -fmacro-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/build=/usr/src/debug/chromium-dev/114.0.5696.0-r0 -fdebug-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/build=/usr/src/debug/chromium-dev/114.0.5696.0-r0 -fdebug-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot= -fmacro-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot= -fdebug-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot-native= -fvisibility-inlines-hidden -c 
../chromium-114.0.5696.0/third_party/distributed_point_functions/code/dpf/internal/evaluate_prg_hwy.cc -o obj/third_party/distributed_point_functions/distributed_point_functions/evaluate_prg_hwy.o
{standard input}: Assembler messages:
{standard input}: Error: open CFI at the end of file; missing .cfi_endproc directive
aarch64-poky-linux-g++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
ninja: build stopped: subcommand failed.
WARNING: exit code 1 from a shell command.

This happens after exhausting all the available memory in the system. Attaching gdb to GCC, I see it fails to finish running pass_forwprop::execute, with this backtrace:

#0 (anonymous namespace)::pass_forwprop::execute (this=<optimized out>, fun=<optimized out>) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/tree-ssa-forwprop.cc:3636
#1 0x0000000000d0b653 in execute_one_pass (pass=0x2f96120) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/passes.cc:2638
#2 0x0000000000d0bea0 in execute_pass_list_1 (pass=0x2f96120) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/passes.cc:2738
#3 0x0000000000d0beb2 in execute_pass_list_1 (pass=0x2f951b0) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/passes.cc:2739
#4 0x0000000000d0bedd in execute_pass_list (fn=0x7fe3397a88a0, pass=<optimized out>) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/passes.cc:2749
#5 0x00000000009b7e28 in cgraph_node::expand (this=0x7fe331da2990) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/context.h:48
#6 cgraph_node::expand (this=0x7fe331da2990) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/cgraphunit.cc:1788
#7 0x00000000009b9387 in expand_all_functions () at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/cgraphunit.cc:1999
#8 symbol_table::compile (this=0x7fe33ffa3000) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/cgraphunit.cc:2349
#9 0x00000000009bb91c in symbol_table::compile (this=0x7fe33ffa3000) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/cgraphunit.cc:2262
#10 symbol_table::finalize_compilation_unit (this=0x7fe33ffa3000) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/cgraphunit.cc:2530
#11 0x0000000000ddcbca in compile_file () at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/toplev.cc:479
#12 0x00000000006c8a8e in do_compile (no_backend=false) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/toplev.cc:2144
#13 toplev::main (this=this@entry=0x7fffd81b45d6, argc=<optimized out>, argc@entry=140, argv=<optimized out>, argv@entry=0x7fffd81b4708) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/toplev.cc:2296
#14 0x00000000006ca1af in main (argc=140, argv=0x7fffd81b4708) at ../../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/gcc/main.cc:39

Apparently the loop never finishes, as gsi gets members added and removed forever until the OOM.
Can you provide the information as requested at https://gcc.gnu.org/bugs/ ?
Information collected: ### g++ -v aarch64-poky-linux-g++ -v Using built-in specs. COLLECT_GCC=./home/dape/Development/yocto/meta-chromium/build/tmp/work/x86_64-linux/gcc-cross-aarch64/12.2.0-r0/recipe-sysroot-native/usr/bin/aarch64-poky-linux/aarch64-poky-linux-g++ COLLECT_LTO_WRAPPER=/home/dape/Development/yocto/meta-chromium/build/tmp/work/x86_64-linux/gcc-cross-aarch64/12.2.0-r0/image/home/dape/Development/yocto/meta-chromium/build/tmp/work/x86_64-linux/gcc-cross-aarch64/12.2.0-r0/recipe-sysroot-native/usr/bin/aarch64-poky-linux/../../libexec/aarch64-poky-linux/gcc/aarch64-poky-linux/12.2.0/lto-wrapper Target: aarch64-poky-linux Configured with: ../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/configure --build=x86_64-linux --host=x86_64-linux --target=aarch64-poky-linux --prefix=/host-native/usr --exec_prefix=/host-native/usr --bindir=/host-native/usr/bin/aarch64-poky-linux --sbindir=/host-native/usr/bin/aarch64-poky-linux --libexecdir=/host-native/usr/libexec/aarch64-poky-linux --datadir=/host-native/usr/share --sysconfdir=/host-native/etc --sharedstatedir=/host-native/com --localstatedir=/host-native/var --libdir=/host-native/usr/lib/aarch64-poky-linux --includedir=/host-native/usr/include --oldincludedir=/host-native/usr/include --infodir=/host-native/usr/share/info --mandir=/host-native/usr/share/man --disable-silent-rules --disable-dependency-tracking --with-libtool-sysroot=/host-native --enable-clocale=generic --with-gnu-ld --enable-shared --enable-languages=c,c++ --enable-threads=posix --disable-multilib --enable-default-pie --enable-c99 --enable-long-long --enable-symvers=gnu --enable-libstdcxx-pch --program-prefix=aarch64-poky-linux- --without-local-prefix --disable-install-libiberty --disable-libssp --enable-libitm --enable-lto --disable-bootstrap --with-system-zlib --with-linker-hash-style=sysv --enable-linker-build-id --with-ppl=no --with-cloog=no --enable-checking=release --enable-cheaders=c_global --without-isl 
--with-gxx-include-dir=/not/exist/usr/include/c++/12.2.0 --with-sysroot=/not/exist --with-build-sysroot=/host --enable-standard-branch-protection --enable-poison-system-directories=error --with-system-zlib --disable-static --disable-nls --with-glibc-version=2.28 --enable-initfini-array --enable-__cxa_atexit Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 12.2.0 (GCC) ### Extracted information #### GCC version 12.2.0 #### System type Toolchain built by Yocto Langdale with the mentioned steps, on Ubuntu 22.10 #### GCC configure options Configured with: ../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/configure --build=x86_64-linux --host=x86_64-linux --target=aarch64-poky-linux --prefix=/host-native/usr --exec_prefix=/host-native/usr --bindir=/host-native/usr/bin/aarch64-poky-linux --sbindir=/host-native/usr/bin/aarch64-poky-linux --libexecdir=/host-native/usr/libexec/aarch64-poky-linux --datadir=/host-native/usr/share --sysconfdir=/host-native/etc --sharedstatedir=/host-native/com --localstatedir=/host-native/var --libdir=/host-native/usr/lib/aarch64-poky-linux --includedir=/host-native/usr/include --oldincludedir=/host-native/usr/include --infodir=/host-native/usr/share/info --mandir=/host-native/usr/share/man --disable-silent-rules --disable-dependency-tracking --with-libtool-sysroot=/host-native --enable-clocale=generic --with-gnu-ld --enable-shared --enable-languages=c,c++ --enable-threads=posix --disable-multilib --enable-default-pie --enable-c99 --enable-long-long --enable-symvers=gnu --enable-libstdcxx-pch --program-prefix=aarch64-poky-linux- --without-local-prefix --disable-install-libiberty --disable-libssp --enable-libitm --enable-lto --disable-bootstrap --with-system-zlib --with-linker-hash-style=sysv --enable-linker-build-id --with-ppl=no --with-cloog=no --enable-checking=release --enable-cheaders=c_global --without-isl --with-gxx-include-dir=/not/exist/usr/include/c++/12.2.0 --with-sysroot=/not/exist 
--with-build-sysroot=/host --enable-standard-branch-protection --enable-poison-system-directories=error --with-system-zlib --disable-static --disable-nls --with-glibc-version=2.28 --enable-initfini-array --enable-__cxa_atexit #### Complete command line that triggers the bug aarch64-poky-linux-g++ -mcpu=cortex-a72 -march=armv8-a+crc -fstack-protector-strong -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security -Werror=format-security --sysroot=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot -MMD -MF obj/third_party/distributed_point_functions/distributed_point_functions/evaluate_prg_hwy.o.d -DUSE_UDEV -DUSE_AURA=1 -DUSE_GLIB=1 -DUSE_OZONE=1 -DOFFICIAL_BUILD -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -DNO_UNWIND_TABLES -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0 -DGLIB_VERSION_MAX_ALLOWED=GLIB_VERSION_2_56 -DGLIB_VERSION_MIN_REQUIRED=GLIB_VERSION_2_56 -DBASE_USE_PERFETTO_CLIENT_LIBRARY=1 -DGOOGLE_PROTOBUF_NO_RTTI -DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER -DGOOGLE_PROTOBUF_INTERNAL_DONATE_STEAL_INLINE=0 -DHAVE_PTHREAD -I../chromium-114.0.5696.0 -Igen -I../chromium-114.0.5696.0/third_party/distributed_point_functions -I../chromium-114.0.5696.0/third_party/distributed_point_functions/code -Igen/third_party/distributed_point_functions -I../chromium-114.0.5696.0/third_party/perfetto/include -Igen/third_party/perfetto/build_config -Igen/third_party/perfetto -I../chromium-114.0.5696.0/third_party/protobuf/src -Igen/protoc_out -I../chromium-114.0.5696.0/third_party/abseil-cpp -I../chromium-114.0.5696.0/third_party/highway/src -I../chromium-114.0.5696.0/third_party/boringssl/src/include -fno-ident -fno-strict-aliasing --param=ssp-buffer-size=4 -fstack-protector -fno-unwind-tables -fno-asynchronous-unwind-tables -fPIC -pipe -pthread -mbranch-protection=standard -O2 -fdata-sections -ffunction-sections -fno-omit-frame-pointer -gdwarf-4 -g1 -fvisibility=hidden 
-Wno-unused-local-typedefs -Wno-maybe-uninitialized -Wno-deprecated-declarations -Wno-comments -Wno-packed-not-aligned -Wno-missing-field-initializers -Wno-unused-parameter -Wno-psabi -I/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot/usr/include/glib-2.0 -I/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot/usr/lib/glib-2.0/include -std=gnu++2a -fno-exceptions -fno-rtti -fvisibility-inlines-hidden -Wno-narrowing -Wno-class-memaccess -feliminate-unused-debug-types -fmacro-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/chromium-114.0.5696.0=/usr/src/debug/chromium-dev/114.0.5696.0-r0 -fdebug-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/chromium-114.0.5696.0=/usr/src/debug/chromium-dev/114.0.5696.0-r0 -fmacro-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/build=/usr/src/debug/chromium-dev/114.0.5696.0-r0 -fdebug-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/build=/usr/src/debug/chromium-dev/114.0.5696.0-r0 -fdebug-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot= -fmacro-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot= -fdebug-prefix-map=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot-native= -fvisibility-inlines-hidden -c ../chromium-114.0.5696.0/third_party/distributed_point_functions/code/dpf/internal/evaluate_prg_hwy.cc -o 
obj/third_party/distributed_point_functions/distributed_point_functions/evaluate_prg_hwy.o

#### Compiler output

{standard input}: Assembler messages:
{standard input}: Error: open CFI at the end of file; missing .cfi_endproc directive
aarch64-poky-linux-g++: fatal error: Killed signal terminated program cc1plus
compilation terminated.

#### Preprocessed files

evaluate_prg_hwy.ii (attached)
Created attachment 54858 [details] evaluate_prg_hwy.ii (compressed with gzip)
I get the feeling there is an infinite loop between two different foldings, maybe even caused by my r12-5430-g74faa9834a9ad2.

#5 0x0000000000d7e764 in gimple_build_with_ops_stat (num_ops=3, subcode=100, code=GIMPLE_ASSIGN) at /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/gimple.cc:469
#6 gimple_build_assign_1 (op3=0x0, op2=0xffffef193810, op1=0xffffdaf27860, subcode=BIT_AND_EXPR, lhs=0xfff8ee62ed18) at /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/gimple.cc:469
#7 gimple_build_assign (lhs=lhs@entry=0xfff8ee62ed18, subcode=subcode@entry=BIT_AND_EXPR, op1=0xffffdaf27860, op2=0xffffef193810, op3=op3@entry=0x0) at /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/gimple.cc:496
#8 0x00000000015a2b40 in maybe_push_res_to_seq (res_op=0xffffffffe670, seq=0xfffffffff160, res=0xfff8ee62ed18) at /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/gimple-match.h:303
#9 0x0000000001717e10 in gimple_simplify_BIT_AND_EXPR (res_op=res_op@entry=0xffffffffe700, seq=0xfffffffff160, valueize=0x12366c0 <fwprop_ssa_val(tree)>, type=0xfffff59f07e0, _p0=0xfff8ee62ecd0, _p1=0xffffef193810, code=...) at gimple-match.cc:194111
#10 0x0000000001601ab4 in gimple_simplify (res_op=res_op@entry=0xffffffffe700, seq=seq@entry=0xfffffffff160, valueize=valueize@entry=0x12366c0 <fwprop_ssa_val(tree)>, code=..., type=<optimized out>, _p0=<optimized out>, _p1=<optimized out>) at gimple-match.cc:211334
#11 0x00000000016038d0 in gimple_resimplify2 (seq=0xfffffffff160, res_op=0xffffffffe980, valueize=0x12366c0 <fwprop_ssa_val(tree)>) at /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/gimple-match-head.cc:323
#12 0x0000000001678b4c in gimple_simplify_222 (res_op=res_op@entry=0xffffffffe980, seq=seq@entry=0xfffffffff160, valueize=valueize@entry=0x12366c0 <fwprop_ssa_val(tree)>, type=type@entry=0xfffff59f07e0, captures=captures@entry=0xffffffffe8f0, op=op@entry=BIT_IOR_EXPR, rop=rop@entry=BIT_AND_EXPR) at gimple-match.cc:54338
#13 0x00000000017af1a8 in gimple_simplify_BIT_IOR_EXPR (res_op=res_op@entry=0xffffffffe980, seq=0xfffffffff160, valueize=0x12366c0 <fwprop_ssa_val(tree)>, type=0xfffff59f07e0, _p0=0xfff8ee62ec40, _p1=0xfff8ee62ec88, code=...) at gimple-match.cc:115608
#14 0x0000000001601a6c in gimple_simplify (res_op=res_op@entry=0xffffffffe980, seq=seq@entry=0xfffffffff160, valueize=valueize@entry=0x12366c0 <fwprop_ssa_val(tree)>, code=..., type=<optimized out>, _p0=<optimized out>, _p1=<optimized out>) at gimple-match.cc:211248
#15 0x00000000016038d0 in gimple_resimplify2 (seq=0xfffffffff160, res_op=0xffffffffeb60, valueize=0x12366c0 <fwprop_ssa_val(tree)>) at /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/gimple-match-head.cc:323
#16 0x0000000001717e98 in gimple_simplify_BIT_AND_EXPR (res_op=res_op@entry=0xffffffffeb60, seq=0xfffffffff160, valueize=0x12366c0 <fwprop_ssa_val(tree)>, type=0xfffff59f07e0, _p0=0xfff8ee62ebf8, _p1=0xffffef193810, code=...) at gimple-match.cc:194125
#17 0x0000000001601ab4 in gimple_simplify (res_op=res_op@entry=0xffffffffeb60, seq=seq@entry=0xfffffffff160, valueize=valueize@entry=0x12366c0 <fwprop_ssa_val(tree)>, code=..., type=<optimized out>, _p0=<optimized out>, _p1=<optimized out>) at gimple-match.cc:211334
#18 0x00000000016038d0 in gimple_resimplify2 (seq=0xfffffffff160, res_op=0xffffffffede0, valueize=0x12366c0 <fwprop_ssa_val(tree)>) at /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/gimple-match-head.cc:323
#19 0x0000000001678b4c in gimple_simplify_222 (res_op=res_op@entry=0xffffffffede0, seq=seq@entry=0xfffffffff160, valueize=valueize@entry=0x12366c0 <fwprop_ssa_val(tree)>, type=type@entry=0xfffff59f07e0, captures=captures@entry=0xffffffffed50, op=op@entry=BIT_IOR_EXPR, rop=rop@entry=BIT_AND_EXPR) at gimple-match.cc:54338
#20 0x00000000017af1a8 in gimple_simplify_BIT_IOR_EXPR (res_op=res_op@entry=0xffffffffede0, seq=0xfffffffff160, valueize=0x12366c0 <fwprop_ssa_val(tree)>, type=0xfffff59f07e0, _p0=0xfff8ee62eb68, _p1=0xfff8ee62ebb0, code=...) at gimple-match.cc:115608
#21 0x0000000001601a6c in gimple_simplify (res_op=res_op@entry=0xffffffffede0, seq=seq@entry=0xfffffffff160, valueize=valueize@entry=0x12366c0 <fwprop_ssa_val(tree)>, code=..., type=<optimized out>, _p0=<optimized out>, _p1=<optimized out>) at gimple-match.cc:211248
#22 0x00000000016038d0 in gimple_resimplify2 (seq=0xfffffffff160, res_op=0xffffffffefc0, valueize=0x12366c0 <fwprop_ssa_val(tree)>) at /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/gimple-match-head.cc:323
#23 0x0000000001717e98 in gimple_simplify_BIT_AND_EXPR (res_op=res_op@entry=0xffffffffefc0, seq=0xfffffffff160, valueize=0x12366c0 <fwprop_ssa_val(tree)>, type=0xfffff59f07e0, _p0=0xfff8ee62eb20, _p1=0xffffef193810, code=...) at gimple-match.cc:194125
#24 0x0000000001601ab4 in gimple_simplify (res_op=res_op@entry=0xffffffffefc0, seq=seq@entry=0xfffffffff160, valueize=valueize@entry=0x12366c0 <fwprop_ssa_val(tree)>, code=..., type=<optimized out>, _p0=<optimized out>, _p1=<optimized out>) at gimple-match.cc:211334
#25 0x00000000016038d0 in gimple_resimplify2 (seq=0xfffffffff160, res_op=0xfffffffff170, valueize=0x12366c0 <fwprop_ssa_val(tree)>) at /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/gimple-match-head.cc:323
/* (X & Y) & Y -> X & Y
   (X | Y) | Y -> X | Y */
(for op (bit_and bit_ior)
 (simplify
  (op:c (convert1?@2 (op:c @0 @@1)) (convert2? @1))
  @2))
...
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
   operands are another bit-wise operation with a common input. If so,
   distribute the bit operations to save an operation and possibly two if
   constants are involved. For example, convert
     (A | B) & (A | C) into A | (B & C)
   Further simplification will occur if B and C are constants. */
(for op (bit_and bit_ior bit_xor)
     rop (bit_ior bit_and bit_and)
 (simplify
  (op (convert? (rop:c @@0 @1)) (convert? (rop:c @0 @2)))
  (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
       && tree_nop_conversion_p (type, TREE_TYPE (@2)))
   (rop (convert @0) (op (convert @1) (convert @2))))))
...
/* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
(simplify
  (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
  (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
_4347 = POLY_INT_CST [16, 16] & 15;
_4348 = _3441 | POLY_INT_CST [16, 16];

So this is with SVE.
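A note on the notation, for readers not used to it: per GCC's poly_int documentation, POLY_INT_CST [16, 16] denotes 16 + 16*x, where x is a non-negative indeterminate known only at run time. The following is a minimal sketch in plain C++, not GCC code. poly_value and distinct_and_results are hypothetical helper names, and the upper bound of 15 used for x below assumes the SVE maximum vector length of 2048 bits. It merely enumerates what `POLY_INT_CST [c0, c1] & mask` can evaluate to.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical helper, not a GCC API: the value of POLY_INT_CST [c0, c1]
// for one particular runtime indeterminate x (assumed non-negative).
constexpr std::uint64_t poly_value(std::uint64_t c0, std::uint64_t c1,
                                   std::uint64_t x) {
    return c0 + c1 * x;
}

// Collect the distinct results of (c0 + c1*x) & mask for x in [0, max_x].
// More than one element means the expression has no single constant value.
std::set<std::uint64_t> distinct_and_results(std::uint64_t c0, std::uint64_t c1,
                                             std::uint64_t mask,
                                             std::uint64_t max_x) {
    std::set<std::uint64_t> results;
    for (std::uint64_t x = 0; x <= max_x; ++x)
        results.insert(poly_value(c0, c1, x) & mask);
    return results;
}
```

Under these assumptions, distinct_and_results(16, 16, 15, 15) contains only 0, while distinct_and_results(16, 16, 48, 15) contains several values. So there is no leaf constant in general, and spotting the cases that do collapse needs reasoning that, according to this thread, the folder does not perform for BIT_AND_EXPR on a POLY_INT_CST.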
Simple reduced testcase:
```
#include <arm_sve.h>

unsigned long f(unsigned long tt) {
  unsigned long t = svcntb();
  return (tt | 15) & t;
}
```

Confirmed. Fails even in GCC 11.1.0.
It is due to match.pd's

/* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
(simplify
  (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
  (bit_ior (bit_and @0 @2) (bit_and @1 @2)))

POLY_INT_CST is a CONSTANT_CLASS but does not simplify on the bit_and.
So maybe it should include a ! on the (bit_and @1 @2).
(In reply to Andrew Pinski from comment #7)
> Fails even in GCC 11.1.0.
> It is due to match.pd's
>
> /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
> (simplify
>   (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
>   (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
>
> POLY_INT_CST is a CONSTANT_CLASS but does not simplify on the bit_and.
> So maybe it should include a ! on the (bit_and @1 @2) .

Which then will go into a loop with:
(A | B) & (A | C) into A | (B & C)
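The ping-pong between the two patterns can be sketched with a toy rewriter. This is only an illustration in plain C++ under simplifying assumptions: Expr, distribute, factor and the other names are hypothetical and have nothing to do with GCC's generated gimple-match.cc code, PolyCst stands in for POLY_INT_CST as a constant that never folds, and factor models only the bit_ior/bit_and instance of the distribution pattern.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy expression tree; none of these names are GCC APIs.
enum class Kind { Var, IntCst, PolyCst, And, Or };

struct Expr;
using ExprP = std::shared_ptr<const Expr>;

struct Expr {
    Kind kind;
    std::string name;  // label of a Var or PolyCst
    long value;        // payload of an IntCst
    ExprP lhs, rhs;    // operands of And / Or
};

ExprP mk(Kind k, std::string n, long v, ExprP l, ExprP r) {
    return std::make_shared<Expr>(Expr{k, std::move(n), v, std::move(l), std::move(r)});
}
ExprP var(const std::string& n)  { return mk(Kind::Var, n, 0, nullptr, nullptr); }
ExprP cst(long v)                { return mk(Kind::IntCst, "", v, nullptr, nullptr); }
ExprP poly(const std::string& n) { return mk(Kind::PolyCst, n, 0, nullptr, nullptr); }
ExprP band(ExprP a, ExprP b)     { return mk(Kind::And, "", 0, std::move(a), std::move(b)); }
ExprP bior(ExprP a, ExprP b)     { return mk(Kind::Or, "", 0, std::move(a), std::move(b)); }

// Stand-in for CONSTANT_CLASS_P: both kinds of constant qualify.
bool is_cst(const ExprP& e) {
    return e->kind == Kind::IntCst || e->kind == Kind::PolyCst;
}

// Constant folding succeeds only for two plain integers; otherwise the
// operation is left behind as an unfolded expression.
ExprP fold_and(const ExprP& a, const ExprP& b) {
    if (a->kind == Kind::IntCst && b->kind == Kind::IntCst)
        return cst(a->value & b->value);
    return band(a, b);
}

// Rule 1: (x | C1) & C2 -> (x & C2) | (C1 & C2)
ExprP distribute(const ExprP& e) {
    if (e->kind == Kind::And && e->lhs->kind == Kind::Or
        && is_cst(e->lhs->rhs) && is_cst(e->rhs))
        return bior(band(e->lhs->lhs, e->rhs), fold_and(e->lhs->rhs, e->rhs));
    return e;
}

// Rule 2: (A & C) | (B & C) -> (A | B) & C, matching C by node identity.
ExprP factor(const ExprP& e) {
    if (e->kind == Kind::Or && e->lhs->kind == Kind::And
        && e->rhs->kind == Kind::And && e->lhs->rhs == e->rhs->rhs)
        return band(bior(e->lhs->lhs, e->rhs->lhs), e->lhs->rhs);
    return e;
}

std::string show(const ExprP& e) {
    switch (e->kind) {
    case Kind::Var:
    case Kind::PolyCst: return e->name;
    case Kind::IntCst:  return std::to_string(e->value);
    case Kind::And:     return "(" + show(e->lhs) + " & " + show(e->rhs) + ")";
    case Kind::Or:      return "(" + show(e->lhs) + " | " + show(e->rhs) + ")";
    }
    return "?";
}

// Apply rule 1, then rule 2, and print the result.
std::string round_trip(const ExprP& start) {
    return show(factor(distribute(start)));
}
```

In this model, round_trip applied to (x | 15) & P, with P built by poly("P"), prints the starting expression again, so alternating the two rules never terminates. With a plain integer in place of P the inner AND folds to a leaf, rule 2 no longer matches, and rewriting stops.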
(In reply to Andrew Pinski from comment #8)
> (In reply to Andrew Pinski from comment #7)
> > Fails even in GCC 11.1.0.
> > It is due to match.pd's
> >
> > /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
> > (simplify
> >   (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
> >   (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
> >
> > POLY_INT_CST is a CONSTANT_CLASS but does not simplify on the bit_and.
> > So maybe it should include a ! on the (bit_and @1 @2) .
>
> Which then will go into a loop with:
> (A | B) & (A | C) into A | (B & C)

We expect these to always fold :/ If they don't for POLY_INTs then maybe add a (if (!POLY_INT_CST_P (...) guard. There are many patterns that are affected by this.
Might be a daft question, but which cases besides INTEGER_CST are supposed to be captured by the CONSTANT_CLASS_P?
(In reply to rsandifo@gcc.gnu.org from comment #10)
> Might be a daft question, but which cases besides
> INTEGER_CST are supposed to be captured by the CONSTANT_CLASS_P?

For bit_and/bit_ior, VECTOR_CST (I would assume).
(In reply to Andrew Pinski from comment #11)
> For bit_and/bit_ior, VECTOR_CST (I would assume).

Ah, yeah. But then I don't think a top-level POLY_INT_CST_P cuts it. We'd have the same problem with VECTOR_CSTs containing POLY_INT_CSTs.
(In reply to rsandifo@gcc.gnu.org from comment #12)
> (In reply to Andrew Pinski from comment #11)
> > For bit_and/bit_ior, VECTOR_CST (I would assume).
> Ah, yeah. But then I don't think a top-level POLY_INT_CST_P
> cuts it. We'd have the same problem with VECTOR_CSTs containing
> POLY_INT_CSTs.

Do we really support that?
On Mon, 17 Apr 2023, rsandifo at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109505
>
> --- Comment #10 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
> Might be a daft question, but which cases besides
> INTEGER_CST are supposed to be captured by the CONSTANT_CLASS_P?

For the bitops? I suppose FIXED_CST, VECTOR_CST, COMPLEX_CST (for _Complex int), basically all constants for which bitops are valid.
(In reply to Jakub Jelinek from comment #13)
> (In reply to rsandifo@gcc.gnu.org from comment #12)
> > (In reply to Andrew Pinski from comment #11)
> > > For bit_and/bit_ior, VECTOR_CST (I would assume).
> > Ah, yeah. But then I don't think a top-level POLY_INT_CST_P
> > cuts it. We'd have the same problem with VECTOR_CSTs containing
> > POLY_INT_CSTs.
> Do we really support that?

Sure. At least AIUI, VECTOR_CST can contain whatever constants the associated scalar supports. A specific example is:

#include <arm_sve.h>
svint32_t f() { return svdup_s32(svcntw()); }

which gives:

  return { POLY_INT_CST [4, 4], ... };

https://godbolt.org/z/T1s3n5Pfx
*** Bug 109794 has been marked as a duplicate of this bug. ***
Is there by chance a workaround we can apply for this downstream (some flag)? It prevents building Chromium on arm64 for us w/ gcc unfortunately.
Created attachment 55124 [details]
gcc14-pr109505.patch

I actually think it isn't that bad; we don't have that many. I've looked at the match.pd patterns which check for 2 CONSTANT_CLASS_P operands and then try to combine them using some operation, and counted just 10 spots which I think need ! added.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:f211757f6fa9515e3fd1a4f66f1a8b48e500c9de

commit r14-1023-gf211757f6fa9515e3fd1a4f66f1a8b48e500c9de
Author: Jakub Jelinek <jakub@redhat.com>
Date: Sun May 21 13:36:56 2023 +0200

match.pd: Ensure (op CONSTANT_CLASS_P CONSTANT_CLASS_P) is simplified [PR109505]

On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P, but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the
(x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2)
simplification actually relies on the (CST1 & CST2) simplification, otherwise it is a deoptimization, trading 2 ops for 3 and furthermore running into
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both operands are another bit-wise operation with a common input. If so, distribute the bit operations to save an operation and possibly two if constants are involved. For example, convert (A | B) & (A | C) into A | (B & C) Further simplification will occur if B and C are constants. */
simplification which simplifies that (x & CST2) | (CST1 & CST2) back to CST2 & (x | CST1).

I went through all other places I could find where we have a simplification with 2 CONSTANT_CLASS_P operands and perform some operation on those two, while the other spots aren't that severe (just trade 2 operations for another 2 if the two constants don't simplify, rather than as in the above case trading 2 ops for 3), I still think all those spots really intend to optimize only if the 2 constants simplify.

So, the following patch adds to those a ! modifier to ensure that, even at GENERIC that modifier means !EXPR_P which is exactly what we want IMHO.

2023-05-21 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/109505
* match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2), Combine successive equal operations with constants, (A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A, CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P operands.

* gcc.target/aarch64/sve/pr109505.c: New test.
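The effect of the ! modifier described in the commit message can be modelled in a few lines. This is a hedged sketch in plain C++17 rather than anything taken from match.pd or genmatch: Cst, try_fold_and, Rewrite and guarded_distribute are hypothetical names, and the is_poly flag stands in for a POLY_INT_CST whose AND with another constant does not fold.

```cpp
#include <cassert>
#include <optional>

// Toy constant: either a plain integer or an opaque runtime-dependent
// value (standing in for POLY_INT_CST). Hypothetical type, not a GCC API.
struct Cst {
    bool is_poly;
    long value;  // meaningful only when !is_poly
};

// Folding yields a leaf only when both inputs are plain integers.
std::optional<long> try_fold_and(Cst a, Cst b) {
    if (a.is_poly || b.is_poly)
        return std::nullopt;
    return a.value & b.value;
}

// Outcome of attempting (x | c1) & c2 -> (x & c2) | (c1 & c2).
struct Rewrite {
    bool applied;  // false: the original expression is kept
    long folded;   // value of (c1 & c2) when applied
};

// Analogue of writing (bit_and! @1 @2) in the pattern: the whole rewrite
// is abandoned unless the inner operation simplifies to a leaf.
Rewrite guarded_distribute(Cst c1, Cst c2) {
    if (auto leaf = try_fold_and(c1, c2))
        return {true, *leaf};
    return {false, 0};
}
```

With two integers, for example guarded_distribute({false, 15}, {false, 8}), the rewrite applies and the folded constant is 8. As soon as either operand is marked is_poly the rewrite is declined, so the expression is never expanded into the form that the distribution pattern would fold back.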
The releases/gcc-13 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:0feece18e6993d02f24a9381ddb5420bb4509554

commit r13-7365-g0feece18e6993d02f24a9381ddb5420bb4509554
Author: Jakub Jelinek <jakub@redhat.com>
Date: Sun May 21 13:36:56 2023 +0200

match.pd: Ensure (op CONSTANT_CLASS_P CONSTANT_CLASS_P) is simplified [PR109505]

On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P, but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the
(x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2)
simplification actually relies on the (CST1 & CST2) simplification, otherwise it is a deoptimization, trading 2 ops for 3 and furthermore running into
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both operands are another bit-wise operation with a common input. If so, distribute the bit operations to save an operation and possibly two if constants are involved. For example, convert (A | B) & (A | C) into A | (B & C) Further simplification will occur if B and C are constants. */
simplification which simplifies that (x & CST2) | (CST1 & CST2) back to CST2 & (x | CST1).

I went through all other places I could find where we have a simplification with 2 CONSTANT_CLASS_P operands and perform some operation on those two, while the other spots aren't that severe (just trade 2 operations for another 2 if the two constants don't simplify, rather than as in the above case trading 2 ops for 3), I still think all those spots really intend to optimize only if the 2 constants simplify.

So, the following patch adds to those a ! modifier to ensure that, even at GENERIC that modifier means !EXPR_P which is exactly what we want IMHO.

2023-05-21 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/109505
* match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2), Combine successive equal operations with constants, (A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A, CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P operands.

* gcc.target/aarch64/sve/pr109505.c: New test.

(cherry picked from commit f211757f6fa9515e3fd1a4f66f1a8b48e500c9de)
The releases/gcc-12 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:6ef4e2e11c653f1d51f9a304a8d1cf44a53b4ad7

commit r12-9634-g6ef4e2e11c653f1d51f9a304a8d1cf44a53b4ad7
Author: Jakub Jelinek <jakub@redhat.com>
Date: Sun May 21 13:36:56 2023 +0200

match.pd: Ensure (op CONSTANT_CLASS_P CONSTANT_CLASS_P) is simplified [PR109505]

On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P, but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the
(x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2)
simplification actually relies on the (CST1 & CST2) simplification, otherwise it is a deoptimization, trading 2 ops for 3 and furthermore running into
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both operands are another bit-wise operation with a common input. If so, distribute the bit operations to save an operation and possibly two if constants are involved. For example, convert (A | B) & (A | C) into A | (B & C) Further simplification will occur if B and C are constants. */
simplification which simplifies that (x & CST2) | (CST1 & CST2) back to CST2 & (x | CST1).

I went through all other places I could find where we have a simplification with 2 CONSTANT_CLASS_P operands and perform some operation on those two, while the other spots aren't that severe (just trade 2 operations for another 2 if the two constants don't simplify, rather than as in the above case trading 2 ops for 3), I still think all those spots really intend to optimize only if the 2 constants simplify.

So, the following patch adds to those a ! modifier to ensure that, even at GENERIC that modifier means !EXPR_P which is exactly what we want IMHO.

2023-05-21 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/109505
* match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2), Combine successive equal operations with constants, (A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A, CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P operands.

* gcc.target/aarch64/sve/pr109505.c: New test.

(cherry picked from commit f211757f6fa9515e3fd1a4f66f1a8b48e500c9de)
Fixed for 12.4, 13.2 and 14.1.
Are there any plans to backport this fix to the gcc-11 branch as well? Seems it is affected, if you go by the known to fail list.
The fix doesn't really work for 11/10:

../../gcc/match.pd:1546:36 error: forcing simplification to a leaf is not supported for GENERIC
  (bit_ior (bit_and @0 @2) (bit_and! @1 @2)))
                                   ^

So, we'd either have to guard those patterns with #ifndef GENERIC, but that would be quite a risky change this late on those branches, or perhaps could do something like

#ifdef GENERIC
  (with { tree a = fold_binary (BIT_AND_EXPR, type, @1, @2); }
   (if (a && CONSTANT_CLASS_P (a))
    (bit_ior (bit_and @0 @2) { a; })))
#else
  (bit_ior (bit_and @0 @2) (bit_and! @1 @2)))
#endif

But that would be needed for all the spots the patch changed. I'm afraid I really don't have time to do that.
I'm testing a backport.
The releases/gcc-11 branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:bfa476528ceeac96865a48c49f3f1a15d566d209

commit r11-10840-gbfa476528ceeac96865a48c49f3f1a15d566d209
Author: Richard Biener <rguenther@suse.de>
Date: Wed Feb 23 13:47:01 2022 +0100

middle-end/109505 - backport match.pd ! support for GENERIC

The patch adds support for the ! modifier to GENERIC, backported from r12-7361-gfdc46830f1b793.

2023-06-02 Richard Biener <rguenther@suse.de>

PR tree-optimization/109505
* doc/match-and-simplify.texi: Amend ! documentation.
* genmatch.c (expr::gen_transform): Code-generate ! support for GENERIC.
(parser::parse_expr): Allow ! for GENERIC.
The releases/gcc-11 branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:ca4a4cc0060cb8ae1a326d6dbfcd9459452e1574

commit r11-10841-gca4a4cc0060cb8ae1a326d6dbfcd9459452e1574
Author: Jakub Jelinek <jakub@redhat.com>
Date: Sun May 21 13:36:56 2023 +0200

match.pd: Ensure (op CONSTANT_CLASS_P CONSTANT_CLASS_P) is simplified [PR109505]

On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P, but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the
(x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2)
simplification actually relies on the (CST1 & CST2) simplification, otherwise it is a deoptimization, trading 2 ops for 3 and furthermore running into
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both operands are another bit-wise operation with a common input. If so, distribute the bit operations to save an operation and possibly two if constants are involved. For example, convert (A | B) & (A | C) into A | (B & C) Further simplification will occur if B and C are constants. */
simplification which simplifies that (x & CST2) | (CST1 & CST2) back to CST2 & (x | CST1).

I went through all other places I could find where we have a simplification with 2 CONSTANT_CLASS_P operands and perform some operation on those two, while the other spots aren't that severe (just trade 2 operations for another 2 if the two constants don't simplify, rather than as in the above case trading 2 ops for 3), I still think all those spots really intend to optimize only if the 2 constants simplify.

So, the following patch adds to those a ! modifier to ensure that, even at GENERIC that modifier means !EXPR_P which is exactly what we want IMHO.

2023-05-21 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/109505
* match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2), Combine successive equal operations with constants, (A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A, CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P operands.

* gcc.target/aarch64/sve/pr109505.c: New test.

(cherry picked from commit f211757f6fa9515e3fd1a4f66f1a8b48e500c9de)
Fixed.