Hi, I have code that looks like this: pannekake:~> cat test.c void foo(); void bar(); void baz(); void func(int i) { switch (i) { case 0: foo(); break; case 1: bar(); break; case 2: baz(); break; case 3: baz(); break; case 4: bar(); break; case 5: foo(); break; case 6: foo(); break; case 7: bar(); break; case 8: baz(); break; case 9: baz(); break; case 10: bar(); break; default: __builtin_unreachable(); break; } } Compiling this yields: pannekake:~> gcc-4.6 -O2 -c test.c && objdump --disassemble test.o test.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <func>: 0: 83 ff 0a cmp $0xa,%edi 3: 76 03 jbe 8 <func+0x8> 5: 0f 1f 00 nopl (%rax) 8: 89 ff mov %edi,%edi a: 31 c0 xor %eax,%eax c: ff 24 fd 00 00 00 00 jmpq *0x0(,%rdi,8) 13: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 18: e9 00 00 00 00 jmpq 1d <func+0x1d> 1d: 0f 1f 00 nopl (%rax) 20: e9 00 00 00 00 jmpq 25 <func+0x25> 25: 0f 1f 00 nopl (%rax) 28: e9 00 00 00 00 jmpq 2d <func+0x2d> The first compare is, as you can see, unneeded; the code for the default case itself (a repz ret) has been optimized away due to the __builtin_unreachable(), but the compare and branch remains. I've also seen it sometimes be able to remove the jump instruction itself, but not the compare.
Forgot this: pannekake:~> gcc-4.6 -v Using built-in specs. COLLECT_GCC=gcc-4.6 COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 4.6.2-5' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 4.6.2 (Debian 4.6.2-5)
Confirmed.
The optimization can either be done when emitting rtl in the expand_case function in stmt.c (and its callees) by first recognizing calls to __builtin_unreachable() at the default label, and then simply do not emit the compares and jumps to the default label. That task is quite easy. Another option would be to add more general code in the later cfg optimization passes for rtl that removes all jumps (and corresponding compares) to basic blocks only containing barriers. What do you think? This optimization would be very interesting for interpreter implementators that use a switch statement to dispatch the next instruction, when they can guarantee that the default branch is never taken.
This should be solved by allowing labels in trivially_empty_bb_p.
*** Bug 60362 has been marked as a duplicate of this bug. ***
*** Bug 65922 has been marked as a duplicate of this bug. ***
(In reply to Emil L from comment #3) > This optimization would be very interesting for interpreter implementators > that use a switch statement to dispatch the next instruction, when they can > guarantee that the default branch is never taken. I have actually hit the same issue with some code from PHP, so you're not too far off on your comment. A reduced test case from PHP looks like: void foo (unsigned char *ptr, unsigned int cond) { switch (cond) { case 0: return; case 1: case 2: case 3: case 4: case 6: *ptr += 1; return; case 5: *ptr += 2; return; default: __builtin_unreachable (); break; } } In this case, the undeleted branch had a wild label that pointed to nowhere.
(In reply to Steven Bosscher from comment #4) > This should be solved by allowing labels in trivially_empty_bb_p. I tried the following, but it doesn't fix the problem. --- cfgcleanup.c (revision 222550) +++ cfgcleanup.c (working copy) @@ -2647,7 +2647,8 @@ trivially_empty_bb_p (basic_block bb) { if (insn == BB_HEAD (bb)) return true; - if (!DEBUG_INSN_P (insn)) + if (!(DEBUG_INSN_P (insn) + || (LABEL_P (insn) && !LABEL_PRESERVE_P (insn)))) return false; insn = PREV_INSN (insn); }
New patch submitted.
Fixed on trunk with revision 247844.
Closing as fixed.
Created attachment 48236 [details] testcase This bug reoccurred in gcc-8. gcc-7 and gcc-9+ seem not affected. $ gcc-8 -O3 -S b.c -o - [...] foo: .LFB0: .cfi_startproc cmpl $1, %edi jne .L10 movslq %esi, %rsi leaq foz(%rip), %rax movl (%rax,%rsi,4), %eax movl %eax, (%rdx) xorl %eax, %eax ret .p2align 4,,10 .p2align 3 .L10: cmpl $2, %edi jne .L11 .L3: movslq %esi, %rsi leaq baz(%rip), %rax movl (%rax,%rsi,4), %eax movl %eax, (%rdx) xorl %eax, %eax ret .p2align 4,,10 .p2align 3 .L11: testl %edi, %edi je .L3 .cfi_endproc [...] $ gcc-8 -O -S b.c -o - [...] foo: .LFB0: .cfi_startproc cmpl $1, %edi je .L2 cmpl $2, %edi je .L3 testl %edi, %edi jne .L4 .L3: movslq %esi, %rsi leaq baz(%rip), %rax movl (%rax,%rsi,4), %eax movl %eax, (%rdx) .L5: movl $0, %eax ret .L2: movslq %esi, %rsi leaq foz(%rip), %rax movl (%rax,%rsi,4), %eax movl %eax, (%rdx) jmp .L5 .L4: .cfi_endproc
Confirmed, I see the same issue with the test case in Comment #12 on powerpc64le-linux using gcc-8. I also see the original problem with the original test case using gcc-8 as well.