__attribute__((noinline, noclone)) int foo (int a, int b) { if (a) return -3; asm volatile goto ("bts $1, %0; jc %l[lab]" : : "m" (b) : "memory" : lab); return 0; lab: return 0; } int main () { if (foo (1, 0) != -3 || foo (0, 3) != 0 || foo (0, 0) != 0) __builtin_abort (); return 0; } is miscompiled, *.optimized dump looks fine, but expansion looks wrong, perhaps during expansion we don't handle the case of at least one asm goto labels pointing to the fallthru block.
I see multiple issues: 1) commit_one_edge_insertion doesn't seem to cope with asm goto, I think asm goto with only labels pointing to the fallthru bb's labels can happen both for the single_pred_p (e->dest) case, as well as the single_succ_p case. JUMP_P in that case would be true, but we certainly can't insert in that case before the jump (aka asm goto). 2) during expansion, we apparently emit some further insns (some noop pseudo copy and unconditional jump) directly after asm goto into the same bb, then commit these edge insertions (so commit_one_edge_insertion actually doesn't see the asm goto at the end of the bb, but an unconditional jump) and only later on split the bb in find_many_sub_basic_blocks.
Created attachment 30974 [details] gcc49-pr58670.patch Untested fix.
Author: jakub Date: Thu Oct 10 16:29:50 2013 New Revision: 203383 URL: http://gcc.gnu.org/viewcvs?rev=203383&root=gcc&view=rev Log: PR middle-end/58670 * stmt.c (expand_asm_operands): Add FALLTHRU_BB argument, if any labels are in FALLTHRU_BB, use a special label emitted immediately after the asm goto insn rather than label_rtx of the LABEL_DECL. (expand_asm_stmt): Adjust caller. * cfgrtl.c (commit_one_edge_insertion): Force splitting of edge if the last insn in predecessor is a jump with single successor, but it isn't simplejump_p. * gcc.dg/torture/pr58670.c: New test. Added: trunk/gcc/testsuite/gcc.dg/torture/pr58670.c Modified: trunk/gcc/ChangeLog trunk/gcc/cfgrtl.c trunk/gcc/stmt.c trunk/gcc/testsuite/ChangeLog
Author: jakub Date: Thu Oct 10 16:39:52 2013 New Revision: 203384 URL: http://gcc.gnu.org/viewcvs?rev=203384&root=gcc&view=rev Log: PR middle-end/58670 * stmt.c (expand_asm_operands): Add FALLTHRU_BB argument, if any labels are in FALLTHRU_BB, use a special label emitted immediately after the asm goto insn rather than label_rtx of the LABEL_DECL. (expand_asm_stmt): Adjust caller. * cfgrtl.c (commit_one_edge_insertion): Force splitting of edge if the last insn in predecessor is a jump with single successor, but it isn't simplejump_p. * gcc.dg/torture/pr58670.c: New test. Added: branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/torture/pr58670.c Modified: branches/gcc-4_8-branch/gcc/ChangeLog branches/gcc-4_8-branch/gcc/cfgrtl.c branches/gcc-4_8-branch/gcc/stmt.c branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
What's the status here? While there was a checkin (r203383 & r203384), this sample code still does not work correctly (4.9.0 x86_64-w64-mingw32). The fall-thru case works, but if the jump inside the asm is taken, eax does not get set to zero when it clearly should. My -O2 output is: 00000000004014e0 <foo>: 4014e0: test ecx,ecx 4014e2: mov DWORD PTR [rsp+0x10],edx 4014e6: jne 4014f3 <foo+0x13> 4014e8: bts DWORD PTR [rsp+0x10],0x1 4014ee: jb 401500 <foo+0x20> 4014f0: xor eax,eax 4014f2: ret 4014f3: mov eax,0xfffffffd 4014f8: nop DWORD PTR [rax+rax*1+0x0] 401500: repz ret 401502: data32 data32 data32 data32 nop WORD PTR cs:[rax+rax*1+0x0]
Unsetting target milestone of open non-regression bug from version of branch being closed.
Hmm, recent tests with trunk (4.10) doesn't show the issues described in comment #5.
I can confirm what Kai had to say in comment 7. Using a newer build resolves both the original problem reported by Jakub and the problem I saw in comment 5.
I've tested additional the 4.8.x branch, and it is fixed there too. So issue seems to be fixed on all open branches. Therefore close as fixed.
Specifically, it seems this was fixed in the gcc-4.8.2 timeframe. This shows no issue in gcc-4.8.2, but the issue from comment #5 in gcc-4.8.1. It seems the Linux kernel works around this with a macro called `asm_volatile_goto` that simply inserts an empty asm block `asm("");` after an `asm goto`, which even in this reproducer fixes the problem. https://godbolt.org/z/ZUDRnU