Bug 58670 - asm goto miscompilation
Summary: asm goto miscompilation
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.8.1
: P3 normal
Target Milestone: 5.0
Assignee: Jakub Jelinek
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2013-10-09 18:12 UTC by Jakub Jelinek
Modified: 2022-01-04 21:28 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2013-10-10 00:00:00


Attachments
gcc49-pr58670.patch (2.51 KB, patch)
2013-10-10 07:36 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jakub Jelinek 2013-10-09 18:12:33 UTC
__attribute__((noinline, noclone)) int
foo (int a, int b)
{
  if (a)
    return -3;
  asm volatile goto ("bts $1, %0; jc %l[lab]" : : "m" (b) : "memory" : lab);
  return 0;
lab:
  return 0;
}

int
main ()
{
  if (foo (1, 0) != -3 || foo (0, 3) != 0 || foo (0, 0) != 0)
    __builtin_abort ();
  return 0;
}

is miscompiled, *.optimized dump looks fine, but expansion looks wrong, perhaps during expansion we don't handle the case of at least one asm goto labels pointing to the fallthru block.
Comment 1 Jakub Jelinek 2013-10-09 19:21:57 UTC
I see multiple issues:
1) commit_one_edge_insertion doesn't seem to cope with asm goto,
   I think asm goto with only labels pointing to the fallthru bb's labels
   can happen both for the single_pred_p (e->dest) case, as well as the
   single_succ_p case.  JUMP_P in that case would be true, but we certainly
   can't insert in that case before the jump (aka asm goto).
2) during expansion, we apparently emit some further insns (some noop pseudo
   copy and unconditional jump) directly after asm goto into the same bb, then
   commit these edge insertions (so commit_one_edge_insertion actually doesn't
   see the asm goto at the end of the bb, but an unconditional jump) and only 
   later on split the bb in find_many_sub_basic_blocks.
Comment 2 Jakub Jelinek 2013-10-10 07:36:23 UTC
Created attachment 30974 [details]
gcc49-pr58670.patch

Untested fix.
Comment 3 Jakub Jelinek 2013-10-10 16:29:53 UTC
Author: jakub
Date: Thu Oct 10 16:29:50 2013
New Revision: 203383

URL: http://gcc.gnu.org/viewcvs?rev=203383&root=gcc&view=rev
Log:
	PR middle-end/58670
	* stmt.c (expand_asm_operands): Add FALLTHRU_BB argument,
	if any labels are in FALLTHRU_BB, use a special label emitted
	immediately after the asm goto insn rather than label_rtx
	of the LABEL_DECL.
	(expand_asm_stmt): Adjust caller.
	* cfgrtl.c (commit_one_edge_insertion): Force splitting of
	edge if the last insn in predecessor is a jump with single successor,
	but it isn't simplejump_p.

	* gcc.dg/torture/pr58670.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr58670.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgrtl.c
    trunk/gcc/stmt.c
    trunk/gcc/testsuite/ChangeLog
Comment 4 Jakub Jelinek 2013-10-10 16:39:54 UTC
Author: jakub
Date: Thu Oct 10 16:39:52 2013
New Revision: 203384

URL: http://gcc.gnu.org/viewcvs?rev=203384&root=gcc&view=rev
Log:
	PR middle-end/58670
	* stmt.c (expand_asm_operands): Add FALLTHRU_BB argument,
	if any labels are in FALLTHRU_BB, use a special label emitted
	immediately after the asm goto insn rather than label_rtx
	of the LABEL_DECL.
	(expand_asm_stmt): Adjust caller.
	* cfgrtl.c (commit_one_edge_insertion): Force splitting of
	edge if the last insn in predecessor is a jump with single successor,
	but it isn't simplejump_p.

	* gcc.dg/torture/pr58670.c: New test.

Added:
    branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/torture/pr58670.c
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/cfgrtl.c
    branches/gcc-4_8-branch/gcc/stmt.c
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
Comment 5 David 2014-05-28 00:20:10 UTC
What's the status here?  While there was a checkin (r203383 & r203384), this sample code still does not work correctly (4.9.0 x86_64-w64-mingw32).  The fall-thru case works, but if the jump inside the asm is taken, eax does not get set to zero when it clearly should.

My -O2 output is:

00000000004014e0 <foo>:
  4014e0:	test   ecx,ecx
  4014e2:	mov    DWORD PTR [rsp+0x10],edx
  4014e6:	jne    4014f3 <foo+0x13>
  4014e8:	bts    DWORD PTR [rsp+0x10],0x1
  4014ee:	jb     401500 <foo+0x20>
  4014f0:	xor    eax,eax
  4014f2:	ret    
  4014f3:	mov    eax,0xfffffffd
  4014f8:	nop    DWORD PTR [rax+rax*1+0x0]
  401500:	repz ret 
  401502:	data32 data32 data32 data32 nop WORD PTR cs:[rax+rax*1+0x0]
Comment 6 Richard Biener 2014-06-12 13:52:34 UTC
Unsetting target milestone of open non-regression bug from version of branch being closed.
Comment 7 Kai Tietz 2014-06-13 08:05:18 UTC
Hmm, recent tests with trunk (4.10) doesn't show the issues described in comment #5.
Comment 8 David 2014-06-13 09:08:06 UTC
I can confirm what Kai had to say in comment 7.  Using a newer build resolves both the original problem reported by Jakub and the problem I saw in comment 5.
Comment 9 Kai Tietz 2014-06-13 11:05:30 UTC
I've tested additional the 4.8.x branch, and it is fixed there too.
So issue seems to be fixed on all open branches.  Therefore close as fixed.
Comment 10 Nick Desaulniers 2018-09-07 18:32:56 UTC
Specifically, it seems this was fixed in the gcc-4.8.2 timeframe.  This shows no issue in gcc-4.8.2, but the issue from comment #5 in gcc-4.8.1.  It seems the Linux kernel works around this with a macro called `asm_volatile_goto` that simply inserts an empty asm block `asm("");` after an `asm goto`, which even in this reproducer fixes the problem.

https://godbolt.org/z/ZUDRnU