Bug 47253 - Conditional jump to tail function is not generated
Summary: Conditional jump to tail function is not generated
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.6.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Duplicates: 60159 69576 109844 119299
Depends on:
Blocks:
 
Reported: 2011-01-11 00:45 UTC by Zdenek Sojka
Modified: 2025-03-14 21:01 UTC
CC List: 11 users

See Also:
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu
Build:
Known to work:
Known to fail: 3.3.6, 3.4.6, 4.4.5, 4.6.0
Last reconfirmed: 2011-01-11 14:09:17


Attachments
A patch (2.44 KB, patch)
2025-02-10 07:39 UTC, H.J. Lu
An updated patch (2.86 KB, patch)
2025-02-10 10:28 UTC, H.J. Lu
An improved patch with tests (3.99 KB, patch)
2025-02-10 14:20 UTC, H.J. Lu
An updated patch (5.21 KB, patch)
2025-02-11 02:16 UTC, H.J. Lu
A new patch (5.19 KB, patch)
2025-02-11 06:54 UTC, H.J. Lu
A patch to fold jump table with tests (2.83 KB, patch)
2025-02-11 06:55 UTC, H.J. Lu
A patch to fold sibcall (5.17 KB, patch)
2025-02-11 11:05 UTC, H.J. Lu
An updated patch to fold sibcall in jump table (3.29 KB, patch)
2025-02-11 11:06 UTC, H.J. Lu

Description Zdenek Sojka 2011-01-11 00:45:23 UTC
I hope the summary is descriptive enough.
Take the following code:

----- testcase.c -----
void bar(void);

void foo(int c)
{
	if (c) bar();
}
----------------------

With -O3, gcc generated this code:
foo:
.LFB0:
	.cfi_startproc
	test	edi, edi	# c
	jne	.L4	#,
	rep
	ret
	.p2align 4,,10
	.p2align 3
.L4:
	jmp	bar	#
	.cfi_endproc


and with -Os:
foo:
.LFB0:
	.cfi_startproc
	test	edi, edi	# c
	je	.L1	#,
	jmp	bar	#
.L1:
	ret
	.cfi_endproc


while better would be:
foo:
	test	edi, edi
	jne	.L1
	rep # only without -Os
	ret

I tested 3.3.6, 3.4.6, 4.4.5 and 4.6.0; none of them generates the "better" code.
Comment 1 H.J. Lu 2011-01-11 03:16:32 UTC
(In reply to comment #0)
> I hope the summary is descriptive enough.
> Take the following code:
> 
> ----- testcase.c -----
> void bar(void);
> 
> void foo(int c)
> {
>     if (c) bar();
> }
..
> while better would be:
> foo:
>     test    edi, edi
>     jne    .L1
>     rep # only without -Os
>     ret
> 

Where is .L1?
Comment 2 Zdenek Sojka 2011-01-11 07:02:14 UTC
(In reply to comment #1)
> 
> Where is .L1?

Thanks, it should be:

foo:
    test    edi, edi
    jne    bar
    rep # only without -Os
    ret
Comment 3 H.J. Lu 2011-01-11 13:35:10 UTC
jne only takes 8bit displacement.
Comment 4 Zdenek Sojka 2011-01-11 14:04:13 UTC
(In reply to comment #3)
> jne only takes 8bit displacement.

There are two opcodes for jne: 0x75, taking an 8-bit displacement, and 0x0f 0x85, taking a 16/32-bit displacement:

(pasted from IA-32 Intel Architecture Software Developer’s Manual Volume 2: Instruction Set Reference)

75 cb
JNE rel8
Jump short if not equal (ZF=0)

0F 85 cw/cd
JNE rel16/32
Jump near if not equal (ZF=0)


Jcc is no different from JMP: both can take an 8/(16/)32-bit displacement, even in 64-bit mode.
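
To make the two encodings concrete, here is a minimal C sketch of how an encoder might choose between them. The helper name `encode_jne` and its interface are hypothetical and are not taken from GCC or binutils; only the opcode bytes come from the manual excerpt above. The sketch assumes 32-bit or 64-bit mode, where the near form carries a rel32, and it measures the displacement from the end of the instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC or binutils API): encode "jne target"
   for an instruction located at insn_addr.  Returns the number of bytes
   written to out (2 or 6), or 0 if the target is out of rel32 range.  */
size_t encode_jne(uint8_t out[6], int64_t insn_addr, int64_t target)
{
    /* Short form: 2 bytes, displacement relative to the next instruction.  */
    int64_t rel8 = target - (insn_addr + 2);
    if (rel8 >= INT8_MIN && rel8 <= INT8_MAX) {
        out[0] = 0x75;                      /* JNE rel8 */
        out[1] = (uint8_t)rel8;
        return 2;
    }

    /* Near form: 6 bytes, so the displacement base moves accordingly.  */
    int64_t rel32 = target - (insn_addr + 6);
    if (rel32 < INT32_MIN || rel32 > INT32_MAX)
        return 0;

    uint32_t d = (uint32_t)rel32;
    out[0] = 0x0F;                          /* JNE rel32, two-byte opcode */
    out[1] = 0x85;
    out[2] = (uint8_t)(d & 0xFF);           /* little-endian displacement */
    out[3] = (uint8_t)((d >> 8) & 0xFF);
    out[4] = (uint8_t)((d >> 16) & 0xFF);
    out[5] = (uint8_t)((d >> 24) & 0xFF);
    return 6;
}
```

For a jump to an external function such as bar, the distance is not known at assembly time, so one would expect the near form with a relocation rather than the short form.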
Comment 5 H.J. Lu 2017-10-24 21:08:31 UTC
*** Bug 69576 has been marked as a duplicate of this bug. ***
Comment 6 Andrew Pinski 2021-08-19 03:31:33 UTC
*** Bug 60159 has been marked as a duplicate of this bug. ***
Comment 7 Andrew Pinski 2023-05-13 17:51:07 UTC
*** Bug 109844 has been marked as a duplicate of this bug. ***
Comment 8 Jan Schultke 2025-02-08 09:28:31 UTC
Another repro: https://godbolt.org/z/K4zc6GMjY

Given the code:

> void t(), f();
> 
> void decide(bool ok) {
>     if (ok) {
>         t();
>     } else  {
>         f();
>     }
> }


GCC emits:

> decide(bool):
>         test    dil, dil
>         je      .L2
>         jmp     t()
> .L2:
>         jmp     f()


This contains a je to a jmp instruction, which is obviously redundant. The expected output (Clang, MSVC) is:

> decide(bool):
>         test    edi, edi
>         je      f()@PLT
>         jmp     t()@PLT
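
The redundancy can be modelled in a few lines of C. This is a toy sketch over a made-up instruction list, not GCC's RTL and not the approach of any patch attached to this bug; the `insn` struct and `fold_cond_jump` are hypothetical names. It retargets a conditional jump whose destination label is immediately followed by an unconditional jump.

```c
#include <stddef.h>
#include <string.h>

enum kind { JCC, JMP, LABEL, RET };

/* Toy instruction: name is the jump target for JCC/JMP, or the label
   name for LABEL; unused for RET.  Hypothetical, for illustration only.  */
struct insn {
    enum kind kind;
    const char *name;
};

/* For every "jcc L" where "L:" is directly followed by "jmp X",
   rewrite the conditional jump to "jcc X".  Returns the number of
   conditional jumps that were retargeted.  */
size_t fold_cond_jump(struct insn *code, size_t n)
{
    size_t folded = 0;
    for (size_t i = 0; i < n; i++) {
        if (code[i].kind != JCC)
            continue;
        for (size_t j = 0; j + 1 < n; j++) {
            if (code[j].kind == LABEL
                && strcmp(code[j].name, code[i].name) == 0
                && code[j + 1].kind == JMP) {
                code[i].name = code[j + 1].name;
                folded++;
                break;
            }
        }
    }
    return folded;
}
```

Applied to the sequence `je .L2; jmp t; .L2: jmp f`, the first instruction becomes `je f`. After the rewrite nothing jumps to `.L2` any more, so the label and the `jmp f` behind it are dead code that a later cleanup would still have to remove.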
Comment 9 H.J. Lu 2025-02-10 07:39:26 UTC
Created attachment 60445 [details]
A patch

[hjl@gnu-tgl-3 pr47253]$ cat y.c
void t(), f();
 
void
decide(bool ok)
{
  if (ok)
    t();
  else
    f();
}
[hjl@gnu-tgl-3 pr47253]$ make y.s
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -O2 -S y.c
[hjl@gnu-tgl-3 pr47253]$ cat y.s
	.file	"y.c"
	.text
	.p2align 4
	.globl	decide
	.type	decide, @function
decide:
.LFB0:
	.cfi_startproc
	testb	%dil, %dil
	je	f
	jmp	t
	.p2align 4,,10
	.p2align 3
.L2:
	jmp	f
	.cfi_endproc
.LFE0:
	.size	decide, .-decide
	.ident	"GCC: (GNU) 15.0.1 20250210 (experimental)"
	.section	.note.GNU-stack,"",@progbits
[hjl@gnu-tgl-3 pr47253]$ 

But the now-unreachable basic block at .L2 isn't removed.
Comment 10 H.J. Lu 2025-02-10 10:28:39 UTC
Created attachment 60446 [details]
An updated patch

This patch deletes the unreachable block.
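
As a rough illustration of what deleting such a block involves, here is a toy C sketch over a made-up instruction list. It is not the attached patch and does not use GCC's CFG machinery; the `insn` struct, `is_referenced` and `remove_unreachable` are hypothetical names. A labelled block is dropped when no jump targets its label and the preceding kept instruction cannot fall through into it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum kind { JCC, JMP, LABEL, RET };

/* Toy instruction, for illustration only.  */
struct insn {
    enum kind kind;
    const char *name;   /* jump target or label name; NULL for RET */
};

/* True if any jump in code[0..n) targets the label name.  */
static bool is_referenced(const struct insn *code, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if ((code[i].kind == JCC || code[i].kind == JMP)
            && strcmp(code[i].name, name) == 0)
            return true;
    return false;
}

/* Copy in[0..n) to out, skipping every block that starts at a label
   which is neither jumped to nor entered by fall-through.  Returns the
   number of instructions written.  Jumps inside dead blocks still count
   as references, so the result is conservative.  */
size_t remove_unreachable(const struct insn *in, size_t n, struct insn *out)
{
    size_t len = 0;
    bool dead = false;
    for (size_t i = 0; i < n; i++) {
        if (in[i].kind == LABEL) {
            /* A label at the very start is the function entry.  */
            bool fallthrough = len == 0
                || (out[len - 1].kind != JMP && out[len - 1].kind != RET);
            dead = !fallthrough && !is_referenced(in, n, in[i].name);
        }
        if (!dead)
            out[len++] = in[i];
    }
    return len;
}
```

On the sequence `je f; jmp t; .L2: jmp f` this keeps only the first two instructions. Unlabelled code after an unconditional jump is not handled by this sketch.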
Comment 11 H.J. Lu 2025-02-10 14:20:30 UTC
Created attachment 60448 [details]
An improved patch with tests
Comment 12 H.J. Lu 2025-02-11 02:16:06 UTC
Created attachment 60454 [details]
An updated patch

This patch passed "make bootstrap" with C/C++.
Comment 13 H.J. Lu 2025-02-11 06:54:16 UTC
Created attachment 60457 [details]
A new patch
Comment 14 H.J. Lu 2025-02-11 06:55:25 UTC
Created attachment 60458 [details]
A patch to fold jump table with tests

With this patch, I got:

[hjl@gnu-tgl-3 pr47253]$ cat j.c
void bar0 (void);
void bar1 (void);
void bar2 (void);
void bar3 (void);
void bar4 (void);

void
foo (int i)
{
  switch (i)
    {
    case 0: bar0 (); break;
    case 1: bar1 (); break;
    case 2: bar2 (); break;
    case 3: bar3 (); break;
    case 4: bar4 (); break;
    }
}
[hjl@gnu-tgl-3 pr47253]$ cat j.s
	.file	"j.c"
	.text
	.p2align 4
	.globl	foo
	.type	foo, @function
foo:
.LFB0:
	.cfi_startproc
	cmpl	$4, %edi
	ja	.L1
	movl	%edi, %edi
	jmp	*.L4(,%rdi,8)
	.section	.rodata
	.align 8
	.align 4
.L4:
	.quad	bar0
	.quad	bar1
	.quad	bar2
	.quad	bar3
	.quad	bar4
	.text
.L1:
	ret
	.cfi_endproc
.LFE0:
	.size	foo, .-foo
	.ident	"GCC: (GNU) 15.0.1 20250211 (experimental)"
	.section	.note.GNU-stack,"",@progbits
[hjl@gnu-tgl-3 pr47253]$
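
At the source level, the folded jump table above corresponds roughly to an array of function pointers indexed after an unsigned bounds check. The sketch below only illustrates that shape and is not part of the patch; the `barN` bodies, `last_called`, `handler` and `foo_table` are hypothetical stand-ins added so the example is self-contained.

```c
/* Stand-ins for the external bar0..bar4 of j.c; each one only records
   that it ran.  */
static int last_called = -1;

static void bar0(void) { last_called = 0; }
static void bar1(void) { last_called = 1; }
static void bar2(void) { last_called = 2; }
static void bar3(void) { last_called = 3; }
static void bar4(void) { last_called = 4; }

typedef void (*handler)(void);

/* Plays the role of the .L4 table of .quad entries.  */
static handler const table[] = { bar0, bar1, bar2, bar3, bar4 };

void foo_table(int i)
{
    /* Mirrors "cmpl $4, %edi; ja .L1": the unsigned comparison also
       rejects negative values of i.  */
    if ((unsigned int)i > 4u)
        return;

    /* Mirrors "jmp *.L4(,%rdi,8)".  Whether a compiler turns this call
       into an indirect tail jump depends on the compiler and flags.  */
    table[i]();
}
```

Calling `foo_table(2)` dispatches to `bar2`, while `foo_table(5)` and `foo_table(-1)` return without calling anything, matching the `ja .L1` path in the listing.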
Comment 15 H.J. Lu 2025-02-11 11:05:23 UTC
Created attachment 60460 [details]
A patch to fold sibcall
Comment 16 H.J. Lu 2025-02-11 11:06:23 UTC
Created attachment 60461 [details]
An updated patch to fold sibcall in jump table
Comment 17 H.J. Lu 2025-02-11 23:23:18 UTC
My current patches are at

https://gitlab.com/x86-gcc/gcc/-/tree/users/hjl/condjmp/master?ref_type=heads
Comment 18 H.J. Lu 2025-02-23 11:26:18 UTC
My current patches are at

https://gitlab.com/x86-gcc/gcc/-/tree/users/hjl/condjmp/v7?ref_type=heads

They passed GCC bootstrap and tests on x86-64.
Comment 19 Andrew Pinski 2025-03-14 21:01:46 UTC
*** Bug 119299 has been marked as a duplicate of this bug. ***