This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/46219] Generate indirect jump instruction on x86-64
- From: "adam at consulting dot net.nz" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 05 Sep 2014 00:29:10 +0000
- Subject: [Bug target/46219] Generate indirect jump instruction on x86-64
- Auto-submitted: auto-generated
- References: <bug-46219-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46219
Adam Warner <adam at consulting dot net.nz> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
            Version|4.6.0                       |4.9.1
         Resolution|FIXED                       |---
--- Comment #6 from Adam Warner <adam at consulting dot net.nz> ---
Great work, and thanks to Kai Tietz and Richard Henderson! I've come across a
situation where the complex indirect jmp is not generated, and I've crafted a
simplified test case:
$ cat gcc_bug_no_complex_indirect_jmp.c
#include <stdint.h>
typedef void (*fn0_t)(uint8_t *rdi);
typedef void (*fn1_t)(uint8_t *rdi, fn0_t *rsi);
fn0_t fn0_dispatch[256];
fn1_t fn1_dispatch[256];
void fn0_test(uint8_t *rdi) {
  fn0_t *rsi = fn0_dispatch;
  fn1_dispatch[rdi[1]](rdi, rsi);
}
int main(void) {
  asm volatile ("ret; jmpq *0x601140(,%rax,8)");
  return 0;
}
$ gcc --version
gcc (Debian 4.9.1-4) 4.9.1
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gcc -O3 gcc_bug_no_complex_indirect_jmp.c && objdump -d -m i386:x86-64:intel a.out | less
...
00000000004003c0 <main>:
  4003c0:   c3                      ret
  4003c1:   ff 24 c5 40 11 60 00    jmp    QWORD PTR [rax*8+0x601140]
...
00000000004004c0 <fn0_test>:
  4004c0:   0f b6 47 01             movzx  eax,BYTE PTR [rdi+0x1]
  4004c4:   be 40 09 60 00          mov    esi,0x600940
  4004c9:   48 8b 04 c5 40 11 60    mov    rax,QWORD PTR [rax*8+0x601140]
  4004d0:   00
  4004d1:   ff e0                   jmp    rax
...
The last two instructions (the 8-byte MOV and the 2-byte JMP) should be merged
into JMP QWORD PTR [rax*8+0x601140], which is a 7-byte instruction.
Fortuitously, fn0_test would then shrink from 19 bytes to 16 bytes in total
(no more than 16 bytes of machine code can be decoded in one clock cycle on
Intel Core 2).