-fpatchable-function-entry=N generates N NOPs at the beginning of each function. In a binary compiled by GCC with this option, the function entry address is inconsistent with the value of DW_AT_low_pc in the corresponding DWARF data. A toy example illustrates the issue.

1. Compile the source file toy_exam.c

$ gcc -o toy_exam.gcc toy_exam.c -g -gdwarf-4 -fpatchable-function-entry=2 -save-temps

2. Check the symbol address of the function fun_a

$ readelf -s toy_exam.gcc |grep -w fun_a
    95: 00000000000007f0    80 FUNC    GLOBAL DEFAULT   13 fun_a

3. Display the assembler contents

$ objdump -d toy_exam.gcc |grep -A 8 -w \<fun_a\>:
00000000000007f0 <fun_a>:
 7f0:   d503201f        nop
 7f4:   d503201f        nop
 7f8:   a9be7bfd        stp     x29, x30, [sp, #-32]!
 7fc:   910003fd        mov     x29, sp
 800:   52800040        mov     w0, #0x2                        // #2
 804:   b90017e0        str     w0, [sp, #20]
 808:   528000a0        mov     w0, #0x5                        // #5
 80c:   b9001be0        str     w0, [sp, #24]

4. Dump the DWARF info

$ llvm-dwarfdump toy_exam.gcc |grep -C 10 -w fun_a
0x00000315:   DW_TAG_subprogram
                DW_AT_external  (true)
                DW_AT_name      ("fun_a")
                DW_AT_decl_file ("/home/jianlin/code/test/toy_exam.c")
                DW_AT_decl_line (14)
                DW_AT_decl_column       (0x06)
                DW_AT_low_pc    (0x00000000000007f8)
                DW_AT_high_pc   (0x0000000000000840)
                DW_AT_frame_base        (DW_OP_call_frame_cfa)
                DW_AT_GNU_all_tail_call_sites   (true)
                DW_AT_sibling   (0x0000035d)

5. Assembler code

fun_a:
        .section        __patchable_function_entries
        .8byte  .LPFE2
        .text
.LPFE2:
        nop
        nop
.LFB7:
        .loc 1 15 1
        .cfi_startproc
        stp     x29, x30, [sp, -32]!
        .cfi_def_cfa_offset 32
        .cfi_offset 29, -32
        .cfi_offset 30, -24
        mov     x29, sp
        .loc 1 16 13

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/gcc-trunk/libexec/gcc/aarch64-unknown-linux-gnu/11.0.0/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: ../gcc/configure --prefix=/usr/gcc-trunk --enable-languages=c,c++,fortran --disable-libquadmath --disable-libquadmath-support --disable-werror --disable-bootstrap --enable-gold
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20210119 (experimental) (GCC)

The first instruction of fun_a as indicated by DW_AT_low_pc (0x7f8) does not include the NOPs, while the symbol fun_a is at 0x7f0. GCC 9, GCC 10 and the latest master branch were each tested, and the results were the same.
*** Bug 99836 has been marked as a duplicate of this bug. ***
Confirmed on today's master (June 29th - 2021).
Confirmed.

void __attribute__((noipa)) foo() { }
int main()
{
  foo ();
}

> gcc-10 t.c -g -fpatchable-function-entry=16 -O
> gdb ./a.out
GNU gdb (GDB; SUSE Linux Enterprise 15) 10.1
...
(gdb) disassemble foo
Dump of assembler code for function foo:
   0x00000000004004a6 <+0>:     ret
End of assembler dump.
(gdb) b foo
Breakpoint 1 at 0x400496 (2 locations)

so the symbol is at 0x400496 but low-pc is 0x4004a6

On trunk the FDE start/end labels look correct:

foo:
.LFB0:
        .cfi_startproc
        .section        __patchable_function_entries,"awo",@progbits,foo
        .align 8
        .quad   .LPFE1
        .text
.LPFE1:
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        ret
        .cfi_endproc
.LFE0:

and it seems to work there. Quick verification shows it works with GCC 11+ but fails with GCC 10 which has

foo:
        .section        __patchable_function_entries,"aw",@progbits
        .align 8
        .quad   .LPFE1
        .text
.LPFE1:
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
.LFB0:
        .file 1 "t.c"
        .loc 1 1 35 view -0
        .cfi_startproc
        .loc 1 1 37 view .LVU1
        ret
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
Martin, can you bisect what fixed it?
(In reply to Richard Biener from comment #4)
> Martin, can you bisect what fixed it?

Sure. Please help me how to verify what is a correct output? Isn't that related to DWARF 5 change done in GCC 11?
(In reply to Martin Liška from comment #5)
> (In reply to Richard Biener from comment #4)
> > Martin, can you bisect what fixed it?
>
> Sure. Please help me how to verify what is a correct output? Isn't that
> related to DWARF 5 change done in GCC 11?

It's consistent with -gdwarf-2 -gstrict-dwarf as well, so no. A broken executable will output a short disassembly from gdb:

> gcc-10 t.c -g -O -fpatchable-function-entry=16
> gdb -ex 'disassemble foo' -batch ./a.out | wc -l
3

actual output is

Dump of assembler code for function foo:
   0x00000000004004a6 <+0>:     ret
End of assembler dump.

where a correctly working one is

> gdb -ex 'disassemble foo' -batch ./a.out | wc -l
19

with output

Dump of assembler code for function foo:
   0x0000000000400476 <+0>:     nop
   0x0000000000400477 <+1>:     nop
   0x0000000000400478 <+2>:     nop
   0x0000000000400479 <+3>:     nop
   0x000000000040047a <+4>:     nop
   0x000000000040047b <+5>:     nop
   0x000000000040047c <+6>:     nop
   0x000000000040047d <+7>:     nop
   0x000000000040047e <+8>:     nop
   0x000000000040047f <+9>:     nop
   0x0000000000400480 <+10>:    nop
   0x0000000000400481 <+11>:    nop
   0x0000000000400482 <+12>:    nop
   0x0000000000400483 <+13>:    nop
   0x0000000000400484 <+14>:    nop
   0x0000000000400485 <+15>:    nop
   0x0000000000400486 <+16>:    ret
End of assembler dump.
Fixed on master with r11-1245-g3dcea658c9e2ac84.
(In reply to Martin Liška from comment #7)
> Fixed on master with r11-1245-g3dcea658c9e2ac84.

OK, so that's target specific then, thus aarch64 could still be broken. assemble_start_function is the one invoking the target hook (and eventually its default implementation) that emits the patchable area.
Hi, is somebody working on fixing this on arm64? If not, I will work on it. The Linux kernel needs this fixed for systemtap and perf probe.
Patch for arm64: https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607601.html
The master branch has been updated by Sebastian Pop <spop@gcc.gnu.org>:

https://gcc.gnu.org/g:09c91caeb84e7c3609a12a53b57e5219a1dd2b15

commit r13-4561-g09c91caeb84e7c3609a12a53b57e5219a1dd2b15
Author: Sebastian Pop <spop@amazon.com>
Date:   Wed Nov 30 19:45:24 2022 +0000

    AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]

    Currently patchable area is at the wrong place on AArch64.  It is placed
    immediately after function label, before .cfi_startproc.  This patch
    adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
    modifies aarch64_print_patchable_function_entry to avoid placing
    patchable area before .cfi_startproc.

    gcc/
            PR target/98776
            * config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
            Declared.
            * config/aarch64/aarch64.cc (aarch64_print_patchable_function_entry):
            Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
            (aarch64_output_patchable_area): New.
            * config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
            (patchable_area): Define.

    gcc/testsuite/
            PR target/98776
            * gcc.target/aarch64/pr98776.c: New.
            * gcc.target/aarch64/pr92424-2.c: Adjust pattern.
            * gcc.target/aarch64/pr92424-3.c: Adjust pattern.
The releases/gcc-10 branch has been updated by Sebastian Pop <spop@gcc.gnu.org>:

https://gcc.gnu.org/g:59bba6f9dc6dcfefe96e6fad677614f39928564e

commit r10-11122-g59bba6f9dc6dcfefe96e6fad677614f39928564e
Author: Sebastian Pop <spop@amazon.com>
Date:   Wed Nov 30 19:45:24 2022 +0000

    AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]

    Currently patchable area is at the wrong place on AArch64.  It is placed
    immediately after function label, before .cfi_startproc.  This patch
    adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
    modifies aarch64_print_patchable_function_entry to avoid placing
    patchable area before .cfi_startproc.

    gcc/
            PR target/98776
            * config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
            Declared.
            * config/aarch64/aarch64.c (aarch64_print_patchable_function_entry):
            Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
            (aarch64_output_patchable_area): New.
            * config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
            (patchable_area): Define.

    gcc/testsuite/
            PR target/98776
            * gcc.target/aarch64/pr98776.c: New.
            * gcc.target/aarch64/pr92424-2.c: Adjust pattern.
            * gcc.target/aarch64/pr92424-3.c: Adjust pattern.
The releases/gcc-11 branch has been updated by Sebastian Pop <spop@gcc.gnu.org>:

https://gcc.gnu.org/g:50f7161448a19c4fa355c7c652e26b47ceb36cc4

commit r11-10422-g50f7161448a19c4fa355c7c652e26b47ceb36cc4
Author: Sebastian Pop <spop@amazon.com>
Date:   Wed Nov 30 19:45:24 2022 +0000

    AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]

    Currently patchable area is at the wrong place on AArch64.  It is placed
    immediately after function label, before .cfi_startproc.  This patch
    adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
    modifies aarch64_print_patchable_function_entry to avoid placing
    patchable area before .cfi_startproc.

    gcc/
            PR target/98776
            * config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
            Declared.
            * config/aarch64/aarch64.c (aarch64_print_patchable_function_entry):
            Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
            (aarch64_output_patchable_area): New.
            * config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
            (patchable_area): Define.

    gcc/testsuite/
            PR target/98776
            * gcc.target/aarch64/pr98776.c: New.
            * gcc.target/aarch64/pr92424-2.c: Adjust pattern.
            * gcc.target/aarch64/pr92424-3.c: Adjust pattern.
The releases/gcc-12 branch has been updated by Sebastian Pop <spop@gcc.gnu.org>:

https://gcc.gnu.org/g:7525c9d7e72ac3818e08fe7aa98396bd41e4ec8c

commit r12-8987-g7525c9d7e72ac3818e08fe7aa98396bd41e4ec8c
Author: Sebastian Pop <spop@amazon.com>
Date:   Wed Nov 30 19:45:24 2022 +0000

    AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]

    Currently patchable area is at the wrong place on AArch64.  It is placed
    immediately after function label, before .cfi_startproc.  This patch
    adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
    modifies aarch64_print_patchable_function_entry to avoid placing
    patchable area before .cfi_startproc.

    gcc/
            PR target/98776
            * config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
            Declared.
            * config/aarch64/aarch64.cc (aarch64_print_patchable_function_entry):
            Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
            (aarch64_output_patchable_area): New.
            * config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
            (patchable_area): Define.

    gcc/testsuite/
            PR target/98776
            * gcc.target/aarch64/pr98776.c: New.
            * gcc.target/aarch64/pr92424-2.c: Adjust pattern.
            * gcc.target/aarch64/pr92424-3.c: Adjust pattern.
Fixed for arm64 as well on master, and backported to active branches gcc-12, 11, and 10.