This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
[RFC][AArch64] function prologue analyzer in linux kernel
- From: AKASHI Takahiro <takahiro dot akashi at linaro dot org>
- To: GCC Development <gcc at gcc dot gnu dot org>, Will Deacon <Will dot Deacon at arm dot com>
- Date: Thu, 24 Dec 2015 16:57:54 +0900
Hi,
I'm the author of ftrace support for arm64 (aarch64) Linux. As part of
ftrace, we can use the "stack tracer", which reports the maximum usage
of the kernel stack:
---8<---
# cat /sys/kernel/debug/tracing/stack_max_size
4088
# cat /sys/kernel/debug/tracing/stack_trace
        Depth    Size   Location    (49 entries)
        -----    ----   --------
  0)     4088      16   __local_bh_enable_ip+0x18/0xd8
  1)     4072      32   _raw_read_unlock_bh+0x38/0x48
  2)     4040      32   xs_udp_write_space+0x44/0x50
  3)     4008      32   sock_wfree+0x88/0x90
  4)     3976      32   skb_release_head_state+0x70/0xa0
[snip]
 44)      808      32   load_elf_binary+0x29c/0x10d0
 45)      776     224   search_binary_handler+0xbc/0x208
 46)      552      96   do_execveat_common.isra.15+0x4e4/0x690
 47)      456     112   SyS_execve+0x4c/0x60
 48)      344     344   el0_svc_naked+0x24/0x28
--->8---
Here, "Depth" (and hence "Size") is determined, after scanning the stack,
from the saved fp (more precisely, the saved value + 0x10) in each stack
frame rather than from the stack pointer, which is not saved. (Please note
that the arm64 kernel is always compiled with -fno-omit-frame-pointer.)
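In the listing above, each "Size" is the difference between two consecutive "Depth" values, and the last entry's size equals its own depth. A minimal sketch of that arithmetic, with a hypothetical helper name that is not taken from the kernel sources:

```c
#include <stddef.h>

/*
 * Illustration only: derive per-entry sizes from a list of depths
 * ordered from the deepest entry to the shallowest one.
 * size[i] = depth[i] - depth[i + 1]; the last entry keeps its own depth.
 */
void sizes_from_depths(const unsigned long *depth, unsigned long *size,
                       size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned long next = (i + 1 < n) ? depth[i + 1] : 0;

        size[i] = depth[i] - next;
    }
}
```

Any error in how a "Depth" value is attributed to a function therefore shows up directly in that function's "Size".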
However, fp is updated only after branching into a function, and the
prologue that sets it allocates not only the function's stack frame but
also the callee's local variables. So using this saved value of fp, or
the sp of the caller function, as "Depth" is not appropriate for
calculating the stack size of a function.
So I'd like to introduce a function prologue analyzer to determine
the size allocated by a function's prologue and subtract it from "Depth".
My implementation of this analyzer has been submitted to the
linux-arm-kernel mailing list [1].
I borrowed some ideas from gdb's analyzer [2], especially the
instruction-decoding loop and the rule of stopping decoding at the exit
of a basic block, but implemented my own simplified version because the
gdb version seems to do a bit more than what we need here.
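To make the idea concrete, here is a minimal sketch of such a decoding loop. It is neither the patch in [1] nor gdb's code. The function names are hypothetical, and it assumes that only two prologue forms matter: "stp Xt, Xt2, [sp, #-N]!" and "sub sp, sp, #N". It sums the stack space those instructions allocate and stops at the first branch, i.e. at the end of the basic block:

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if insn is a branch, i.e. it ends the current basic block. */
static int is_branch(uint32_t insn)
{
    return (insn & 0x7C000000u) == 0x14000000u   /* b, bl */
        || (insn & 0xFF000010u) == 0x54000000u   /* b.cond */
        || (insn & 0x7E000000u) == 0x34000000u   /* cbz, cbnz */
        || (insn & 0x7E000000u) == 0x36000000u   /* tbz, tbnz */
        || (insn & 0xFE000000u) == 0xD6000000u;  /* br, blr, ret, ... */
}

/*
 * Return the number of bytes by which the scanned instructions lower sp.
 * Only the two forms named above are recognized (an assumption of this
 * sketch); every other instruction is skipped.
 */
long prologue_stack_size(const uint32_t *insns, size_t count)
{
    long size = 0;

    for (size_t i = 0; i < count && !is_branch(insns[i]); i++) {
        uint32_t insn = insns[i];

        if ((insn & 0xFFC003E0u) == 0xA98003E0u) {
            /* stp Xt, Xt2, [sp, #imm]! : imm7 is signed, scaled by 8 */
            int imm7 = (int)((insn >> 15) & 0x7F);

            imm7 = (imm7 ^ 0x40) - 0x40;    /* sign-extend 7 bits */
            size -= (long)imm7 * 8;
        } else if ((insn & 0xFF8003FFu) == 0xD10003FFu) {
            /* sub sp, sp, #imm12{, lsl #12} */
            long imm = (long)((insn >> 10) & 0xFFF);

            if (insn & (1u << 22))
                imm <<= 12;
            size += imm;
        }
    }
    return size;
}
```

For example, the two-instruction prologue "stp x29, x30, [sp, #-16]!; mov x29, sp" (0xA9BF7BFD, 0x910003FD) yields 16. A real analyzer would have to handle more instruction forms than this sketch does.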
Anyhow, since it is somewhat heuristic (and may not be maintainable in
the long term), could you please review it from the broader viewpoint of
the toolchain?
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/393721.html
[2] aarch64_analyze_prologue() in gdb/aarch64-tdep.c
Thanks,
-Takahiro AKASHI