This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] [AArch64] support -mfentry feature for arm64
- From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
- To: AKASHI Takahiro <takahiro dot akashi at linaro dot org>, Alexander Monakov <amonakov at ispras dot ru>
- Cc: Michael Matz <matz at suse dot de>, Maxim Kuvyrkov <maxim dot kuvyrkov at linaro dot org>, Li Bin <huawei dot libin at huawei dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Marcus Shawcroft <marcus dot shawcroft at arm dot com>, <richard dot earnshaw at arm dot com>, <andrew dot wafaa at arm dot com>, <geoff at infradead dot org>, <guohanjun at huawei dot com>, <felix dot yang at huawei dot com>, <jiangjiji at huawei dot com>, nd <nd at arm dot com>
- Date: Wed, 20 Apr 2016 17:44:53 +0100
- Subject: Re: [PATCH] [AArch64] support -mfentry feature for arm64
- References: <1457943260-30894-1-git-send-email-huawei dot libin at huawei dot com> <B6E2AECD-096C-489F-A217-8140E781E722 at linaro dot org> <alpine dot LSU dot 2 dot 20 dot 1604151739210 dot 20277 at wotan dot suse dot de> <alpine dot LNX dot 2 dot 20 dot 1604152026570 dot 26109 at monopod dot intra dot ispras dot ru> <alpine dot LNX dot 2 dot 20 dot 1604171752300 dot 26109 at monopod dot intra dot ispras dot ru> <alpine dot LSU dot 2 dot 20 dot 1604181408270 dot 20277 at wotan dot suse dot de> <20160419062845 dot GC13600 at linaro dot org> <alpine dot LNX dot 2 dot 20 dot 1604190936040 dot 5544 at monopod dot intra dot ispras dot ru> <20160420012553 dot GB1234 at linaro dot org>
On 20/04/16 02:25, AKASHI Takahiro wrote:
> On Tue, Apr 19, 2016 at 09:39:39AM +0300, Alexander Monakov wrote:
>> On Tue, 19 Apr 2016, AKASHI Takahiro wrote:
>>>>> But if Szabolcs' two-instruction
>>>>> sequence in the adjacent subthread is sufficient, this is moot.
>>>>
>>>> . It can also be solved by having just one NOP after the function label,
>>>> and a number of them before, then no thread can be in the nop pad. That
>>>> seems to indicate that GCC should not try to be too clever and simply
>>>> leave the specified number of nops before and after the function label,
>>>> leaving safety measures to the patching infrastructure.
>>>
>>> I don't get this idea very well.
>>> How can the instructions *before* a function label be executed
>>> after branching into this function?
>>
>> The single nop after the function label is changed to a short backwards branch
>> to the instructions just before the function label.
>>
>> As a result, the last instruction in the pad would have to become a short
>> forward branch jumping over the backwards branch described above, to the first
>> real instruction of the function.
>
> So you mean something like:
> 1:
>     str x30, [sp, #-8]!
>     bl _tracefunc
>     ldr x30, [sp], #8
>     b 2f
> .global <function label>
>     b 1b
> 2:
>     <function prologue/body>
>     ...
> (We will not have to use x9 or else to preserve x30 here.)
>
> Interesting.
> Livepatch code in the kernel has an assumption that the address of
> "bl _tracefunc" be equal to <function label>, but a recent patch for
> PowerPC to support livepatch tries to ease this restriction [1],
> and so hopefully it won't be an issue.
> (I will have to dig into the kernel code to be sure that there are
> no other issues though.)
>
I think the ldr x30, [sp], #8 after the call to _tracefunc is not
OK for livepatching: _tracefunc changes the return address to the
new function to hijack the call, and the new function will not
restore the stack (this can be solved if the new function can be
instrumented, but that is fiddly).

sp also has to be 16-byte aligned, so the options are

    str x30, [sp, #-16]!
    bl _tracefunc

or

    mov x9, x30
    bl _tracefunc

where _tracefunc is responsible for restoring x30 and sp, and this
sequence can come before or after the function symbol.
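
To illustrate the alignment point, here is a minimal Python sketch
(not from the thread; the starting sp value and the helper names are
made up for illustration). It only models how a pre-indexed store
moves sp and checks the result against the 16-byte rule:

```python
def push_pre_index(sp, imm):
    """Model `str x30, [sp, #imm]!`: sp becomes sp + imm before the store."""
    return sp + imm


def is_16_byte_aligned(sp):
    """True if sp satisfies the 16-byte stack alignment requirement."""
    return sp % 16 == 0


# Hypothetical, already aligned stack pointer at function entry.
SP_AT_ENTRY = 0x7FFFF000

for imm in (-8, -16):
    new_sp = push_pre_index(SP_AT_ENTRY, imm)
    print(f"str x30, [sp, #{imm}]! -> sp = {new_sp:#x}, "
          f"aligned = {is_16_byte_aligned(new_sp)}")
```

Under these assumptions only the #-16 form leaves sp aligned, which
is why the 8-byte push from the earlier proposal is replaced.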
If it comes before, then

1:
    <save x30>
    bl _tracefunc
    b 2f
func:
    b 1b
2:
    <prologue>

the trace-disabled case is better (only one nop), but I think it
would mean more kernel work (the current code assumes that
bl _tracefunc is nopped, so whenever tracing is enabled a different
tracefunc target may be used in the atomic update; I don't know if
this is necessary, though).
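
The control flow of that layout can be sketched with a toy model in
Python. This is only an illustration under simplifying assumptions:
the instruction names are placeholders, bl is treated as returning to
the next instruction (which is not what happens when _tracefunc
hijacks the call for livepatching), and run() and layout() are
hypothetical helpers, not anything from GCC or the kernel.

```python
def run(program, entry, limit=16):
    """Walk a toy instruction list from index `entry`; return the ops executed."""
    executed = []
    pc = entry
    while pc < len(program) and len(executed) < limit:
        op, target = program[pc]
        executed.append(op)
        # Only the unconditional branch `b` changes the flow in this model.
        pc = target if op == "b" else pc + 1
    return executed


FUNC = 3  # index of the instruction at the function label


def layout(tracing_enabled):
    """Pad before the label, one patchable instruction after it."""
    at_label = ("b", 0) if tracing_enabled else ("nop", None)
    return [
        ("save_x30", None),       # 1:   <save x30>
        ("bl _tracefunc", None),  #      bl _tracefunc
        ("b", 4),                 #      b 2f
        at_label,                 # func: nop, or b 1b when tracing
        ("prologue", None),       # 2:   <prologue>
    ]


print(run(layout(False), FUNC))
print(run(layout(True), FUNC))
```

In the disabled case the walk executes just the nop and the prologue;
in the enabled case it goes through the pad and comes back to the
prologue via the forward branch.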
It is probably only worth inventing something new for aarch64 in
GCC if the kernel can use it consistently across targets, or if it
can cover other significant use cases. It is not clear that the
various flexible nop-padding solutions are more useful than the
simple two-instruction sequence, which kernel tools can already
deal with.

So it seems to me that

func:
    mov x9, x30
    bl __fentry__
    <prologue>

is still the best option, with a new -mfentry option for aarch64
(then we can keep the default -pg behaviour for backward
compatibility and work similarly to x86 with -mfentry). It does not
solve the more general instrumentation problem, but that would
require more analysis.
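
For reference, a rough Python sketch of the instruction words a
patching tool would see at such an entry. The A64 encodings of nop,
mov x9, x30 and bl are the standard ones, but the addresses and the
helper names are hypothetical, and the assumption that only the bl is
switched to a nop while the mov stays in place is mine, not something
settled in this thread.

```python
NOP = 0xD503201F         # A64 encoding of `nop`
MOV_X9_X30 = 0xAA1E03E9  # A64 encoding of `mov x9, x30` (orr x9, xzr, x30)


def encode_bl(pc, target):
    """Return the A64 encoding of `bl target` placed at address pc."""
    offset = target - pc
    if offset % 4 != 0:
        raise ValueError("branch offset must be a multiple of 4")
    if not -(1 << 27) <= offset < (1 << 27):
        raise ValueError("target out of range for bl (+/-128 MiB)")
    # imm26 holds the signed word offset.
    return 0x94000000 | ((offset >> 2) & 0x03FFFFFF)


def entry_words(func_addr, fentry_addr, tracing_enabled):
    """Two instruction words at the function entry.

    Assumption: the mov is left alone and only the second word is
    switched between `bl __fentry__` and `nop`.
    """
    bl_pc = func_addr + 4  # the bl is the second instruction
    second = encode_bl(bl_pc, fentry_addr) if tracing_enabled else NOP
    return [MOV_X9_X30, second]


# Hypothetical addresses, for illustration only.
FUNC_ADDR = 0x10000
FENTRY_ADDR = 0x20000
print([hex(w) for w in entry_words(FUNC_ADDR, FENTRY_ADDR, True)])
print([hex(w) for w in entry_words(FUNC_ADDR, FENTRY_ADDR, False)])
```

Since only one 4-byte word changes between the two states, this is
the kind of single-instruction update the existing tools expect.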
(On x86, GCC also provides -mrecord-mcount and -mnop-mcount to
record the nopped-out mcount call sites, but the kernel seems to
use its own tool to do that by looking for the mcount/fentry call
relocs, so they are probably not needed.)
> Thanks,
> -Takahiro AKASHI
>
> [1] http://lkml.iu.edu//hypermail/linux/kernel/1604.1/04111.html and
> http://lkml.iu.edu//hypermail/linux/kernel/1604.1/04112.html
>
>> Alexander
>