[PATCH][AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

Tue Jan 21 11:52:00 GMT 2020

Hi Szabolcs,

Answers from a Linux developer's perspective below.

On Mon, Jan 20, 2020 at 10:53:33AM +0000, Szabolcs Nagy wrote:
> On 19/01/2020 08:53, Fāng-ruì Sòng via gcc-patches wrote:
> > It'd be great to have some tests, e.g.
> > 
> > 1. -fpatchable-function-entry=0 -mbranch-protection=bti
> > 2. -fpatchable-function-entry=2 -mbranch-protection=bti
> > 
> > I have updated clang to emit `.Lfunc_begin0: bti c; nop; nop` for case 2.
> > The __patchable_function_entries entry points to .Lfunc_begin0 (bti c).
> > 
> > (The change is not included in the llvm 10.0 branch.)
> 
> i have to ask some linux developers which way they prefer:
> 
> e.g. -fpatchable-function-entry=3,1 is
> 
>  .section __patchable_function_entries
>  .8byte .Lpatch
>  .text
> .Lpatch:
>   nop
> func:
>   nop
>   nop
>   ...
> 
> with bti the code will be emitted as:
> 
> .Lpatch:
>   nop
> func:
>   bti c
>   nop
>   nop
>   ...

That looks good to me.
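
For anyone following along, the same N=3,M=1 layout can also be requested
per function with the patchable_function_entry attribute instead of the
command-line option. A minimal sketch, where the function name and body
are just placeholders; where the BTI C lands relative to .Lpatch is
exactly what this thread is deciding, so the comment below only describes
the non-BTI layout:

```c
/*
 * Ask for 3 NOPs in total, 1 of them placed before the function label,
 * matching -fpatchable-function-entry=3,1. "func" is a placeholder name.
 */
__attribute__((patchable_function_entry(3, 1)))
int func(int x)
{
	return x + 1;
}
```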

> but e.g. -fpatchable-function-entry=2,0 has two reasonable
> approaches with bti:
> 
> (a)
> 
> func:
> .Lpatch:
>   bti c
>   nop
>   nop
>   ...
> 
> (b)
> 
> func:
>   bti c
> .Lpatch:
>   nop
>   nop
>   ...

I had assumed (b), since that means .Lpatch consistently points to the
first NOP. In my mental model, that is more consistent than (a).

However, I can work with either so long as it's consistent.
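
To make the difference concrete, here is a rough sketch of what patching
code would need in order to cope with either layout. This is not actual
kernel code: the helper name is made up for illustration, and it assumes
M=0 and that the entry address comes from __patchable_function_entries.
The two constants are the A64 encodings of NOP (HINT #0) and BTI C
(HINT #34).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A64 encodings: NOP is HINT #0, BTI C is HINT #34. */
#define A64_NOP   0xd503201fu
#define A64_BTI_C 0xd503245fu

/*
 * Hypothetical helper: given the address recorded in
 * __patchable_function_entries, return the first patchable NOP.
 * With (b) the entry already points at a NOP and is returned as-is;
 * with (a) the entry points at the BTI C, which has to be skipped.
 */
const uint32_t *first_patchable_nop(const uint32_t *entry)
{
	assert(entry != NULL);
	return (*entry == A64_BTI_C) ? entry + 1 : entry;
}
```

With (b) the check never fires, so existing patching code keeps working
unmodified; with (a) every consumer needs something like the above.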

> i think (a) is more consistent across fancy N,M settings
> (bti is always included into the patch area, user needs
> to know to skip it), but (b) is more compatible with
> existing usage (M=0 is i believe the common setting and
> with that or with M=N the patching code does not need to
> know about bti, existing patching code works unmodified).
> 
> current llvm fix does (a), proposed gcc fix does (b),
> i guess we have to pick one.
>
> (solution (a) is a bit messier in gcc, because currently
> there is no target hook between the emission of .Lpatch
> and the nops, i avoided refactoring that code to get a
> backend only fix that is easy to backport, but (a) is
> possible to do with a bit more changes.)

As above, my weak preference is (b), but I can work with either. I just
need the behaviour to be consistent.

Was there a rationale for LLVM choosing (a) rather than (b), e.g. was
that also ease of implementation? If there isn't a rationale otherwise,
and if LLVM could also do (b), that would be nice from my PoV.

How big is "a bit more changes" for GCC?

Thanks,
Mark.