[Patch 0/X] HWASAN v3

Wed Jan 8 11:26:00 GMT 2020

Hi everyone,

I'm writing this email to summarise & publicise the state of this patch 
series, especially the difficulties around approval for GCC 10 mentioned 
on IRC.

The main obstacle seems to be that no maintainer feels they know enough 
about hwasan, or has seen enough justification that it is worthwhile, to 
approve the patch series.

Similarly, Martin has given a review of the parts of the code he can 
(thanks!), but doesn't feel he can do a deep review of the code related 
to the RTL hooks and stack expansion -- hence that part is as yet not 
reviewed in-depth.

The questions around justification raised on IRC are mainly that it 
seems like a proof-of-concept for MTE rather than a stand-alone usable 
sanitizer, especially since in the GNU world hwasan-instrumented code 
is not really ready for production: we can only use the 
"interceptor ABI" rather than the "platform ABI".  This restriction 
exists because there is no version of glibc with the modifications 
required to provide the "platform ABI".

(n.b. since https://reviews.llvm.org/D69574 the code generation for 
these ABIs is the same.)

From my perspective the reasons that make HWASAN useful in itself are:

1) Much less memory usage.

From a back-of-the-envelope calculation based on the table of memory 
overhead from over-alignment in the hwasan paper 
(https://arxiv.org/pdf/1802.09517.pdf), I guess hwasan-instrumented code 
has a memory overhead of about 1.1x (~4% from over-alignment and ~6.25% 
from shadow memory), while asan seems to have an overhead somewhere in 
the range 1.5x - 3x.
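
As a rough sketch of the arithmetic behind that 1.1x guess (the function 
names are made up for illustration, and I'm assuming the ~6.25% comes from 
one shadow byte per 16-byte granule; the ~4% over-alignment figure is just 
the estimate quoted above, not a measurement):

```python
# Back-of-the-envelope estimate only, not a measurement.
# Assumption: one shadow byte per 16-byte granule, i.e. 1/16 = 6.25%.
HWASAN_GRANULE_BYTES = 16

# Assumption: the ~4% over-alignment estimate quoted in this email.
OVERALIGNMENT_FRACTION = 0.04


def shadow_overhead(granule_bytes: int) -> float:
    """Fraction of extra memory spent on shadow bytes (one per granule)."""
    return 1.0 / granule_bytes


def total_overhead_factor(overalignment: float, shadow: float) -> float:
    """Memory use relative to an uninstrumented run (1.0 means no overhead)."""
    return 1.0 + overalignment + shadow


if __name__ == "__main__":
    shadow = shadow_overhead(HWASAN_GRANULE_BYTES)
    factor = total_overhead_factor(OVERALIGNMENT_FRACTION, shadow)
    print(f"shadow: {shadow:.2%}, total: {factor:.2f}x")
```

This ignores everything except over-alignment and shadow memory, so it 
should be read as a lower bound on the real overhead.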

Maybe there's some data out there comparing total overheads that I 
haven't found? (I'd appreciate a reference if anyone has that info).

2) Available on more architectures than MTE.

HWASAN only requires TBI, which is a feature of all AArch64 machines, 
while MTE will be an optional extension and only available on certain 
architectures.
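
To illustrate why TBI is all that is needed, here is a minimal model of 
keeping a tag in the top byte of a 64-bit pointer and comparing it with 
the tag recorded for the memory being accessed.  This is only a sketch of 
the idea using plain integers; the helper names are hypothetical and it 
is not the code the patch series generates:

```python
# Minimal model of top-byte pointer tagging; pointers are plain integers.
TAG_SHIFT = 56                       # tag occupies bits 56-63
ADDRESS_MASK = (1 << TAG_SHIFT) - 1  # low 56 bits hold the address


def set_tag(pointer: int, tag: int) -> int:
    """Return the pointer with its top byte replaced by the tag."""
    return (pointer & ADDRESS_MASK) | ((tag & 0xFF) << TAG_SHIFT)


def get_tag(pointer: int) -> int:
    """Extract the tag stored in the pointer's top byte."""
    return (pointer >> TAG_SHIFT) & 0xFF


def strip_tag(pointer: int) -> int:
    """Return the address with the top byte cleared."""
    return pointer & ADDRESS_MASK


def tags_match(pointer: int, memory_tag: int) -> bool:
    """Model of the check: the pointer tag must equal the memory's tag."""
    return get_tag(pointer) == (memory_tag & 0xFF)


if __name__ == "__main__":
    tagged = set_tag(0x00007FFF12345670, 0xAB)
    print(hex(tagged), hex(get_tag(tagged)), hex(strip_tag(tagged)))
    print(tags_match(tagged, 0xAB), tags_match(tagged, 0xCD))
```

With TBI the hardware ignores the top byte on loads and stores, so the 
tagged pointer can be dereferenced directly and the tag comparison can be 
done entirely in software.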

3) This enables using hwasan in the kernel.

While instrumented user-space applications will be using the 
"interceptor ABI" and hence are likely not production-quality, the 
biggest aim of implementing hwasan in GCC is to allow building the Linux 
kernel with tag-based sanitization using GCC.

Instrumented kernel code uses hooks in the kernel itself, so this ABI 
distinction is no longer relevant, and this sanitizer should produce a 
production-quality kernel binary.

I'm hoping I can find a maintainer willing to review and ACK this patch 
series -- especially with stage3 coming to a close soon.  If there's 
anything else I could do to help get a willing reviewer up to speed then 
please just ask.

Cheers,
Matthew

On 07/01/2020 15:14, Martin Liška wrote:
> On 12/12/19 4:18 PM, Matthew Malcomson wrote:
> 
> Hello.
> 
> I've just sent a few comments that are related to the v3 of the patch set.
> Based on my (limited) HWASAN knowledge the patch seems reasonable to me.
> I haven't looked much at the newly introduced RTL hooks,
> but these seem to me to be isolated to the aarch64 port.
> 
> I can also verify that the patchset works on my aarch64 linux machine and
> hwasan.exp and asan.exp tests succeed.
> 
>> I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory tagging,
>> since I'm not sure the way I found to implement this would be acceptable.  The
>> inlined patch below works but it requires a special declaration instead of just
>> an ~#include~.
> 
> Knowing that, I would not bother with the printing of HWASAN_MARK.
> 
> Thanks for the series,
> Martin