This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: Improving Asan code on ARM targets

From: Andrew Pinski <pinskia at gmail dot com>
To: Yury Gribov <y dot gribov at samsung dot com>
Cc: gcc <gcc at gcc dot gnu dot org>, Konstantin Serebryany <kcc at gcc dot gnu dot org>, Jakub Jelinek <jakub at redhat dot com>, Richard Earnshaw <rearnsha at arm dot com>, Viacheslav Garbuzov <v dot garbuzov at samsung dot com>
Date: Mon, 28 Apr 2014 23:04:34 -0700
Subject: Re: Improving Asan code on ARM targets
Authentication-results: sourceware.org; auth=none
References: <535F3DC0 dot 4070207 at samsung dot com>

On Mon, Apr 28, 2014 at 10:50 PM, Yury Gribov <y.gribov@samsung.com> wrote:
> Hi all,
>
> I've recently noticed that GCC generates suboptimal code for Asan on ARM
> targets. E.g. for a 4-byte memory access check
>
>     (shadow_val != 0) & (last_byte >= shadow_val)
>
> we get the following sequence:
>
>     mov    r2, r0, lsr #3
>     and    r3, r0, #7
>     add    r3, r3, #3
>     add    r2, r2, #536870912
>     ldrb    r2, [r2]    @ zero_extendqisi2
>     sxtb    r2, r2
>     cmp    r3, r2
>     movlt    r3, #0
>     movge    r3, #1
>     cmp    r2, #0
>     moveq    r3, #0
>     cmp    r3, #0
>     bne    .L5
>     ldr    r0, [r0]
>
> Obviously a shorter code is possible:
>
>     mov    r3, r0, lsr #3
>     and    r1, r0, #7
>     add    r1, r1, #4
>     add    r3, r3, #536870912
>     ldrb    r3, [r3]    @ zero_extendqisi2
>     sxtb    r3, r3
>     cmp    r3, #0
>     cmpne    r1, r3
>     bgt    .L5
>     ldr    r0, [r0]

Does the patch series at located at:
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01407.html
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01405.html
Fix this code generation issue?  I suspect it does and improves more
than just the above code.

Thanks,
Andrew Pinski

>
> A 30% improvement looked quite important given that Asan usually increases
> code-size by 1.5-2x so I decided to investigate this. It turned out that ARM
> backend already has full support for dominated comparisons (cmp-cmpne-bgt
> sequence above) and can generate efficient code if we provide it with a
> slightly more explicit gimple sequence:
>
>     (shadow_val != 0) & (last_byte + 1 > shadow_val)
>
> Ideally backend should be able perform this transform itself. But I'm not
> sure this is possible: it needs to know that last_range + 1 can not overflow
> and this info is not available in RTL (because we don't have VRP pass
> there).
>
> I have attached a simple patch which changes Asan pass to generate the
> ARM-friendly code. I've only bootstrapped/regtested on x64 but I can perform
> additional tests on ARM if the patch make sense. As far as I can tell it
> does not worsen sanitized code on other platforms (x86/x64) while
> significantly improving ARM (15% less code for bzip).
>
> The patch is certainly not ideal:
> * it makes target-specific changes in machine-independent code
> * it does not help with 1-byte accesses (forwprop pass thinks that it's
> always beneficial to convert x + 1 > y to x >= y so it reverts my change)
> * it only improves Asan code whereas it would be great if ARM backend could
> improve generic RTL code
> but it achieves significant improvement on ARM without hurting other
> platforms.
>
> So my questions are:
> * is this kind of target-specific tweaking acceptable in middle-end?
> * if not - what would be a better option?
>
> -Y

Follow-Ups:
- Re: Improving Asan code on ARM targets
  - From: Konstantin Serebryany

References:
- Improving Asan code on ARM targets
  - From: Yury Gribov

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]