This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Improving Asan code on ARM targets


Hi all,

I've recently noticed that GCC generates suboptimal code for Asan on ARM targets. E.g. for a 4-byte memory access check

    (shadow_val != 0) & (last_byte >= shadow_val)

we get the following sequence:

    mov    r2, r0, lsr #3
    and    r3, r0, #7
    add    r3, r3, #3
    add    r2, r2, #536870912
    ldrb    r2, [r2]    @ zero_extendqisi2
    sxtb    r2, r2
    cmp    r3, r2
    movlt    r3, #0
    movge    r3, #1
    cmp    r2, #0
    moveq    r3, #0
    cmp    r3, #0
    bne    .L5
    ldr    r0, [r0]

Obviously a shorter code is possible:

    mov    r3, r0, lsr #3
    and    r1, r0, #7
    add    r1, r1, #4
    add    r3, r3, #536870912
    ldrb    r3, [r3]    @ zero_extendqisi2
    sxtb    r3, r3
    cmp    r3, #0
    cmpne    r1, r3
    bgt    .L5
    ldr    r0, [r0]

A 30% improvement looked quite important given that Asan usually increases code-size by 1.5-2x so I decided to investigate this. It turned out that ARM backend already has full support for dominated comparisons (cmp-cmpne-bgt sequence above) and can generate efficient code if we provide it with a slightly more explicit gimple sequence:

    (shadow_val != 0) & (last_byte + 1 > shadow_val)

Ideally backend should be able perform this transform itself. But I'm not sure this is possible: it needs to know that last_range + 1 can not overflow and this info is not available in RTL (because we don't have VRP pass there).

I have attached a simple patch which changes Asan pass to generate the ARM-friendly code. I've only bootstrapped/regtested on x64 but I can perform additional tests on ARM if the patch make sense. As far as I can tell it does not worsen sanitized code on other platforms (x86/x64) while significantly improving ARM (15% less code for bzip).

The patch is certainly not ideal:
* it makes target-specific changes in machine-independent code
* it does not help with 1-byte accesses (forwprop pass thinks that it's always beneficial to convert x + 1 > y to x >= y so it reverts my change) * it only improves Asan code whereas it would be great if ARM backend could improve generic RTL code but it achieves significant improvement on ARM without hurting other platforms.

So my questions are:
* is this kind of target-specific tweaking acceptable in middle-end?
* if not - what would be a better option?

-Y
2014-04-29  Yury Gribov  <y.gribov@samsung.com>

	* asan.c (build_check_stmt): Change generated code to improve
	code generated for ARM.

diff --git a/gcc/asan.c b/gcc/asan.c
index d7c282e..f00705a 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1543,18 +1543,17 @@ build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
     {
       /* Slow path for 1, 2 and 4 byte accesses.
 	 Test (shadow != 0)
-	      & ((base_addr & 7) + (size_in_bytes - 1)) >= shadow).  */
+	      & ((base_addr & 7) + size_in_bytes) > shadow).  */
       gimple_seq seq = NULL;
       gimple shadow_test = build_assign (NE_EXPR, shadow, 0);
       gimple_seq_add_stmt (&seq, shadow_test);
       gimple_seq_add_stmt (&seq, build_assign (BIT_AND_EXPR, base_addr, 7));
       gimple_seq_add_stmt (&seq, build_type_cast (shadow_type,
                                                   gimple_seq_last (seq)));
-      if (size_in_bytes > 1)
-        gimple_seq_add_stmt (&seq,
-                             build_assign (PLUS_EXPR, gimple_seq_last (seq),
-                                           size_in_bytes - 1));
-      gimple_seq_add_stmt (&seq, build_assign (GE_EXPR, gimple_seq_last (seq),
+      gimple_seq_add_stmt (&seq,
+                           build_assign (PLUS_EXPR, gimple_seq_last (seq),
+                                         size_in_bytes));
+      gimple_seq_add_stmt (&seq, build_assign (GT_EXPR, gimple_seq_last (seq),
                                                shadow));
       gimple_seq_add_stmt (&seq, build_assign (BIT_AND_EXPR, shadow_test,
                                                gimple_seq_last (seq)));

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]