This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Support asan-instrumentation-with-call-threshold in GCC (second try)


On Mon, Jun 09, 2014 at 07:48:46PM +0400, Yury Gribov wrote:
> >>Build_check_stmt is now unified for const/variable lengths
> >>but I'd still prefer to treat the use_calls case specially
> >>because calling single __asan_loadN is more space-efficient
> >>than emitting two separate calls with a length check.
> >
> >That is not what I meant.
> >1) instrument_mem_region_access/instrument_strlen should use a single
> >   build_check_stmt call instead of two that they do now,
> >   both for use_calls case and !use_calls case, the difference
> >   is just that instrument_mem_region_access needs to guard
> >   it against len == 0, while for use_calls it doesn't have to,
> >   it seems __asan_{load,store}N handles length of 0 correctly
> >   (still, the caller shouldn't update_mem_ref_hash_table
> >   if length might be zero, even in the use_calls case)
> >2) passing always a tree length to build_check_stmt and then
> >   recreating size_in_bytes out of it is ugly, just add a new
> >   tree parameter, if it is non-NULL, the length is variable,
> >   and real_size_in_bytes should be 1, and the code additionally
> >   has to compute length - 1 at runtime and add the result,
> >   otherwise it just keeps adding precomputed constant
> 
> Let's check whether I'm on the right track. I've attached the current
> version. It may have some rough edges, but at least all regtests pass on x64.

The plan (which we already had for 4.9, but didn't get to yet) is in the end
not to lower the checks that much in the asan pass, but to lower them later
in the sanopt pass after performing some inter-bb optimizations.
The plan has been to use internal builtins, but the __builtin___asan_loadX
etc. perhaps could serve the same purpose, with the only problem being that
__asan_loadN handles zero, while the inline expansion thereof doesn't unless
it explicitly checks for a zero value.  But I don't want to hold your patch
up for that.  Once we switch to that model, we'd simply "inline" the __asan_*
builtins in the sanopt pass until we hit the limit.

The reason for the plan is that it will be easier in the sanopt pass to
try to remove redundant instrumentation, e.g. when dominator bb already
checks a particular address/length and no calls that could change the
validity of the checks happen between the dominator bb and the dominated
__asan_load/storeX.

> Some obvious problems:
> 
> 1) build_check_stmt argument list has become really long and messy;
> (this could be simplified to some extent if we get rid of size_in_bytes and
> non_zero_len_p and recompute them from len instead,
> but all the other parameters seem really unavoidable)

I think that is fine.

> 2) we can't easily check number of instrumentations in
> no-redundant-instrumentation-*.c family of tests
> because memory region accesses now have single error handler
> (__asan_report_load_n/store_n);
> the best solution I came up with is counting all 'X & 7' patterns.

Ok.

> 3) strlen instrumentation is still using two separate calls to
> build_check_stmt (due to reasons I mentioned above).

If for strlen we check before the call the first byte and after the
call the last byte, then indeed we want two calls.  Or we could check
everything after the call and use one call after it.

> +static void
> +build_check_stmt_with_calls (location_t loc, tree base, tree len,
> +                  HOST_WIDE_INT size_in_bytes, gimple_stmt_iterator *iter,
> +                  bool before_p, bool is_store, bool is_scalar_access)

Whitespace.  Please use tabs instead of spaces, and align the lines below
location_t.

> @@ -1501,111 +1608,198 @@ build_shadow_mem_access (gimple_stmt_iterator *gsi, location_t location,
>     SSA_NAME, or a non-SSA expression.  LOCATION is the source code
>     location.  IS_STORE is TRUE for a store, FALSE for a load.
>     BEFORE_P is TRUE for inserting the instrumentation code before
> -   ITER, FALSE for inserting it after ITER.
> +   ITER, FALSE for inserting it after ITER. IS_SCALAR_ACCESS is TRUE
> +   for a scalar memory access and FALSE for memory region access.
> +   NON_ZERO_P is TRUE if memory region is guaranteed to have non-zero
> +   length. ALIGN tells alignment of accessed memory object.
> +
> +   START_INSTRUMENTED and END_INSTRUMENTED are TRUE if start/end of
> +   memory region have already been instrumented.
>  
>     If BEFORE_P is TRUE, *ITER is arranged to still point to the
>     statement it was pointing to prior to calling this function,
>     otherwise, it points to the statement logically following it.  */
>  
>  static void
> -build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
> -		  bool before_p, bool is_store, HOST_WIDE_INT size_in_bytes,
> -		  bool slow_p = false)
> +build_check_stmt (location_t location, tree base, tree len,
> +                  HOST_WIDE_INT size_in_bytes, gimple_stmt_iterator *iter,
> +                  bool non_zero_len_p, bool before_p, bool is_store,
> +                  bool is_scalar_access, unsigned int align = 0,
> +                  bool start_instrumented = false,
> +                  bool end_instrumented = false)

Whitespace again.  Also, whether it is one call or two minor checks in one
build_check_stmt call, we shouldn't record both the start and end in the hash
table, just the start; plus for NULL len also size_in_bytes, while for
non-NULL len probably nothing, as we don't know that len isn't zero.

> +
> +  if (len)
> +    {
> +      len = unshare_expr (len);
> +    }

Formatting.  Single-line blocks shouldn't have {}s around them (this occurs
several times).

	Jakub

