This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope


On Fri, May 06, 2016 at 01:04:30PM +0200, Martin LiÅka wrote:
> I've started working on the patch couple of month go, basically after
> a brief discussion with Jakub on IRC.
> 
> I'm sending the initial version which can successfully run instrumented
> tramp3d, postgresql server and Inkscape. It catches the basic set of
> examples which are added in following patch.
> 
> The implementation is quite straightforward as works in following steps:
> 
> 1) Every local variable stack slot is poisoned at the very beginning of a function (RTL emission)
> 2) In gimplifier, once we spot a DECL_EXPR, a variable is unpoisoned (by emitting ASAN_MARK builtin)
> and the variable is marked as addressable

Not all vars have DECL_EXPRs though.

> 3) Similarly, BIND_EXPR is the place where we poison the variable (scope exit)
> 4) At the very end of the function, we clean up the poisoned memory
> 5) The builtins are expanded to call to libsanitizer run-time library (__asan_poison_stack_memory, __asan_unpoison_stack_memory)
> 6) As the use-after-scope stuff is already included in libsanitizer, no change is needed for the library

> As mentioned, it's request for comment as it still has couple of limitations:
> a) VLA are not supported, which should make sense as we are unable to allocate a stack slot for that
> b) we can possibly strip some instrumentation in situations where a variable is introduced in a very first BB (RTL poisoning is superfluous).
> Similarly for a very last BB of a function, we can strip end of scope poisoning (and RTL unpoisoning). I'll do that incrementally.

Yeah.

> c) We require -fstack-reuse=none option, maybe it worth to warn a user if -fsanitize=use-after-scope is provided without the option?

This should be implicitly set by -fsanitize=use-after-scope.  Only if some
other -fstack-reuse= option is explicitly set together with
-fsanitize=use-after-scope, we should warn and reset it anyway.

> d) An instrumented binary is quite slow (~20x for tramp3d) as every function call produces many memory read/writes. I'm wondering whether
> we should provide a faster alternative (like instrument just variables that have address taken) ?

I don't see any point in instrumenting !needs_to_live_in_memory vars,
at least not those that don't need to live in memory at gimplification time.
How could one use those after scope?  They can't be accessed by
dereferencing some pointer, and the symbol itself should be unavailable for
symbol lookup after the symbol goes out of scope.
Plus obviously ~20x slowdown isn't acceptable.

Another thing is what to do with variables that are addressable at
gimplification time, but generally are made non-addressable afterwards,
such as due to optimizing away the taking of their address, inlining, etc.

Perhaps depending on how big slowdown you get after just instrumenting
needs_to_live_in_memory vars from ~ gimplification time and/or with the
possible inlining of the poisoning/unpoisoning (again, should be another
knob), at least with small sized vars, there might be another knob,
which would tell if vars that are made non-addressable during optimizations
later on should be instrumented or not.
E.g. if you ASAN_MARK all needs_to_live_in_memory vars early, you could
during the addressable determination when the knob says stuff should be
faster, but less precise, ignore the vars that are addressable just because
of the ASAN_MARK calls, and if they'd then turn to be non-addressable,
remove the corresponding ASAN_MARK calls.

> 2016-05-04  Martin Liska  <mliska@suse.cz>
> 
> 	* asan/asan_poisoning.cc: Do not call PoisonShadow in case
> 	of zero size of aligned size.

Generally, libsanitizer changes would need to go through upstream.

> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "varasm.h"
>  #include "stor-layout.h"
>  #include "tree-iterator.h"
> +#include "params.h"
>  #include "asan.h"
>  #include "dojump.h"
>  #include "explow.h"
> @@ -54,7 +55,6 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfgloop.h"
>  #include "gimple-builder.h"
>  #include "ubsan.h"
> -#include "params.h"
>  #include "builtins.h"
>  #include "fnmatch.h"

Why do you need to move params.h around?  Does asan.h now depend on
params.h?

> +  gimplify_seq_add_stmt
> +    (seq_p, gimple_build_call_internal (IFN_ASAN_MARK, 3,
> +					build_int_cst (integer_type_node,
> +						       flags),
> +					base, unit_size));

Formatting, better introduce some temporary variables, like
  gimple *g = gimple_build_call_internal (...);
  gimplify_seq_add_stmt (seq_p, g);

> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -3570,7 +3570,8 @@ vect_recog_mask_conversion_pattern (vec<gimple *> *stmts, tree *type_in,
>  {
>    gimple *last_stmt = stmts->pop ();
>    enum tree_code rhs_code;
> -  tree lhs, rhs1, rhs2, tmp, rhs1_type, rhs2_type, vectype1, vectype2;
> +  tree lhs = NULL_TREE, rhs1, rhs2, tmp, rhs1_type, rhs2_type;
> +  tree vectype1, vectype2;
>    stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
>    stmt_vec_info pattern_stmt_info;
>    vec_info *vinfo = stmt_vinfo->vinfo;

How is the above related to this patch?

> +/* Return TRUE if we should instrument for use-after-scope sanity checking.  */                                                                   
> +                                                                                                                                                  
> +static inline bool                                                                                                                                
> +asan_sanitize_use_after_scope (void)                                                                                                              
> +{                                                                                                                                                 
> +  return (flag_sanitize & SANITIZE_ADDRESS_USE_AFTER_SCOPE)                                                                                       
> +    == SANITIZE_ADDRESS_USE_AFTER_SCOPE                                                                                                           
> +    && flag_stack_reuse == SR_NONE                                                                                                                
> +    && ASAN_STACK;                                                                                                                                
> +}                                                                                                                                                 

Formatting, there should be ()s around the whole return expression as it
spans multiple lines, and it should be indented properly.
Plus IMHO flag_stack_reuse should be dealt with during option handling.

	Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]