This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/2] [asan] Avoid instrumenting duplicated memory access in the same basic block


On Mon, Jan 28, 2013 at 10:32:43PM +0100, Dodji Seketeli wrote:

Thanks for working on this.

> @@ -212,6 +213,159 @@ alias_set_type asan_shadow_set = -1;
>     alias set is used for all shadow memory accesses.  */
>  static GTY(()) tree shadow_ptr_types[2];
>  
> +/* Hashtable support for memory references used by gimple
> +   statements.  */
> +
> +/* This type represents a reference to a memory region.  */
> +struct __attribute__ ((visibility ("hidden"))) mem_ref

This is wrong, you aren't checking whether visibility or even hidden
visibility is supposed by the compiler.

The C++ way I think would be to use anonymous namespace,
but I don't see anything like that used in gcc/*.c yet, so
perhaps just name it asan_mem_ref and be done with that.

> +{
> +  /* The expression of the begining of the memory region.  */
> +  tree start;
> +  /* The expression representing the length of the region.  */
> +  tree len;

Do you really need to store len as tree?  Wouldn't it be better
just to store the access size in bytes as integer (1, 2, 4, 8 or 16
bytes) into say unsigned char; variable?

> +struct __attribute__((visibility ("hidden"))) mem_ref_hasher
> +  : typed_delete_remove <mem_ref>

Likewise.

> +static hash_table <mem_ref_hasher>&

Not sure about agreed formatting for references, but grep '>&' doesn't
show anything, while '> &' shows some hits.

> +get_mem_ref_hash_table ()
> +{
> +    static hash_table <mem_ref_hasher> ht;
> +
> +    if (!ht.is_created ())
> +      ht.create (10);
> +
> +    return ht;

The above is all indented too much.

> +  hash_table <mem_ref_hasher> ht = get_mem_ref_hash_table ();
> +  mem_ref **slot = ht.find_slot (&ref, INSERT);
> +  gcc_assert (*slot == NULL);
> +  *slot = new mem_ref (ref);

Wouldn't it be better to use pool_alloc/pool_free here instead of
new/delete?

>      case BUILT_IN_STRLEN:
> -      return instrument_strlen_call (iter);
> +      source0 = gimple_call_arg (call, 0);

Reminds me that we should replace all uses of is_gimple_builtin_call
in asan.c/tsan.c with gimple_call_builtin_p (stmt, BUILT_IN_NORMAL),
and probably nuke is_gimple_builtin_call, otherwise no verification
whether K&R unprototyped size_t strlen (); etc. aren't used
say with x = strlen (); .  Right now gimple_call_arg here is unsafe,
we haven't checked, whether it has at least one argument.  But if we
ignore calls which don't match the builtin prototype, we'd be safe.

> +static int
> +test1 ()
> +{
> +  /*__builtin___asan_report_store1 called 1 time here to instrument
> +    the initialization.  */
> +  char foo[4] = {1}; 
> +
> +  /*__builtin___asan_report_store1 called 2 times here to instrument
> +    the store to the memory region of tab.  */
> +  __builtin_memset (tab, 3, sizeof (tab));
> +
> +  /* There is no instrumentation for the two memset calls below.  */
> +  __builtin_memset (tab, 4, sizeof (tab));
> +  __builtin_memset (tab, 5, sizeof (tab));

If you don't instrument the above two, it shows something bad.
I'd say for calls which test two bytes (first and last) you actually
should enter into the hash table two records, one that &tab[0] with size 1
byte has been instrumented, another one that &tab[3] (resp. 4, resp. 5)
with size 1 byte has been instrumented, and e.g. if the first access has
been instrumented already, but second access hasn't (or vice versa), just
emit instrumentation for the one which is missing.

For future improvements (but 4.9 material likely, after we lower
each instrumentation just to a single builtin first), different access
sizes and/or different is_store should just mean we (at least in the same bb
and with no intervening calls that could never return) could upgrade the
the previous instrumentation to cover a wider access size, or load into
store (or just ignore loads vs. stores?).

> +/* 

I wouldn't start the comment on different line.

> +   { dg-final { scan-tree-dump-times "__builtin___asan_report_store1" 7 "instrumented stores"}  }

Space between " and }, only one space between }  }.

> +
> +   { dg-final { scan-tree-dump-times "__builtin___asan_report_load" 4 "instrumented loads"}  }

and just close */ at the end of the above line (or use /* */ on each line
separately.

	Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]