This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: Sparc ASAN


On Tue, Nov 20, 2012 at 11:19:33PM -0500, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Tue, 20 Nov 2012 21:20:40 -0500 (EST)
> 
> > Those seem to be the only problems that need to be resolved for this
> > feature to be fully working.
> 
> FWIW, here are the changes I am using which, besides the sparc backend
> bits, has some temporary workarounds for the issues I brought up.

To explain these unaligned stores a little:
From what I understand, LLVM always puts all the ASAN protected
automatic vars into a single block and aligns the whole block to 32 bytes
(on i?86/x86_64 in GCC that is the backend dynamic stack realignment,
those andq $-32, %rsp etc. in the prologue; on other targets rth added a
release or two ago handling of these via alloca, where the alloca
allocates up to alignment - 1 extra bytes and we add alignment - 1
to the result of the alloca and mask it).
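
That masking idea, as a hedged C sketch (made-up variable names, and it
assumes <stdint.h>'s uintptr_t; this is not the RTL GCC actually emits):

  /* Over-allocate by align - 1 bytes, then round the result up to the
     requested (power-of-two) alignment.  */
  void *raw = __builtin_alloca (block_size + align - 1);
  char *base = (char *) (((uintptr_t) raw + align - 1)
                         & ~((uintptr_t) align - 1));
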
But it seemed nothing in libasan actually relies on the block being
32-byte aligned, so what I implemented keeps the vars as aligned as they
were before (well, the base), and just puts the padding in between as if
the base were 32-byte aligned and I wanted to align them all to 32 bytes
(ignoring more aligned vars etc. in this description).
That of course means the shadow memory stores may be unaligned, but as
i?86/x86_64 were the only supported targets initially, and unaligned
stores are IMHO pretty fast there, it isn't a big deal.
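
To make the misalignment concrete (a worked example with made-up
addresses and the shadow offset elided; ASAN_SHADOW_SHIFT is 3 and the
red zone shadow stores are 4 bytes wide):

  shadow addr = shadow offset + (stack addr >> ASAN_SHADOW_SHIFT)

  0x7fff0020 (32-byte aligned base) >> 3 = 0x0fffe004  (4-byte aligned)
  0x7fff0028 ( 8-byte aligned base) >> 3 = 0x0fffe005  (byte aligned only)

So the SImode shadow stores are only guaranteed to be aligned when the
stack block itself is 32-byte aligned.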

Now, for strict alignment targets, the question is what the fastest way
to implement this is.  On one side there is the cost of dynamically
realigning the stack var block (which isn't just about the alloca at the
beginning, but also pretty much having one extra register reserved for
the result of the alloca, used as the base to access all the local vars;
sure, you can spill it, but if it is used frequently it will likely be
assigned to a hard reg for most of its lifetime), and on the other side
there is the cost of accessing the shadow memory at smaller granularity.

If some target has the virtual-stack-vars reg not even 8-byte aligned,
some realignment is required, as the shadow shift is 3.
If it is just 8-byte aligned and we don't realign the block, then yes,
the shadow memory needs to be accessed byte by byte; if it is 16-byte
aligned, perhaps we could use 16-bit stores instead of just 8-bit ones.
And perhaps the decision (for strict alignment arches) whether or not to
realign the whole block would best be made dynamically: for a small
number of asan protected vars (i.e. a small number of expected shadow
memory stores) we could just not realign and use 8-bit or 16-bit shadow
memory stores, while for a larger number of vars it might be cheaper to
realign.

So, concerning your patch, which basically implements the never-realign
approach and just sometimes uses a smaller size for the shadow memory
accesses: the decision shouldn't be based solely on STRICT_ALIGNMENT,
but needs to take into account the alignment of virtual_stack_vars_rtx
in the current function.  If its byte alignment is
>= (4 << ASAN_SHADOW_SHIFT), then you can use the same code as for
!STRICT_ALIGNMENT; if its byte alignment is >= (2 << ASAN_SHADOW_SHIFT),
then you can use HImode stores; otherwise QImode.
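
I.e. something along these lines (just a sketch of the condition, with a
hypothetical base_align_bytes variable holding the byte alignment of
virtual_stack_vars_rtx in the current function; not actual committed
code):

  /* Pick the widest shadow store mode the base alignment allows.  */
  enum machine_mode shadow_mode;
  if (!STRICT_ALIGNMENT
      || base_align_bytes >= (4 << ASAN_SHADOW_SHIFT))
    shadow_mode = SImode;   /* 4-byte shadow stores, as on x86.  */
  else if (base_align_bytes >= (2 << ASAN_SHADOW_SHIFT))
    shadow_mode = HImode;   /* 2-byte shadow stores.  */
  else
    shadow_mode = QImode;   /* byte stores.  */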

> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -321,7 +321,10 @@ asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
>  			      NULL_RTX, 1, OPTAB_DIRECT);
>    gcc_assert (asan_shadow_set != -1
>  	      && (ASAN_RED_ZONE_SIZE >> ASAN_SHADOW_SHIFT) == 4);
> -  shadow_mem = gen_rtx_MEM (SImode, shadow_base);
> +  if (STRICT_ALIGNMENT)
> +    shadow_mem = gen_rtx_MEM (QImode, shadow_base);
> +  else
> +    shadow_mem = gen_rtx_MEM (SImode, shadow_base);
>    set_mem_alias_set (shadow_mem, asan_shadow_set);
>    prev_offset = base_offset;
>    for (l = length; l; l -= 2)
> @@ -349,7 +352,20 @@ asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
>  	      }
>  	    else
>  	      shadow_bytes[i] = ASAN_STACK_MAGIC_PARTIAL;
> -	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
> +	  if (STRICT_ALIGNMENT)
> +	    {
> +	      for (i = 0; i < 4; i++)
> +		{
> +		  rtx mem = adjust_address (shadow_mem, VOIDmode, i);
> +		  rtx val;
> +
> +		  val = GEN_INT (trunc_int_for_mode (shadow_bytes[i],
> +						     QImode));
> +		  emit_move_insn (mem, val);
> +		}
> +	    }
> +	  else
> +	    emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
>  	  offset = aoff;
>  	}
>        while (offset <= offsets[l - 2] - ASAN_RED_ZONE_SIZE)
> @@ -359,7 +375,22 @@ asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
>  				       >> ASAN_SHADOW_SHIFT);
>  	  prev_offset = offset;
>  	  memset (shadow_bytes, cur_shadow_byte, 4);
> -	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
> +	  if (STRICT_ALIGNMENT)
> +	    {
> +	      int i;
> +
> +	      for (i = 0; i < 4; i++)
> +		{
> +		  rtx mem = adjust_address (shadow_mem, VOIDmode, i);
> +		  rtx val;
> +
> +		  val = GEN_INT (trunc_int_for_mode (shadow_bytes[i],
> +						     QImode));
> +		  emit_move_insn (mem, val);
> +		}
> +	    }
> +	  else
> +	    emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
>  	  offset += ASAN_RED_ZONE_SIZE;
>  	}
>        cur_shadow_byte = ASAN_STACK_MAGIC_MIDDLE;
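
With a shadow_mode chosen like that, the two QImode loops above could
perhaps be generalized into something like this (again just a sketch,
not actual committed code), which groups the 4 shadow bytes into
shadow_mode-sized constants:

  int i, j, sz = GET_MODE_SIZE (shadow_mode);
  for (i = 0; i < 4; i += sz)
    {
      rtx mem = adjust_address (shadow_mem, shadow_mode, i);
      unsigned HOST_WIDE_INT val = 0;

      /* Combine SZ shadow bytes into one constant, honoring the
         target's byte order.  */
      for (j = 0; j < sz; j++)
        if (BYTES_BIG_ENDIAN)
          val = (val << BITS_PER_UNIT) | shadow_bytes[i + j];
        else
          val |= (unsigned HOST_WIDE_INT) shadow_bytes[i + j]
                 << (j * BITS_PER_UNIT);
      emit_move_insn (mem, gen_int_mode (val, shadow_mode));
    }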

	Jakub

