This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFA][PATCH] Stack clash protection 07/08 -- V4 (aarch64 bits)


On 11/27/2017 11:44 AM, James Greenhalgh wrote:
> On Wed, Nov 22, 2017 at 06:28:24PM +0000, Jeff Law wrote:
>> On 11/21/2017 04:57 AM, James Greenhalgh wrote:
>>> I've finally built up enough courage to start getting my head around this...
>> Can't blame you for avoiding :-) This stuff isn't my idea of fun either.
> 
> Right, here's where I'm up to... I think that I have a reasonable grasp of
> the overall intentions for this patch and what we're trying to do.
Good.  I realize it's somewhat painful.  Wilco's feedback WRT what is
and what is not possible in aarch64 prologues has been invaluable in
simplifying things.  But in the end it's just painful.



> 
> We want to put out regular probes (writes of zero) to the stack, at a regular
> interval (given by PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL ), starting
> at a known offset from the stack
> base ( PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE ). There's some subtetly
> around when and where we need probes, what the efficient sequences to
> achieve that would be, etc. and handling that complexity, and the trouble
> of having two possible stack adjustments to protect (initial adjustment and
> final adjustment), is where most of the body of the patch comes from.
Right.

> 
> So, as far as I can read there are a couple of design points that I need
> clarified. If we get through these it might be possible for Wilco to pick
> up the patch and optimise the cases he's marked as a concern earlier.
> 
> Let me try to expand on these...
> 
> One point of contention in the review so far has been whether the default
> values for PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL and
> PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE are correct for AArch64. As this
> has some implications for the algorithm we would like to use for probing,
> where exactly we'd like to emit the probes, and might need a glibc patch
> it will be helpful to narrow this down to an answer.
So the guard size came out of a discussion with Richard Earnshaw.  I
think there's a general agreement on that.  It was explicitly decided to
punt ILP32 on aarch64 and not worry about it.  So that's really the one
thing that could throw a wrench into the guard size.


> 
> I see two positions.
> 
> Wilco would like both parameters set to 16 - that means (ignoring the
> caller-protected stuff for now) a guard page size of 64k, and a probe
> every 64k.
> 
> Jeff's current patch has a 64k guard page size, and a probe every 4k.
> 
> What concerns me is this from Jeff from couple of emails back:
> 
>> Given the ISA I wouldn't expect an interval > 12 to be useful or necessarily
>> even work correctly.
> 
> There's a clear difference of opinion here!
So the issue is that once you get an interval > 12 you're unable to
adjust the stack pointer in a single instruction without using a scratch
register.

*Most* of the code tries to handle this.  The notable exception is
aarch64_output_probe_stack_range which is used by both stack clash
protection and Ada style -fstack-check.

For stack clash prologues we get there via

aarch64_expand_prologue -> aarch64_allocate_and_probe_stack_space

Which implies that we have to have another scratch register in the
prologue code.  We already use IP0_REGNUM and IP1_REGNUM depending on
context.  So neither of those is available.  I'm not familiar enough
with aarch64 to know if there's another register we can safely use.

Without another register we can safely use, probing at intervals > 4k is
a non-starter AFAICT.

> I'd guess the difference of opinion is whether you need to probe every page
> (4k) or whether you need to probe on each multiple of the minimum guard page
> size (64k), but maybe it is just a difficulty in juggling registers? For
> Wilco's position to work, we'd need to be happy emitting a probe once for
> every 64k - from the sounds of the other comments on this thread, there's no
> problem with that - we could have a minimum size of 64k for
> PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL (though the current code would
> probably need to change to accomodate that). Clearly we can reduce the
> number of probes required to 1/16th if this were feasible.
It's primarily a matter of getting a regsiter.

> 
> So, first design question - other than potential troubles grabbing scratch
> registers (we'll revisit this), is there a reason that would stop us deciding
> to default to 64k for both parameters on AArch64?
Florian might have comments here.


> 
> Where this gets messy is handling the stack write. The current design 
> will drop the stack pointer, then emit a probe back near the top where we
> were previosuly sitting. i.e.
> 
>   sp = sp - 4096
>   *(sp + 4088) = 0
> 
> For larger probe intervals, we end up juggling bigger numbers, which makes
> the calculation slightly more unwiedly.
Right.

> 
> One suggestion from Wilco would be to change this sequence to:
> 
>   sp = sp - 4096
>   *(sp) = 0
> 
> In other words, drop the stack and then emit a probe where we are now
> sitting. That would save us juggling the large number for the load offset.
I think this runs afoul of valgrind.  Being valgrind-safe is IMHO a
design requirement.  I've already been chastised for hitting the red
zone too many times :-)  I'm told there is no red zone in the aarch64 ABI.

Jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]