This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: RFC: stack/heap collision vulnerability and mitigation with GCC


On 19/06/17 18:07, Jeff Law wrote:
> As some of you are likely aware, Qualys has just published fairly
> detailed information on using stack/heap clashes as an attack vector.
> Eric B, Michael M -- sorry I couldn't say more when I contacted you about
> -fstack-check and some PPC specific stuff.  This has been under embargo
> for the last month.
> 
> 
> --
> 
> 
> http://www.openwall.com/lists/oss-security/2017/06/19/1
> 
[...]
> aarch64 is significantly worse.  There are no implicit probes we can
> exploit.  Furthermore, the prologue may allocate stack space 3-4 times.
> So we have to track the distance to the most recent probe and when that
> distance grows too large, we have to emit a probe.  Of course we have to
> make worst case assumptions at function entry.
> 

I'm not sure I understand what you're saying here.  According to the
comment above aarch64_expand_prologue, the stack frame looks like:

+-------------------------------+
|                               |
|  incoming stack arguments     |
|                               |
+-------------------------------+
|                               | <-- incoming stack pointer (aligned)
|  callee-allocated save area   |
|  for register varargs         |
|                               |
+-------------------------------+
|  local variables              | <-- frame_pointer_rtx
|                               |
+-------------------------------+
|  padding0                     | \
+-------------------------------+  |
|  callee-saved registers       |  | frame.saved_regs_size
+-------------------------------+  |
|  LR'                          |  |
+-------------------------------+  |
|  FP'                          | / <- hard_frame_pointer_rtx (aligned)
+-------------------------------+
|  dynamic allocation           |
+-------------------------------+
|  padding                      |
+-------------------------------+
|  outgoing stack arguments     | <-- arg_pointer
|                               |
+-------------------------------+
|                               | <-- stack_pointer_rtx (aligned)

Now for the majority of frames the amount of local variables is small
and there is neither dynamic allocation nor any need for outgoing stack
arguments.  In this case the first instruction in the function is

	stp	fp, lr, [sp, #-FrameSize]!

So this instruction allocates all the stack needed and stores the
required registers.  That acts as an implicit probe as far as I can tell.


If the locals area gets slightly larger (>= 512 bytes) then the sequence
becomes
	sub	sp, sp, #FrameSize
	stp	fp, lr, [sp]

But again this acts as a sufficient implicit probe provided that
FrameSize does not exceed the probe interval.
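
To make the two cases above concrete, here is a minimal C sketch of the
decision.  It is not GCC code: the enum, the function name and the
probe_interval parameter are hypothetical, and only the 512-byte cut-off
comes from the description above.

```c
#include <stddef.h>

/* Hypothetical labels for the prologue shapes discussed above.  */
enum prologue_kind {
  PROLOGUE_STP_WRITEBACK,         /* stp fp, lr, [sp, #-FrameSize]!  */
  PROLOGUE_SUB_THEN_STP,          /* sub sp, sp, #FrameSize
                                     stp fp, lr, [sp]  */
  PROLOGUE_NEEDS_EXPLICIT_PROBES  /* frame larger than the probe interval  */
};

/* Classify a frame of FRAME_SIZE bytes.  PROBE_INTERVAL is an assumed
   parameter: the largest distance the stack pointer may move without
   the stack being touched.  */
enum prologue_kind
classify_prologue (size_t frame_size, size_t probe_interval)
{
  /* The store of FP/LR lands at the new stack pointer, so it only
     counts as a sufficient probe if the whole frame fits within one
     probe interval.  */
  if (frame_size > probe_interval)
    return PROLOGUE_NEEDS_EXPLICIT_PROBES;

  /* Small frames: a single writeback store allocates and probes.  */
  if (frame_size < 512)
    return PROLOGUE_STP_WRITEBACK;

  /* Larger frames: allocate first, then store at the new sp.  */
  return PROLOGUE_SUB_THEN_STP;
}
```

With an assumed 4096-byte probe interval, a 64-byte frame takes the
writeback form, a 1024-byte frame takes the sub/stp form, and an
8192-byte frame falls outside what the implicit probe can cover.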

Yes, we need more implicit probes if the local variable space becomes
large, and we need additional probes for checking the outgoing area and
the dynamic area, but again, if those are small (< 512 bytes) we could
replace the existing
	sub	sp, sp, #n
with
	str	xzr, [sp, #-n]!

and thus the explicit probe now becomes the stack allocation operation.
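
The substitution above can be sketched as a small C helper that picks
the instruction text for one allocation.  The function name and the
small_limit parameter are hypothetical; the message uses 512 as the
limit, and the offset range a writeback store can actually encode
should be checked against the architecture manual before relying on it.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not GCC code.  Writes into BUF the instruction
   that allocates N bytes of stack.  Returns 1 if that instruction also
   touches the newly allocated stack (so it doubles as the probe), and
   0 if it only moves the stack pointer.  SMALL_LIMIT is the assumed
   cut-off below which the writeback store is used.  */
int
emit_small_alloc (char *buf, size_t buf_len, unsigned n,
                  unsigned small_limit)
{
  if (n > 0 && n < small_limit)
    {
      /* Store zero with pre-index writeback: allocation and probe
         in a single instruction.  */
      snprintf (buf, buf_len, "str\txzr, [sp, #-%u]!", n);
      return 1;
    }

  /* Otherwise keep the plain adjustment; any probe must be emitted
     separately.  */
  snprintf (buf, buf_len, "sub\tsp, sp, #%u", n);
  return 0;
}
```

A caller would use the return value to decide whether the distance to
the most recent probe can be reset after this allocation.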

