This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: [PATCH][GCC][AArch64] Updated stack-clash implementation supporting 64k probes. [patch (1/7)]


Hi Richard,

> -----Original Message-----
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: Tuesday, September 11, 2018 15:49
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Jeff Law <law@redhat.com>; gcc-patches@gcc.gnu.org; nd
> <nd@arm.com>; James Greenhalgh <James.Greenhalgh@arm.com>;
> Richard Earnshaw <Richard.Earnshaw@arm.com>; Marcus Shawcroft
> <Marcus.Shawcroft@arm.com>
> Subject: Re: [PATCH][GCC][AArch64] Updated stack-clash implementation
> supporting 64k probes. [patch (1/7)]
> 
> Tamar Christina <Tamar.Christina@arm.com> writes:
> > Hi Richard,
> >
> > The 08/28/2018 21:58, Richard Sandiford wrote:
> >> Tamar Christina <Tamar.Christina@arm.com> writes:
> >> > +  HOST_WIDE_INT guard_used_by_caller =
> STACK_CLASH_CALLER_GUARD;
> >> > + /* When doing the final adjustment for the outgoing argument size
> >> > we can't
> >> > + assume that LR was saved at position 0.  So subtract it's offset
> >> > from the
> >> > +     ABI safe buffer so that we don't accidentally allow an adjustment
> that
> >> > +     would result in an allocation larger than the ABI buffer without
> >> > +     probing.  */
> >> > +  HOST_WIDE_INT min_probe_threshold
> >> > +    = final_adjustment_p
> >> > +      ? guard_used_by_caller - cfun->machine-
> >frame.reg_offset[LR_REGNUM]
> >> > +      : guard_size - guard_used_by_caller;
> >> [...]
> >> > +  if (residual)
> >> > +    {
> >> > +      aarch64_sub_sp (temp1, temp2, residual, frame_related_p);
> >> > +      if (residual >= min_probe_threshold)
> >> > +	{
> >> > +	  if (dump_file)
> >> > +	    fprintf (dump_file,
> >> > +		     "Stack clash AArch64 prologue residuals: "
> >> > +		     HOST_WIDE_INT_PRINT_DEC " bytes, probing will be
> required."
> >> > +		     "\n", residual);
> >> > +	  emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
> >> > +					   STACK_CLASH_CALLER_GUARD));
> >>
> >> reg_offsets are nonnegative, so if LR_REGNUM isn't saved at position
> >> 0, min_probe_threshold will be less than STACK_CLASH_CALLER_GUARD.
> >> It looks like the probe would then write above the region.
> >>
> >> Using >= rather than > means that the same thing could happen when
> >> LR_REGNUM is at position 0, if the residual is exactly
> >> STACK_CLASH_CALLER_GUARD.
> >
> > That's true. While addressing this we changed how the residuals are
> probed.
> >
> > To address a comment you raised offline about the saving of LR when
> > calling a no-return function using a tail call and
> > -fomit-frame-pointer, I think this should be safe as the code in
> > frame_layout (line 4131-4136) would ensure that R30 is saved.
> 
> That line number range doesn't seem to match up with current sources.
> But my point was that "X is a non-leaf function" does not directly imply "X
> saves R30_REGNUM".  It might happen to imply that indirectly for all cases at
> the moment, but it's not something that the AArch64 code enforces itself.
> AFAICT the only time we force R30 to be saved explicitly is when emitting a
> chain:
> 
>   if (cfun->machine->frame.emit_frame_chain)
>     {
>       /* FP and LR are placed in the linkage record.  */
>       cfun->machine->frame.reg_offset[R29_REGNUM] = 0;
>       cfun->machine->frame.wb_candidate1 = R29_REGNUM;
>       cfun->machine->frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD;
>       cfun->machine->frame.wb_candidate2 = R30_REGNUM;
>       offset = 2 * UNITS_PER_WORD;
>     }
> 
> Otherwise we only save R30_REGNUM if the df machinery says R30 is live.
> And in principle, it should be correct to use a sibcall pattern that doesn't
> clobber R30 for a noreturn call even in functions that require a frame.  We
> don't do that yet, and it probably doesn't make sense from a QoI perspective
> on AArch64, but I don't think it's invalid in terms of rtl semantics.  There's no
> real reason why a noreturn function has to save the address of its caller
> unless the target forces it to, such as for frame chains above.
> 
> In this patch we're relying on the link between the two concepts for a
> security feature, so I think we should either enforce it explicitly or add an
> assert like:
> 
>   gcc_assert (crtl->is_leaf
> 	      || (cfun->machine->frame.reg_offset[R30_REGNUM]
> 		  != SLOT_NOT_REQUIRED));
> 
> to aarch64_expand_prologue.  (I don't think we should make the assert
> dependent on stack-clash, since the point is that we're assuming this
> happens regardless of stack-clash.)

I agree that the assert would be a good idea, though I'm not sure enabling it always is
a good idea. I'm not sure what other languages that don't use stack-clash-protection and do their
own probing do. Like Ada or Go?

Regards,
Tamar

> 
> > I have added two new tests to check for this, so that if it does
> > change in the future they would fail.
> 
> Thanks.
> 
> Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]