Re: [ARM] Fix PR85434: spill of stack protector's guard address
- From: Richard Sandiford <richard dot sandiford at linaro dot org>
- To: Thomas Preudhomme <thomas dot preudhomme at linaro dot org>
- Cc: gcc-patches at gcc dot gnu dot org, Kyrylo Tkachov <Kyrylo dot Tkachov at arm dot com>, Ramana Radhakrishnan <ramana dot radhakrishnan at arm dot com>, Richard Earnshaw <richard dot earnshaw at arm dot com>
- Date: Sat, 28 Apr 2018 12:57:42 +0100
- Subject: Re: [ARM] Fix PR85434: spill of stack protector's guard address
- References: <CAKnkMGureVC-Wf09yG1x70h+LTfh+oQt_e-6nU8P8xkAi2PYAQ@mail.gmail.com>
Thomas Preudhomme <thomas.preudhomme@linaro.org> writes:
> On Arm (AArch32 and AArch64) the stack protector's guard is accessed by
> loading its address first and then loading its value from that address,
> as part of the stack_protect_set or stack_protect_test insn pattern.
> This creates the risk of the guard's address being spilled between the two.
>
> It is particularly likely on AArch32 when compiling PIC code because:
> - computing the address takes several instructions (first compute the
> GOT base and then the GOT entry by adding an offset), which increases
> the likelihood of CSE;
> - the address computation can be CSEd because the GOT entry computation
> is a MEM of the GOT base + an UNSPEC offset, rather than an UNSPEC
> of a MEM as on AArch64.
>
> This patch addresses both issues by (i) adding scheduler barriers
> around the stack protector code and (ii) making all memory loads
> involved in computing the guard's address volatile. Volatile was chosen
> rather than an unspec so that the patterns for computing the guard's
> address can be the same as those for normal global variable accesses,
> thus reusing more code. Finally, the patch also improves the
> documentation to mention the need to be careful when computing the
> address of the guard.
[...]
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index deab929..c7ced8f 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -6156,6 +6156,10 @@ stack_protect_prologue (void)
> tree guard_decl = targetm.stack_protect_guard ();
> rtx x, y;
>
> + /* Prevent scheduling of instruction(s) between the computation of the
> + guard's address and the setting of the canary, to avoid any spill of the
> + guard's address if it is computed outside the canary-setting insn. */
> + emit_insn (gen_blockage ());
> x = expand_normal (crtl->stack_protect_guard);
> if (guard_decl)
> y = expand_normal (guard_decl);
Scheduling barriers should definitely make spilling less likely,
but I don't think they avoid it completely. For that I think we'd need to:
(a) make sure that gen_stack_protect_set gets passed (mem (symbol_ref)),
which we can probably do by using DECL_RTL directly. (Or failing that,
expand_expr with EXPAND_CONST_ADDRESS should work.)
(b) make the target pattern accept (mem (symbol_ref)) and only legitimise
it during a post-RA split.
The scheduling barriers would then be unnecessary, since the pattern
would be indivisible and self-contained until after RA.
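For (a), something like this in stack_protect_prologue (completely
untested, and assuming guard_decl is a DECL with a valid DECL_RTL;
it would replace the current expand_normal (guard_decl) call):

    /* Use the guard's DECL_RTL directly so that the expander sees a
       (mem (symbol_ref ...)) rather than an address that expand_normal
       has already forced into a register.  */
    rtx y;
    if (guard_decl)
      {
        gcc_assert (DECL_P (guard_decl));
        y = DECL_RTL (guard_decl);
      }
    else
      y = const0_rtx;

That way the whole address computation lives inside the pattern,
so there is nothing separate for the RA to spill.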
On its own that would probably mean changing all backends to accept
the new style of mem. One way of avoiding that would be to use the
maybe_expand_insn interface with targetm.code_for_stack_protect_set
instead of calling targetm.gen_stack_protect_set directly.
Arguably that's better anyway.
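E.g. something like (again untested, reusing the x and y from
stack_protect_prologue):

    struct expand_operand ops[2];
    create_fixed_operand (&ops[0], x);
    create_fixed_operand (&ops[1], y);
    /* Targets whose stack_protect_set pattern accepts the new style
       of mem match here; others simply fail to match and we can fall
       back to the current expansion.  */
    if (!maybe_expand_insn (targetm.code_for_stack_protect_set, 2, ops))
      {
        /* Existing gen_stack_protect_set path as a fallback.  */
      }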
Thanks,
Richard