This is the mail archive of the
mailing list for the GCC project.
Re: [musl] musl, glibc and ideal place for __stack_chk_fail_local
- From: Rich Felker <dalias at libc dot org>
- To: Segher Boessenkool <segher at kernel dot crashing dot org>
- Cc: Sergei Trofimovich <slyfox at gentoo dot org>, musl at lists dot openwall dot com, libc-alpha at sourceware dot org, gcc at gcc dot gnu dot org, toolchain at gentoo dot org
- Date: Thu, 30 Jan 2020 08:37:40 -0500
- Subject: Re: [musl] musl, glibc and ideal place for __stack_chk_fail_local
- References: <20200125105331.7c5d284b@sf> <20200125155424.GZ30412@brightrain.aerifal.cx> <20200130123351.GU22482@gate.crashing.org>
On Thu, Jan 30, 2020 at 06:33:51AM -0600, Segher Boessenkool wrote:
> On Sat, Jan 25, 2020 at 10:54:24AM -0500, Rich Felker wrote:
> > > To support stack smashing protection, gcc emits __stack_chk_fail
> > > calls on all targets. On top of that gcc emits __stack_chk_fail_local
> > > calls at least on i386 and powerpc:
> (Only on 32-bit -fPIC -msecure-plt, for Power).
Right, but musl only supports the secure-plt ABI.
> > There is a half-serious proposal to put it in crti.o which is always
> > linked too, but that seems like an ugly hack to me...
> Not *very* ugly, but it would be very effective, and no real downsides
> to it (or do you see something?)
Well, either the thunk has to be written in asm per-arch, or it takes
some ld -r magic to merge an asm source and a C source (which is
fragile, and something I don't want musl to depend on, since I know
users will someday hit breakage and rightfully blame us for using
ld -r). Or perhaps the existing crti.s content could be moved to
file-scope __asm__ included in the C source file... that might be ok?
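A rough sketch of that last option, assuming a GNU-compatible compiler and an ELF target. One C file carries both a file-scope __asm__ block and the hidden wrapper. The asm here is only a placeholder (the real per-arch crti.s content is not reproduced), and every demo_ name is made up for illustration; this is not musl's actual source.

```c
#include <stdlib.h>

/* Placeholder for the asm half of the file. A real crti.s holds per-arch
 * .init/.fini prologue code, which is not reproduced here; this only emits
 * a marker string so the mechanism is visible. */
__asm__(
    ".pushsection .rodata\n"
    ".globl demo_crti_marker\n"
    "demo_crti_marker:\n"
    ".asciz \"crti placeholder\"\n"
    ".popsection\n"
);

/* C-side declaration of the label defined in the asm above. */
extern const char demo_crti_marker[];

/* Stand-in for the libc's exported handler (hypothetical name). */
__attribute__((__noreturn__))
void demo_stack_chk_fail(void)
{
    abort();
}

/* Hidden visibility keeps this symbol out of the dynamic symbol table, so
 * code linked into the same module can reach it with a direct call rather
 * than through the PLT. */
__attribute__((__visibility__("hidden"), __noreturn__))
void demo_stack_chk_fail_local(void)
{
    demo_stack_chk_fail();
}
```

The point of the sketch is only that the asm and the C wrapper can live in one translation unit, so no per-arch asm thunk and no ld -r step is needed to combine them.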
> > > My understanding of requirements for libc that exposes ssp support:
> > > - __stack_chk_fail is implemented as a default symbol
> > > - __stack_chk_fail_local is implemented as a local symbol to avoid PLT.
> > > (Why is it important? To avoid use of potentially already broken stack?)
> > Because the performance cost of -fstack-protector would go from 1-2% up
> > to 5-10% on i386 and other archs where the PLT contract requires a GOT
> > register, since loading the GOT register is expensive
> > (__x86.get_pc_thunk.* thunk itself is somewhat costly, and you throw
> > away one of only a small number of available registers, increasing
> > register pressure and hurting codegen).
> On Power it is just the setting up itself that is costly (in the config
> where we have this _local thing).
I think it'd be the same. If a function otherwise has no reason to
access global data or make calls through the PLT, it can avoid the cost
of finding the GOT and spending a fixed register on it. But the
possibility of having to call __stack_chk_fail makes *every*
(stack-protected) function need to be able to make calls through the
PLT, and thus introduces this cost to every function.
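To make the shape of the problem concrete, here is a hand-written approximation of what a stack-protected function looks like. It is a sketch, not GCC's actual output: the guard value, the canary placement, and all demo_ names are assumptions for illustration (a real libc randomizes the guard at startup, and the compiler controls where the canary sits in the frame).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed guard value, for illustration only. */
static uintptr_t demo_guard = (uintptr_t)0x5bd1e995u;

/* Hidden failure handler: a -fPIC caller in the same module can use a
 * plain direct call here, with no PLT entry and no GOT register. */
__attribute__((__visibility__("hidden"), __noreturn__))
void demo_fail_local(void)
{
    abort();
}

/* Sums up to 8 ints via a local buffer, with a manual canary check. */
int demo_sum(const int *v, size_t n)
{
    volatile uintptr_t canary = demo_guard;   /* "prologue": store canary */
    int buf[8];
    size_t m = n < 8 ? n : 8;
    int total = 0;

    memcpy(buf, v, m * sizeof buf[0]);
    for (size_t i = 0; i < m; i++)
        total += buf[i];

    if (canary != demo_guard)                 /* "epilogue": verify canary */
        demo_fail_local();                    /* the only call in the body */
    return total;
}
```

Apart from memcpy, which the compiler may expand inline, the failure call is the only outgoing call in demo_sum. If that call had to go through the PLT, the function would need the GOT register set up solely for a path that should never run.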
> > Absolutely not. libssp is unsafe and creates new vulns/attack surface
> > by doing introspective stuff after the process is already *known to
> > be* in a compromised state. It should never be used. musl's
> > __stack_chk_fail is safe and terminates immediately.
> Some implementations even print strings from the stack, it can be worse ;-)
> > Ideally, though, GCC would just emit the termination inline (or at
> > least have an option to do so) rather than calling __stack_chk_fail or
> > the local version. This would additionally harden against the case
> > where the GOT is compromised.
> Yeah, but how to terminate is system-specific, it's much easier to punt
> this job to the libc to do ;-)
My idea was __builtin_trap, although a slightly more hardened version
(that might make users unhappy? :) is inlining a syscall to
sigprocmask to mask SIGILL/SIGSEGV before the trapping instruction, so
that termination occurs regardless of whether there's a signal handler.
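A minimal sketch of the hardened variant, under several assumptions. It calls the libc sigprocmask wrapper rather than an inlined syscall. It also blocks SIGTRAP, since the signal raised by __builtin_trap is target dependent. And it relies on Linux killing the process when a synchronously generated signal is blocked (POSIX leaves that case undefined). The demo_ names are hypothetical, and the fork-based helper exists only to observe the result.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Block the signals a trap may raise, then trap. */
__attribute__((__noreturn__))
static inline void demo_hardened_trap(void)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGILL);
    sigaddset(&set, SIGSEGV);
    sigaddset(&set, SIGTRAP);   /* assumption: some targets trap this way */
    sigprocmask(SIG_BLOCK, &set, 0);
    __builtin_trap();
}

/* A handler that tries to turn the trap into a clean exit. */
static void demo_handler(int sig)
{
    (void)sig;
    _exit(0);
}

/* Runs the trap in a child that has handlers installed.
 * Returns 1 if the child was killed by a signal, 0 if it exited
 * normally (the handler won), -1 on fork/wait failure. */
int demo_child_killed_by_signal(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGILL, demo_handler);
        signal(SIGSEGV, demo_handler);
        signal(SIGTRAP, demo_handler);
        demo_hardened_trap();
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}
```

With the sigprocmask call removed, the child's handler would run and the helper would return 0; with it in place, the handler never gets a chance to run on Linux.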
> Open a GCC PR for this please?
Filed as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93509