GCC Bugzilla – Bug 58245
-fstack-protector[-all] does not protect functions that call noreturn functions
Last modified: 2014-02-17 20:45:57 UTC
This issue is almost identical to bug #23221, but affects functions whose executions end with a call to a noreturn function rather than a tail call. The simplest example is:
When compiled with -fstack-protector-all, the function prologue will read and store the canary, but no check will be made before passing execution to exit.
This is actually a major practical problem for some users of musl libc, because code like the above appears in many configure scripts, and musl libc uses weak symbols so as to avoid initializing stack-protector (and thereby avoid initializing the TLS register) if there is no reference to __stack_chk_fail. Due to this issue, the above code generates thread-pointer-relative (e.g. %fs-based on x86_64) accesses to read the canary, but no reference to __stack_chk_fail, and then crashes when run, leading to spurious configure failures. For the time being, I have informed users who wish to use -fstack-protector-all that they can add -fno-builtin-exit -D__noreturn__= to their CFLAGS, but this is an ugly workaround.
It should be noted that this issue happens even at -O0. I think using noreturn for dead code removal at -O0 is highly undesirable; for instance, it would preclude proper debugging of issues caused by a function erroneously being marked noreturn and actually returning. However that matter probably deserves its own bug report...
One more thing: I would be happy with either of two solutions, either:
(1) Checking the canary before calling a noreturn function, just like performing a check before a tail-call, or
(2) Eliminating the dead-code-removal of the function epilogue at -O0, and for non-zero -O levels, adding an optimization to remove the canary loading from the prologue if no epilogue to check the canary is to be generated.
The best solution: Don't use the same triplet as the GNU (glibc) one. Have musl have its own triplet.
We already do that; the patch is in the musl-cross repo here:
https://bitbucket.org/GregorR/musl-cross or https://github.com/GregorR/musl-cross
However, we want the stack-protector behavior for GCC with musl to be the same as with glibc, using the TLS canary and __stack_chk_fail function in libc rather than a separate libssp. In all real-world, nontrivial code, everything works fine. The only failure of empty programs like the above which just call exit, which, when combined with -fstack-protector-all, cause failure.
In any case, the failure of configure scripts with musl is just one symptom of the problem: useless loads of the canary without a corresponding check of the canary. From a security standpoint, I feel like checking the canary before calling a function that won't return would be the best possible behavior, so that every function gets a check. However, if doing this isn't deemed worthwhile, I think the canary load, which is dead code without a subsequent check, should be optimized out.
the place to get only the GCC patches for musl is https://github.com/GregorR/musl-gcc-patches though.
btw, some debugging has shown that
stack_protect_epilogue() are both called, so it looks as if it's a later pass that decides to eliminate the "dead" code of the epilogue, but not the one of the prologue.
any hints where in the code this could happen is appreciated.
I have the same issue and confirm this issue. Any ideas how to fix it properly?
I don't see how this can be considered a GCC bug.
GCC doesn't check the canary before any other functions in the body of the function, the reason for checking the stack canary is to prevent using overwritten return addresses on the stack, but noreturn function doesn't return, so it is irrelevant if the stack has been overwritten (except for what would matter inside of the function itself, but we don't check it for any other function either in the middle of functions).
I claim it's a bug in that GCC should _either_ check the canary at some point, or eliminate the code that's loading the canary to begin with since it will never be checked.
Gentoo's hardened toolchain is stumbling on this bug and its a blocker for us to get musl + gentoo working nicely. Has there been any movement on it?
*** Bug 260998 has been marked as a duplicate of this bug. ***
Seen from the domain http://volichat.com
Page where seen: http://volichat.com/adult-chat-rooms
Marked for reference. Resolved as fixed @bugzilla.