Bug 58245 - -fstack-protector[-all] does not protect functions that call noreturn functions
Summary: -fstack-protector[-all] does not protect functions that call noreturn functions
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: unknown
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-08-27 02:06 UTC by Rich Felker
Modified: 2014-02-17 20:45 UTC (History)
6 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Rich Felker 2013-08-27 02:06:04 UTC
This issue is almost identical to bug #23221, but affects functions whose executions end with a call to a noreturn function rather than a tail call. The simplest example is:

#include <stdlib.h>
int main()
{
    exit(0);
}

When compiled with -fstack-protector-all, the function prologue will read and store the canary, but no check will be made before passing execution to exit.

This is actually a major practical problem for some users of musl libc, because code like the above appears in many configure scripts, and musl libc uses weak symbols so as to avoid initializing stack-protector (and thereby avoid initializing the TLS register) if there is no reference to __stack_chk_fail. Due to this issue, the above code generates thread-pointer-relative (e.g. %fs-based on x86_64) accesses to read the canary, but no reference to __stack_chk_fail, and then crashes when run, leading to spurious configure failures. For the time being, I have informed users who wish to use -fstack-protector-all that they can add -fno-builtin-exit -D__noreturn__= to their CFLAGS, but this is an ugly workaround.

It should be noted that this issue happens even at -O0. I think using noreturn for dead code removal at -O0 is highly undesirable; for instance, it would preclude proper debugging of issues caused by a function erroneously being marked noreturn and actually returning. However that matter probably deserves its own bug report...
Comment 1 Rich Felker 2013-08-27 02:08:21 UTC
One more thing: I would be happy with either of two solutions, either:

(1) Checking the canary before calling a noreturn function, just like performing a check before a tail-call, or

(2) Eliminating the dead-code-removal of the function epilogue at -O0, and for non-zero -O levels, adding an optimization to remove the canary loading from the prologue if no epilogue to check the canary is to be generated.
Comment 2 Andrew Pinski 2013-08-27 02:23:40 UTC
The best solution: Don't use the same triplet as the GNU (glibc) one.  Have musl have its own triplet.
Comment 3 Rich Felker 2013-08-27 02:35:15 UTC
We already do that; the patch is in the musl-cross repo here:

https://bitbucket.org/GregorR/musl-cross or https://github.com/GregorR/musl-cross

However, we want the stack-protector behavior for GCC with musl to be the same as with glibc, using the TLS canary and __stack_chk_fail function in libc rather than a separate libssp. In all real-world, nontrivial code, everything works fine. The only failure of empty programs like the above which just call exit, which, when combined with -fstack-protector-all, cause failure.

In any case, the failure of configure scripts with musl is just one symptom of the problem: useless loads of the canary without a corresponding check of the canary. From a security standpoint, I feel like checking the canary before calling a function that won't return would be the best possible behavior, so that every function gets a check. However, if doing this isn't deemed worthwhile, I think the canary load, which is dead code without a subsequent check, should be optimized out.
Comment 4 Rose Garcia 2013-08-28 13:10:08 UTC
the place to get only the GCC patches for musl is https://github.com/GregorR/musl-gcc-patches though.


btw, some debugging has shown that
stack_protect_prologue() and
stack_protect_epilogue() are both called, so it looks as if it's a later pass that decides to eliminate the "dead" code of the epilogue, but not the one of the prologue.

any hints where in the code this could happen is appreciated.
Comment 5 Timo Teräs 2013-10-01 14:07:31 UTC
I have the same issue and confirm this issue. Any ideas how to fix it properly?
Comment 6 Jakub Jelinek 2013-10-01 15:44:10 UTC
I don't see how this can be considered a GCC bug.
GCC doesn't check the canary before any other functions in the body of the function, the reason for checking the stack canary is to prevent using overwritten return addresses on the stack, but noreturn function doesn't return, so it is irrelevant if the stack has been overwritten (except for what would matter inside of the function itself, but we don't check it for any other function either in the middle of functions).
Comment 7 Rich Felker 2013-10-01 17:18:23 UTC
I claim it's a bug in that GCC should _either_ check the canary at some point, or eliminate the code that's loading the canary to begin with since it will never be checked.
Comment 8 Anthony G. Basile 2014-01-15 14:24:46 UTC
Gentoo's hardened toolchain is stumbling on this bug and its a blocker for us to get musl + gentoo working nicely.  Has there been any movement on it?
Comment 9 Jackie Rosen 2014-02-16 13:15:40 UTC Comment hidden (spam)