Existing security practices recommend to the arrays of automatic storage duration (e.g. by zeroing them) upon function exit.
This could be done by calling memset; however, gcc seems to optimize out the call to memset before the return statement (or when the memset call is the last statement). This forces secure-sensitive applications to implement their own memset, usually a copy of it.
I suggest the following enhancement:
-provide two new attributes: 'clear_stack' and 'allow_ending_memset'
-provide two new flags: -fclear-stack and -Wdirty-stack
-logic: by using -fclear-stack, the following modes can be specified:
-fclear-stack=none: current behavior, memset is optimized out
-fclear-stack=attribute: user controls the behavior per function basis by using the attributes; 'clear_stack' causes gcc to add the memset call at the end of the function (no control flow analysis recommended), whereas 'allow_ending_memset' prevents gcc to optimize out the call to memset enabling the user to call it. Specifying both attributes in the same function should not be allowed.
-fclear-stack=auto: instructs gcc to emit a call to memset at the end of functions having arrays of automatic storage duration (zeroing those arrays only). The 'clear_stack' attribute can be used in this mode to force the stack zeroing on particular functions overriding the decision logic
-fclear-stack=always: instructs gcc to emit a call to memset at the end of every function having a nonempty stack.
-Wdirty-stack: only to be used with -fclear-stack=attribute, causes gcc to emit a warning message when a function has at least an array of static storage duration but is not zeroed at the end (either because 'clear_stack' wasn't specified or because there is no memset call statement; control flow analysis similar to the one used by detecting paths with no return statement on non void-return functions could be used).
Please assign this to email@example.com
I think what is more reasonable is just adding a function attribute clear_stack
that makes sure stack adjustments clear no longer necessary stack slots _plus_
not use the red zone if available (or clear that as well).
Such attribute must also make the function not considered for inlining.
Nothing specific to "memset" should be implemented.
To allow inlining eventually the inliner would have to propagate the clearing
requirement up to the functions inlined to.
Note that secure stack does not only mean to clear local arrays as a programmer
would be able to do with memset (if the compiler were not optimizing that away)
but also ensuring that all spill slots are cleared as (old) register values
carrying secure information may leak to them otherwise.
It should also clear all call clobbered registers at the end of the function that could be touched by the function (or all, if it calls other functions).
Those are very good ideas but fell into the land of the backend. The red zone IIUC is a x86_64 only ABI concept.
What do you think about adding also a -m counterpart option with the same semantic but for the backend?
Our plan is to address separately the middle end part and the backend part (initially for x86_64) and subdivide the latter in clobbered registers and red zone cleaning, so 3 sets of patches.
I think you should read the discussion starting at
<https://gcc.gnu.org/ml/gcc/2015-09/msg00135.html> and see how your ideas
compare to it, then maybe try to restart that discussion and get consensus
on a suitable API approach.
From a user's perspective, would this be better as a property of the data (or of its *type*), rather than of the function? i.e. have the user mark the on-stack buffer as security-sensitive, rather than mark the function as a whole?
i.e. something like
char __attribute__((security_sensitive)) buf;
Then the compiler could:
(a) "do the right thing" for any functions containing such data: e.g. automatically clear the array after the last use, and
(b) issue an error if the user tries to create a global variable of such a type, and
(c) potentially suppress various optimizations on the data
(In reply to David Malcolm from comment #5)
> From a user's perspective, would this be better as a property of the data
> (or of its *type*), rather than of the function? i.e. have the user mark
> the on-stack buffer as security-sensitive, rather than mark the function as
> a whole?
> i.e. something like
> char __attribute__((security_sensitive)) buf;
> Then the compiler could:
> (a) "do the right thing" for any functions containing such data: e.g.
> automatically clear the array after the last use, and
> (b) issue an error if the user tries to create a global variable of such a
> type, and
> (c) potentially suppress various optimizations on the data
But even if you clear the sensitive data from the stack array, it might still live in the registers from which you stored the sensitive data into that array etc. I think per-function is better here over per-data.
(In reply to Jakub Jelinek from comment #6)
> But even if you clear the sensitive data from the stack array, it might
> still live in the registers from which you stored the sensitive data into
> that array etc. I think per-function is better here over per-data.
True. My thinking is that the user marks the type as security sensitive, and then any function containing locals of that type is automatically flagged for special cleanup. But maybe that's not enough fine-grained control? By having the user mark the data itself, we could potentially check more things e.g. issue errors/warnings if the data gets passed to anything other than a whitelist of functions, maybe.
Created attachment 37971 [details]
An implementation for x86 security_sensitive function attribute
Hi, we made this patch, that adds an attribute 'security_sensitive' only for x86_64 arch. But we wanted your opinion about some decisions we made.
When you use a function with this new attribute, it does two things in the function's epilogue:
clears all the used stack of the current function.
clears all the regs that this function uses and aren't preserved or used to return a value from the function.
To clear the stack we use the stack_pointer_offset member of the current frame. I think this isn't handling the case when we use the stack to pass arguments to another function. Is this correct? Should this attribute also clean that part of the stack? Where can we look the stack used for that purpose? Should we force maccumulate-outgoing-args as suggested?
We also disabled the using of the red zone for the function with the attribute, but we were thinking that this may not be necessary, since we can also clear that part of the stack. What do you think?
We chose to clear the stack in a point of the epilogue where the stack pointer is set in the return address. This way we can clear almost all the stack. But we are clearing parts that aren't 'covered' by the current sp. Is this Ok? or should we clear the stack while the sp is at the bottom of the frame?
We decided not to clear the registers when a sibcall is going to be made. Do you think this is correct?
We needed to know which registers are used to return something, so we used the ix86_function_value function. Is this approach correct or is this information somewhere else?
We aren't handling the case when a function doesn't use its arguments. in those cases we aren't clearing the args registers or stack. Do you think this is correct?
We tried to add the noinline attribute when the security_sensitive attribute is set, but it didn't work so we used DECL_UNINLINABLE directly. Is this OK? if not, How do you add another attribute when you 'parse' an attribute?
How does shrink-wrapping affect us? Since in the mailing list somebody mentioned we should disable it.
Any other question or problem you see please let us know. Thanks, and sorry for the long post.
Created attachment 37972 [details]
Crude test case
In case it's helpful, here's a crude attempt at a black-box test case for this: read a secret into an auto buffer in one function, and then see if it's readable in another function.
Currently fails with exit code 2 on trunk (with warning ‘security_sensitive’ attribute directive ignored [-Wattributes])
(Naturally we'd want white-box testing also, to verify that the generated code contains zeroing code).
Thanks for the test, we forgot to append some test too. It worked ok. if you want you can apply that patch in gcc 5.2.0.
Any update on this?
Did this patch ever get posted to firstname.lastname@example.org? That's probably the best way to get some traction on this. Note that feature-freeze for GCC 8 is coming in the next few weeks, AIUI.