[PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

Fri Jun 11 15:49:02 GMT 2021

> On Jun 11, 2021, at 6:12 AM, Richard Biener <rguenther@suse.de> wrote:
> 
> On Thu, 10 Jun 2021, Qing Zhao wrote:
> 
>> Hi, Richard,
>> 
>> I need more discussion on the following comments you raised:
>> 
>>> On May 26, 2021, at 6:18 AM, Richard Biener <rguenther@suse.de> wrote:
>>> 
>>> +/* Expand the IFN_DEFERRED_INIT function according to its second 
>>> argument.  */
>>> +static void
>>> +expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>>> +{
>>> +  tree var = gimple_call_lhs (stmt);
>>> +  tree init = NULL_TREE;
>>> +  enum auto_init_type init_type
>>> +    = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1));
>>> +
>>> +  switch (init_type)
>>> +    {
>>> +    default:
>>> +      gcc_unreachable ();
>>> +    case AUTO_INIT_PATTERN:
>>> +      init = build_pattern_cst_for_auto_init (TREE_TYPE (var));
>>> +      expand_assignment (var, init, false);
>>> +      break;
>>> +    case AUTO_INIT_ZERO:
>>> +      init = build_zero_cst (TREE_TYPE (var));
>>> +      expand_assignment (var, init, false);
>>> +      break;
>>> +    }
>>> 
>>> I think actually building build_pattern_cst_for_auto_init can generate
>>> massive garbage and for big auto vars code size is also a concern and
>>> ideally on x86 you'd produce rep movq.  So I don't think going
>>> via expand_assignment is good.  Instead you possibly want to lower
>>> .DEFERRED_INIT to MEMs following expand_builtin_memset and
>>> eventually enhance that to allow storing pieces larger than a byte.
>> 
>> When I tried to lower .DEFERRED_INIT to MEMs for “AUTO_INIT_PATTERN”, I ran into the following questions:
>> 
>> 1. If .DEFERRED_INIT is lowered to MEMs through “memset”, then we basically initialize the whole memory covering the
>> auto variable, including padding. Right?
> 
> Yes.
> 
>> 2. We can lower it through “memset” only when the value used for 
>>   initialization has a repeated byte-pattern. Otherwise, if the 
>>   value does NOT have a repeated byte-pattern, we can NOT lower 
>>   it through “memset”, right?
> 
> Yes.  This is why I said you should do it _similar_ to how memcpy
> is implemented.  OTOH I don't see a good reason to support patterns
> that are bigger than a byte ...
> 
>> Currently, the values used for “AUTO_INIT_PATTERN” initialization are:
>> 
>>  /* The following value is a guaranteed unmappable pointer value and has a
>>     repeated byte-pattern which makes it easier to synthesize.  We use it for
>>     pointers as well as integers so that aggregates are likely to be
>>     initialized with this repeated value.  */
>>  uint64_t largevalue = 0xAAAAAAAAAAAAAAAAull;
>>  /* For 32-bit platforms it's a bit trickier because, across systems, only the
>>     zero page can reasonably be expected to be unmapped, and even then we need
>>     a very low address.  We use a smaller value, and that value sadly doesn't
>>     have a repeated byte-pattern.  We don't use it for integers.  */
>>  uint32_t smallvalue = 0x000000AA;
>> 
>> In addition to the above, for BOOLEAN_TYPE:
>> 
>>    case BOOLEAN_TYPE:
>>      /* We think that initializing the boolean variable to 0 other than 1
>>         is better even for pattern initialization.  */
>> 
>> Due to “BOOLEAN_TYPE” and “POINTER_TYPE”, we cannot always have a 
>> repeated byte-pattern for variables that include BOOLEAN_TYPE or 
>> POINTER_TYPE fields. Therefore, lowering .DEFERRED_INIT for “PATTERN” 
>> initialization through “memset” is not always possible.
>> 
>> Let me know if I missed anything in the above. Do you have other suggestions?
> 
> The main point is that you need to avoid building the explicit initializer
> only to have it consumed by assignment expansion.  If you want to keep
> all the singing and dancing (as opposed to maybe initializing with a
> 0x1 byte pattern) then I think for efficiency you still want to
> block-initialize the variable and then only fixup the special fields.

Yes, this is a good idea. 

We can memset the whole structure with the repeated byte pattern “0xAA” first,
then fix up the BOOLEAN_TYPE fields and, on 32-bit platforms, the POINTER_TYPE fields.
That might be more efficient.
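
To make the idea concrete, here is a rough sketch in plain C, not in terms
of GCC internals. The struct and both helper names are hypothetical and exist
only for illustration; the actual expansion would of course work on trees/RTL.
It assumes the values quoted above (0xAA bytes in general, 0 for booleans,
0x000000AA for 32-bit pointers).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical aggregate mixing the field kinds discussed above.  */
struct sample
{
  int count;
  bool flag;
  void *ptr;
  uint64_t wide;
};

/* Return true if every byte of VALUE equals its lowest byte, i.e. the
   value can be produced by a plain byte-wise memset.  */
static bool
has_repeated_byte_pattern (uint64_t value)
{
  uint64_t byte = value & 0xff;
  return value == byte * UINT64_C (0x0101010101010101);
}

/* Block-initialize *S with the 0xAA byte pattern (padding included),
   then fix up only the fields that need a different value.  */
static void
pattern_init_sample (struct sample *s)
{
  memset (s, 0xAA, sizeof *s);

  /* Booleans are initialized to 0 even for pattern initialization.  */
  s->flag = false;

  /* On 32-bit targets the pointer value has no repeated byte-pattern,
     so it has to be stored separately.  */
  if (sizeof (void *) == 4)
    s->ptr = (void *) (uintptr_t) 0x000000AA;
}
```

With this scheme, has_repeated_byte_pattern (0xAAAAAAAAAAAAAAAAull) holds,
so the bulk of the object needs only the block store, while 0x000000AA fails
the check and is handled by the per-field fixup.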

> 
> But as said, all this is quite over-designed IMHO and simply
> zeroing everything would be much simpler and good enough.

So, the fundamental questions are:

1. Do we need the “Pattern Initialization” functionality for debugging purposes?
I see that other compilers (Clang and the Microsoft compiler) support both zero initialization and pattern initialization:

http://lists.llvm.org/pipermail/cfe-dev/2020-April/065221.html
https://msrc-blog.microsoft.com/2020/05/13/solving-uninitialized-stack-memory-on-windows/
Pattern init is used in development builds for debugging purposes; zero init is used in production builds for security purposes.

So I assume that GCC might want to provide similar functionality, but I am open on this.

Kees, will the kernel use the “Pattern initialization” feature?

2. Since “Pattern initialization” is only used for debugging purposes, the runtime and code size overhead might not be that
important at all, right?

thanks.

Qing
> 
> Richard.