[PATCH] reject scalar array initialization with nullptr [PR94510]

Jason Merrill jason@redhat.com
Tue Apr 14 02:43:20 GMT 2020


On 4/12/20 5:49 PM, Martin Sebor wrote:
> On 4/10/20 8:52 AM, Jason Merrill wrote:
>> On 4/9/20 4:23 PM, Martin Sebor wrote:
>>> On 4/9/20 1:32 PM, Jason Merrill wrote:
>>>> On 4/9/20 3:24 PM, Martin Sebor wrote:
>>>>> On 4/9/20 1:03 PM, Jason Merrill wrote:
>>>>>> On 4/8/20 1:23 PM, Martin Sebor wrote:
>>>>>>> On 4/7/20 3:36 PM, Marek Polacek wrote:
>>>>>>>> On Tue, Apr 07, 2020 at 02:46:52PM -0600, Martin Sebor wrote:
>>>>>>>>> On 4/7/20 1:50 PM, Marek Polacek wrote:
>>>>>>>>>> On Tue, Apr 07, 2020 at 12:50:48PM -0600, Martin Sebor via 
>>>>>>>>>> Gcc-patches wrote:
>>>>>>>>>>> Among the numerous regressions introduced by the change 
>>>>>>>>>>> committed
>>>>>>>>>>> to GCC 9 to allow string literals as template arguments is a 
>>>>>>>>>>> failure
>>>>>>>>>>> to recognize the C++ nullptr and GCC's __null constants as 
>>>>>>>>>>> pointers.
>>>>>>>>>>> For one, I didn't realize that nullptr, being a null pointer 
>>>>>>>>>>> constant,
>>>>>>>>>>> doesn't have a pointer type, and two, I didn't think of 
>>>>>>>>>>> __null (which
>>>>>>>>>>> is a special integer constant that NULL sometimes expands to).
>>>>>>>>>>>
>>>>>>>>>>> The attached patch adjusts the special handling of trailing zero
>>>>>>>>>>> initializers in reshape_init_array_1 to recognize both kinds of
>>>>>>>>>>> constants and avoid treating them as zeros of the array integer
>>>>>>>>>>> element type.  This restores the expected diagnostics when 
>>>>>>>>>>> either
>>>>>>>>>>> constant is used in the initializer list.
>>>>>>>>>>>
>>>>>>>>>>> Martin
>>>>>>>>>>
>>>>>>>>>>> PR c++/94510 - nullptr_t implicitly cast to zero twice in 
>>>>>>>>>>> std::array
>>>>>>>>>>>
>>>>>>>>>>> gcc/cp/ChangeLog:
>>>>>>>>>>>
>>>>>>>>>>>     PR c++/94510
>>>>>>>>>>>     * decl.c (reshape_init_array_1): Exclude mismatches with 
>>>>>>>>>>> all kinds
>>>>>>>>>>>     of pointers.
>>>>>>>>>>>
>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>
>>>>>>>>>>>     PR c++/94510
>>>>>>>>>>>     * g++.dg/init/array57.C: New test.
>>>>>>>>>>>     * g++.dg/init/array58.C: New test.
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
>>>>>>>>>>> index a127734af69..692c8ed73f4 100644
>>>>>>>>>>> --- a/gcc/cp/decl.c
>>>>>>>>>>> +++ b/gcc/cp/decl.c
>>>>>>>>>>> @@ -6041,9 +6041,14 @@ reshape_init_array_1 (tree elt_type, 
>>>>>>>>>>> tree max_index, reshape_iter *d,
>>>>>>>>>>>        TREE_CONSTANT (new_init) = false;
>>>>>>>>>>>          /* Pointers initialized to strings must be treated 
>>>>>>>>>>> as non-zero
>>>>>>>>>>> -     even if the string is empty.  */
>>>>>>>>>>> +     even if the string is empty.  Handle all kinds of 
>>>>>>>>>>> pointers,
>>>>>>>>>>> +     including std::nullptr and GCC's __nullptr, neither of 
>>>>>>>>>>> which
>>>>>>>>>>> +     has a pointer type.  */
>>>>>>>>>>>          tree init_type = TREE_TYPE (elt_init);
>>>>>>>>>>> -      if (POINTER_TYPE_P (elt_type) != POINTER_TYPE_P 
>>>>>>>>>>> (init_type)
>>>>>>>>>>> +      bool init_is_ptr = (POINTER_TYPE_P (init_type)
>>>>>>>>>>> +              || NULLPTR_TYPE_P (init_type)
>>>>>>>>>>> +              || null_node_p (elt_init));
>>>>>>>>>>> +      if (POINTER_TYPE_P (elt_type) != init_is_ptr
>>>>>>>>>>>          || !type_initializer_zero_p (elt_type, elt_init))
>>>>>>>>>>>        last_nonzero = index;
>>>>>>>>>>
>>>>>>>>>> It looks like this still won't handle e.g. pointers to member 
>>>>>>>>>> functions,
>>>>>>>>>> e.g.
>>>>>>>>>>
>>>>>>>>>> struct S { };
>>>>>>>>>> int arr[3] = { (void (S::*) ()) 0, 0, 0 };
>>>>>>>>>>
>>>>>>>>>> would still be accepted.  You could use TYPE_PTR_OR_PTRMEM_P 
>>>>>>>>>> instead of
>>>>>>>>>> POINTER_TYPE_P to catch this case.
>>>>>>>>>
>>>>>>>>> Good catch!  That doesn't fail because unlike null data member 
>>>>>>>>> pointers
>>>>>>>>> which are represented as -1, member function pointers are 
>>>>>>>>> represented
>>>>>>>>> as a zero.
>>>>>>>>>
>>>>>>>>> I had looked for an API that would answer the question: "is this
>>>>>>>>> expression a pointer?" without having to think of all the 
>>>>>>>>> different
>>>>>>>>> kinds of them but all I could find was null_node_p().  Is this 
>>>>>>>>> a rare,
>>>>>>>>> isolated case that having an API like that wouldn't be worth 
>>>>>>>>> having
>>>>>>>>> or should I add one like in the attached update?
>>>>>>>>>
>>>>>>>>> Martin
>>>>>>>>
>>>>>>>>> PR c++/94510 - nullptr_t implicitly cast to zero twice in 
>>>>>>>>> std::array
>>>>>>>>>
>>>>>>>>> gcc/cp/ChangeLog:
>>>>>>>>>
>>>>>>>>>     PR c++/94510
>>>>>>>>>     * decl.c (reshape_init_array_1): Exclude mismatches with 
>>>>>>>>> all kinds
>>>>>>>>>     of pointers.
>>>>>>>>>     * gcc/cp/cp-tree.h (null_pointer_constant_p): New function.
>>>>>>>>
>>>>>>>> (Drop the gcc/cp/.)
>>>>>>>>
>>>>>>>>> +/* Returns true if EXPR is a null pointer constant of any 
>>>>>>>>> type.  */
>>>>>>>>> +
>>>>>>>>> +inline bool
>>>>>>>>> +null_pointer_constant_p (tree expr)
>>>>>>>>> +{
>>>>>>>>> +  STRIP_ANY_LOCATION_WRAPPER (expr);
>>>>>>>>> +  if (expr == null_node)
>>>>>>>>> +    return true;
>>>>>>>>> +  tree type = TREE_TYPE (expr);
>>>>>>>>> +  if (NULLPTR_TYPE_P (type))
>>>>>>>>> +    return true;
>>>>>>>>> +  if (POINTER_TYPE_P (type))
>>>>>>>>> +    return integer_zerop (expr);
>>>>>>>>> +  return null_member_pointer_value_p (expr);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>
>>>>>>>> We already have a null_ptr_cst_p so it would be sort of 
>>>>>>>> confusing to have
>>>>>>>> this as well.  But are you really interested in whether it's a 
>>>>>>>> null pointer,
>>>>>>>> not just a pointer?
>>>>>>>
>>>>>>> The goal of the code is to detect a mismatch in "pointerness" 
>>>>>>> between
>>>>>>> an initializer expression and the type of the initialized 
>>>>>>> element, so
>>>>>>> it needs to know if the expression is a pointer (non-nulls pointers
>>>>>>> are detected in type_initializer_zero_p).  That means testing a 
>>>>>>> number
>>>>>>> of IMO unintuitive conditions:
>>>>>>>
>>>>>>>    TYPE_PTR_OR_PTRMEM_P (TREE_TYPE (expr))
>>>>>>>    || NULLPTR_TYPE_P (TREE_TYPE (expr))
>>>>>>>    || null_node_p (expr)
>>>>>>>
>>>>>>> I don't know if this type of a query is common in the C++ FE but 
>>>>>>> unless
>>>>>>> this is an isolated use case then besides fixing the bug I 
>>>>>>> thought it
>>>>>>> would be nice to make it easier to get the test above right, or 
>>>>>>> at least
>>>>>>> come close to it.
>>>>>>>
>>>>>>> Since null_pointer_constant_p already exists (but isn't suitable 
>>>>>>> here
>>>>>>> because it returns true for plain literal zeros)
>>>>>>
>>>>>> Why is that unsuitable?  A literal zero is a perfectly good 
>>>>>> zero-initializer for a pointer.
>>>>>
>>>>> Right, that's why it's not suitable here.  Because a literal zero
>>>>> is also not a pointer.
>>>>>
>>>>> The question the code asks is: "is the initializer expression
>>>>> a pointer (of any kind)?"
>>>>
>>>> Why is that a question we want to ask?  What we need here is to know 
>>>> whether the initializer expression is equivalent to implicit 
>>>> zero-initialization.  For initializing a pointer, a literal 0 is 
>>>> equivalent, so we don't want to update last_nonzero.
>>>
>>> Yes, but that's not the bug we're fixing.  The problem occurs with
>>> an integer array and a pointer initializer:
>>>
>>>    int a[2] = { nullptr, 0 };
>>
>> Aha, you're fixing a different bug than the one I was seeing.
> 
> What is that one?  (I'm not aware of any others in this area.)
> 
>>
>>> and with elt_type = TREE_TYPE (a) and init_type TREE_TYPE (nullptr)
>>> the test
>>>
>>>    POINTER_TYPE_P (elt_type) != POINTER_TYPE_P (init_type)
>>>
>>> evaluates to false because neither type is a pointer type and
>>>
>>>    type_initializer_zero_p (elt_type, elt_init)
>>>
>>> returns true because nullptr is zero, and so last_nonzero doesn't
>>> get set, the element gets trimmed, and the invalid initialization
>>> of int with nullptr isn't diagnosed.
>>>
>>> But I'm not sure if you're questioning the current code, the simple
>>> fix quoted above, or my assertion that null_pointer_constant_p would
>>> not be a suitable function to call to tell if an initializer is
>>> nullptr vs plain zero.
>>>
>>>> Also, why is the pointer check here rather than part of the 
>>>> POINTER_TYPE_P handling in type_initializer_zero_p?
>>>
>>> type_initializer_zero_p is implemented in terms of initializer_zerop
>>> with the only difference that empty strings are considered to be zero
>>> only for char arrays and not char pointers.
>>
>> Yeah, but that's the fundamental problem: We're assuming that any zero 
>> is suitable for initializing any type except for a few exceptions, and 
>> adding more exceptions when we find a new testcase that breaks.
>>
>> Handling this in process_init_constructor_array avoids all these 
>> problems by looking at the initializers after they've been converted 
>> to the desired type, at which point it's much clearer whether they are 
>> zero or not; then we don't need type_initializer_zero_p because the 
>> initializer already has the proper type and for zero_init_p types we 
>> can just use initializer_zero_p.
> 
> I've already expressed my concerns with that change but if you are
> comfortable with it I won't insist on waiting until GCC 11.  Your last
> request for that patch was to rework the second loop to avoid changing
> the counter of the previous loop.  The attached update does that.
> 
> I also added another C++ 2a test to exercise a few more cases with
> pointers to members.  With it I ran into what looks like an unrelated
> bug in this area.  I opened PR 94568 for it, CC'd you, and xfailed
> the problem case in the new test.
> 
>>
>> We do probably want some function that tests whether a particular 
>> initializer is equivalent to zero-initialization, which is either 
>> initializer_zero_p for zero_init_p types, !expr for pointers to 
>> members, and recursing for aggregates.  Maybe cp_initializer_zero_p or 
>> zero_init_expr_p?
>>
>>> It could be changed to return false for incompatible initializers
>>> like pointers (or even __null) for non-pointer types, even if they
>>> are zero, but that's not what it's designed to do.
>>
>> But that's exactly what we did for 90938.  Now you're proposing 
>> another small exception, only putting it in the caller instead.  I 
>> think we'll keep running into these problems until we fix the design 
>> issue.
> 
> Somehow that felt different.  But I don't have a problem with moving
> the pointer check there as well.  It shouldn't be too much more
> intrusive than the original patch for this bug if you decide to
> go with it for now.
> 
>>
>> It would also be possible to improve things by doing the conversion in 
>> type_initializer_zero_p before considering its zeroness, but that 
>> would again be duplicating work that we're already doing elsewhere.
> 
> I agree that it's not worth the trouble given the long-term fix is
> in process_init_constructor_array.
> 
> Attached is the updated patch with the process_init_constructor_array
> changes, retested on x86_64-linux.

> +      if (!trunc_zero || !type_initializer_zero_p (eltype, ce->value))
> +	last_nonzero = i;

I think we can remove type_initializer_zero_p as well, and use 
initializer_zerop here.

> +      if (last_nonzero < i - 1)
> +       {
> +         vec_safe_truncate (v, last_nonzero + 1);

This looks like you will never truncate to length 0, which seems like a 
problem with last_nonzero being both unsigned and an index; perhaps it 
should be something like num_to_keep?

> +         len = i = vec_safe_length (v);
> +       }

Nitpick: It seems you don't need to update len or i since you're about 
to return.

> -	    else if (TREE_CODE (next) == CONSTRUCTOR
> -		     && CONSTRUCTOR_PLACEHOLDER_BOUNDARY (next))
> -	      {
> -		/* As above.  */
> -		CONSTRUCTOR_PLACEHOLDER_BOUNDARY (next) = 0;
> -		CONSTRUCTOR_PLACEHOLDER_BOUNDARY (init) = 1;
> -	      }

This is from the recent fix for 90996, we want to keep it.

>   /* Set to the index of the last initializer element whose value                                              
>      (either as a single expression or as a repeating range) to                                                
>      append to the CONSTRUCTOR.  */
>   unsigned HOST_WIDE_INT last_to_append = i;

OK, I was wrong, let's go back to modifying i instead of introducing 
this variable.

>       if (next)
>         {
>           picflags |= picflag_from_initializer (next);
>           if (initializer_constant_valid_p (next, TREE_TYPE (next))
>               == null_pointer_node)
>             {
>               /* If the last distinct explicit initializer value is                                            
>                  the same as the implicit initializer NEXT built above,                                        
>                  include the former in the range built below.  */
>               if (i && next == (*v)[last_distinct].value)
>                 last_to_append = last_distinct;
> 
>               break;
>             }
> 
>           CONSTRUCTOR_APPEND_ELT (v, size_int (i), next);
>           last_distinct = i;
>         }
>       else
>         {
>           /* Don't bother checking all the other elements and avoid                                            
>              appending to the initializer list below.  */
>           last_distinct = i;
>           last_to_append = i + 1;
>           break;
>         }

>   if (last_distinct < last_to_append - 1)

Could this be

if (len > i+1 && next)

instead, and never look at last_distinct again?  If next is the same as 
the last_distinct value, we will have adjusted i to be last_distinct; 
otherwise, we still want to start the range with i.

Jason



More information about the Gcc-patches mailing list