[PATCH] reject scalar array initialization with nullptr [PR94510]

Jason Merrill jason@redhat.com
Tue Apr 21 20:33:11 GMT 2020


On 4/17/20 5:18 PM, Martin Sebor wrote:
> On 4/17/20 12:19 AM, Jason Merrill wrote:
>> On 4/15/20 1:30 PM, Martin Sebor wrote:
>>> On 4/13/20 8:43 PM, Jason Merrill wrote:
>>>> On 4/12/20 5:49 PM, Martin Sebor wrote:
>>>>> On 4/10/20 8:52 AM, Jason Merrill wrote:
>>>>>> On 4/9/20 4:23 PM, Martin Sebor wrote:
>>>>>>> On 4/9/20 1:32 PM, Jason Merrill wrote:
>>>>>>>> On 4/9/20 3:24 PM, Martin Sebor wrote:
>>>>>>>>> On 4/9/20 1:03 PM, Jason Merrill wrote:
>>>>>>>>>> On 4/8/20 1:23 PM, Martin Sebor wrote:
>>>>>>>>>>> On 4/7/20 3:36 PM, Marek Polacek wrote:
>>>>>>>>>>>> On Tue, Apr 07, 2020 at 02:46:52PM -0600, Martin Sebor wrote:
>>>>>>>>>>>>> On 4/7/20 1:50 PM, Marek Polacek wrote:
>>>>>>>>>>>>>> On Tue, Apr 07, 2020 at 12:50:48PM -0600, Martin Sebor via 
>>>>>>>>>>>>>> Gcc-patches wrote:
>>>>>>>>>>>>>>> Among the numerous regressions introduced by the change 
>>>>>>>>>>>>>>> committed
>>>>>>>>>>>>>>> to GCC 9 to allow string literals as template arguments 
>>>>>>>>>>>>>>> is a failure
>>>>>>>>>>>>>>> to recognize the C++ nullptr and GCC's __null constants 
>>>>>>>>>>>>>>> as pointers.
>>>>>>>>>>>>>>> For one, I didn't realize that nullptr, being a null 
>>>>>>>>>>>>>>> pointer constant,
>>>>>>>>>>>>>>> doesn't have a pointer type, and two, I didn't think of 
>>>>>>>>>>>>>>> __null (which
>>>>>>>>>>>>>>> is a special integer constant that NULL sometimes expands 
>>>>>>>>>>>>>>> to).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The attached patch adjusts the special handling of 
>>>>>>>>>>>>>>> trailing zero
>>>>>>>>>>>>>>> initializers in reshape_init_array_1 to recognize both 
>>>>>>>>>>>>>>> kinds of
>>>>>>>>>>>>>>> constants and avoid treating them as zeros of the array 
>>>>>>>>>>>>>>> integer
>>>>>>>>>>>>>>> element type.  This restores the expected diagnostics 
>>>>>>>>>>>>>>> when either
>>>>>>>>>>>>>>> constant is used in the initializer list.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> PR c++/94510 - nullptr_t implicitly cast to zero twice in 
>>>>>>>>>>>>>>> std::array
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> gcc/cp/ChangeLog:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     PR c++/94510
>>>>>>>>>>>>>>>     * decl.c (reshape_init_array_1): Exclude mismatches 
>>>>>>>>>>>>>>> with all kinds
>>>>>>>>>>>>>>>     of pointers.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     PR c++/94510
>>>>>>>>>>>>>>>     * g++.dg/init/array57.C: New test.
>>>>>>>>>>>>>>>     * g++.dg/init/array58.C: New test.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
>>>>>>>>>>>>>>> index a127734af69..692c8ed73f4 100644
>>>>>>>>>>>>>>> --- a/gcc/cp/decl.c
>>>>>>>>>>>>>>> +++ b/gcc/cp/decl.c
>>>>>>>>>>>>>>> @@ -6041,9 +6041,14 @@ reshape_init_array_1 (tree 
>>>>>>>>>>>>>>> elt_type, tree max_index, reshape_iter *d,
>>>>>>>>>>>>>>>        TREE_CONSTANT (new_init) = false;
>>>>>>>>>>>>>>>          /* Pointers initialized to strings must be 
>>>>>>>>>>>>>>> treated as non-zero
>>>>>>>>>>>>>>> -     even if the string is empty.  */
>>>>>>>>>>>>>>> +     even if the string is empty.  Handle all kinds of 
>>>>>>>>>>>>>>> pointers,
>>>>>>>>>>>>>>> +     including std::nullptr and GCC's __nullptr, neither 
>>>>>>>>>>>>>>> of which
>>>>>>>>>>>>>>> +     has a pointer type.  */
>>>>>>>>>>>>>>>          tree init_type = TREE_TYPE (elt_init);
>>>>>>>>>>>>>>> -      if (POINTER_TYPE_P (elt_type) != POINTER_TYPE_P 
>>>>>>>>>>>>>>> (init_type)
>>>>>>>>>>>>>>> +      bool init_is_ptr = (POINTER_TYPE_P (init_type)
>>>>>>>>>>>>>>> +              || NULLPTR_TYPE_P (init_type)
>>>>>>>>>>>>>>> +              || null_node_p (elt_init));
>>>>>>>>>>>>>>> +      if (POINTER_TYPE_P (elt_type) != init_is_ptr
>>>>>>>>>>>>>>>          || !type_initializer_zero_p (elt_type, elt_init))
>>>>>>>>>>>>>>>        last_nonzero = index;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks like this still won't handle e.g. pointers to 
>>>>>>>>>>>>>> member functions,
>>>>>>>>>>>>>> e.g.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> struct S { };
>>>>>>>>>>>>>> int arr[3] = { (void (S::*) ()) 0, 0, 0 };
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> would still be accepted.  You could use 
>>>>>>>>>>>>>> TYPE_PTR_OR_PTRMEM_P instead of
>>>>>>>>>>>>>> POINTER_TYPE_P to catch this case.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Good catch!  That doesn't fail because unlike null data 
>>>>>>>>>>>>> member pointers
>>>>>>>>>>>>> which are represented as -1, member function pointers are 
>>>>>>>>>>>>> represented
>>>>>>>>>>>>> as a zero.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I had looked for an API that would answer the question: "is 
>>>>>>>>>>>>> this
>>>>>>>>>>>>> expression a pointer?" without having to think of all the 
>>>>>>>>>>>>> different
>>>>>>>>>>>>> kinds of them but all I could find was null_node_p().  Is 
>>>>>>>>>>>>> this a rare,
>>>>>>>>>>>>> isolated case that having an API like that wouldn't be 
>>>>>>>>>>>>> worth having
>>>>>>>>>>>>> or should I add one like in the attached update?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Martin
>>>>>>>>>>>>
>>>>>>>>>>>>> PR c++/94510 - nullptr_t implicitly cast to zero twice in 
>>>>>>>>>>>>> std::array
>>>>>>>>>>>>>
>>>>>>>>>>>>> gcc/cp/ChangeLog:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     PR c++/94510
>>>>>>>>>>>>>     * decl.c (reshape_init_array_1): Exclude mismatches 
>>>>>>>>>>>>> with all kinds
>>>>>>>>>>>>>     of pointers.
>>>>>>>>>>>>>     * gcc/cp/cp-tree.h (null_pointer_constant_p): New 
>>>>>>>>>>>>> function.
>>>>>>>>>>>>
>>>>>>>>>>>> (Drop the gcc/cp/.)
>>>>>>>>>>>>
>>>>>>>>>>>>> +/* Returns true if EXPR is a null pointer constant of any 
>>>>>>>>>>>>> type.  */
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +inline bool
>>>>>>>>>>>>> +null_pointer_constant_p (tree expr)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +  STRIP_ANY_LOCATION_WRAPPER (expr);
>>>>>>>>>>>>> +  if (expr == null_node)
>>>>>>>>>>>>> +    return true;
>>>>>>>>>>>>> +  tree type = TREE_TYPE (expr);
>>>>>>>>>>>>> +  if (NULLPTR_TYPE_P (type))
>>>>>>>>>>>>> +    return true;
>>>>>>>>>>>>> +  if (POINTER_TYPE_P (type))
>>>>>>>>>>>>> +    return integer_zerop (expr);
>>>>>>>>>>>>> +  return null_member_pointer_value_p (expr);
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +
>>>>>>>>>>>>
>>>>>>>>>>>> We already have a null_ptr_cst_p so it would be sort of 
>>>>>>>>>>>> confusing to have
>>>>>>>>>>>> this as well.  But are you really interested in whether it's 
>>>>>>>>>>>> a null pointer,
>>>>>>>>>>>> not just a pointer?
>>>>>>>>>>>
>>>>>>>>>>> The goal of the code is to detect a mismatch in "pointerness" 
>>>>>>>>>>> between
>>>>>>>>>>> an initializer expression and the type of the initialized 
>>>>>>>>>>> element, so
>>>>>>>>>>> it needs to know if the expression is a pointer (non-nulls 
>>>>>>>>>>> pointers
>>>>>>>>>>> are detected in type_initializer_zero_p).  That means testing 
>>>>>>>>>>> a number
>>>>>>>>>>> of IMO unintuitive conditions:
>>>>>>>>>>>
>>>>>>>>>>>    TYPE_PTR_OR_PTRMEM_P (TREE_TYPE (expr))
>>>>>>>>>>>    || NULLPTR_TYPE_P (TREE_TYPE (expr))
>>>>>>>>>>>    || null_node_p (expr)
>>>>>>>>>>>
>>>>>>>>>>> I don't know if this type of a query is common in the C++ FE 
>>>>>>>>>>> but unless
>>>>>>>>>>> this is an isolated use case then besides fixing the bug I 
>>>>>>>>>>> thought it
>>>>>>>>>>> would be nice to make it easier to get the test above right, 
>>>>>>>>>>> or at least
>>>>>>>>>>> come close to it.
>>>>>>>>>>>
>>>>>>>>>>> Since null_pointer_constant_p already exists (but isn't 
>>>>>>>>>>> suitable here
>>>>>>>>>>> because it returns true for plain literal zeros)
>>>>>>>>>>
>>>>>>>>>> Why is that unsuitable?  A literal zero is a perfectly good 
>>>>>>>>>> zero-initializer for a pointer.
>>>>>>>>>
>>>>>>>>> Right, that's why it's not suitable here.  Because a literal zero
>>>>>>>>> is also not a pointer.
>>>>>>>>>
>>>>>>>>> The question the code asks is: "is the initializer expression
>>>>>>>>> a pointer (of any kind)?"
>>>>>>>>
>>>>>>>> Why is that a question we want to ask?  What we need here is to 
>>>>>>>> know whether the initializer expression is equivalent to 
>>>>>>>> implicit zero-initialization.  For initializing a pointer, a 
>>>>>>>> literal 0 is equivalent, so we don't want to update last_nonzero.
>>>>>>>
>>>>>>> Yes, but that's not the bug we're fixing.  The problem occurs with
>>>>>>> an integer array and a pointer initializer:
>>>>>>>
>>>>>>>    int a[2] = { nullptr, 0 };
>>>>>>
>>>>>> Aha, you're fixing a different bug than the one I was seeing.
>>>>>
>>>>> What is that one?  (I'm not aware of any others in this area.)
>>>>>
>>>>>>
>>>>>>> and with elt_type = TREE_TYPE (a) and init_type TREE_TYPE (nullptr)
>>>>>>> the test
>>>>>>>
>>>>>>>    POINTER_TYPE_P (elt_type) != POINTER_TYPE_P (init_type)
>>>>>>>
>>>>>>> evaluates to false because neither type is a pointer type and
>>>>>>>
>>>>>>>    type_initializer_zero_p (elt_type, elt_init)
>>>>>>>
>>>>>>> returns true because nullptr is zero, and so last_nonzero doesn't
>>>>>>> get set, the element gets trimmed, and the invalid initialization
>>>>>>> of int with nullptr isn't diagnosed.
>>>>>>>
>>>>>>> But I'm not sure if you're questioning the current code, the simple
>>>>>>> fix quoted above, or my assertion that null_pointer_constant_p would
>>>>>>> not be a suitable function to call to tell if an initializer is
>>>>>>> nullptr vs plain zero.
>>>>>>>
>>>>>>>> Also, why is the pointer check here rather than part of the 
>>>>>>>> POINTER_TYPE_P handling in type_initializer_zero_p?
>>>>>>>
>>>>>>> type_initializer_zero_p is implemented in terms of initializer_zerop
>>>>>>> with the only difference that empty strings are considered to be 
>>>>>>> zero
>>>>>>> only for char arrays and not char pointers.
>>>>>>
>>>>>> Yeah, but that's the fundamental problem: We're assuming that any 
>>>>>> zero is suitable for initializing any type except for a few 
>>>>>> exceptions, and adding more exceptions when we find a new testcase 
>>>>>> that breaks.
>>>>>>
>>>>>> Handling this in process_init_constructor_array avoids all these 
>>>>>> problems by looking at the initializers after they've been 
>>>>>> converted to the desired type, at which point it's much clearer 
>>>>>> whether they are zero or not; then we don't need 
>>>>>> type_initializer_zero_p because the initializer already has the 
>>>>>> proper type and for zero_init_p types we can just use 
>>>>>> initializer_zero_p.
>>>>>
>>>>> I've already expressed my concerns with that change but if you are
>>>>> comfortable with it I won't insist on waiting until GCC 11.  Your last
>>>>> request for that patch was to rework the second loop to avoid changing
>>>>> the counter of the previous loop.  The attached update does that.
>>>>>
>>>>> I also added another C++ 2a test to exercise a few more cases with
>>>>> pointers to members.  With it I ran into what looks like an unrelated
>>>>> bug in this area.  I opened PR 94568 for it, CC'd you, and xfailed
>>>>> the problem case in the new test.
>>>>>
>>>>>>
>>>>>> We do probably want some function that tests whether a particular 
>>>>>> initializer is equivalent to zero-initialization, which is either 
>>>>>> initializer_zero_p for zero_init_p types, !expr for pointers to 
>>>>>> members, and recursing for aggregates.  Maybe 
>>>>>> cp_initializer_zero_p or zero_init_expr_p?
>>>>>>
>>>>>>> It could be changed to return false for incompatible initializers
>>>>>>> like pointers (or even __null) for non-pointer types, even if they
>>>>>>> are zero, but that's not what it's designed to do.
>>>>>>
>>>>>> But that's exactly what we did for 90938.  Now you're proposing 
>>>>>> another small exception, only putting it in the caller instead.  I 
>>>>>> think we'll keep running into these problems until we fix the 
>>>>>> design issue.
>>>>>
>>>>> Somehow that felt different.  But I don't have a problem with moving
>>>>> the pointer check there as well.  It shouldn't be too much more
>>>>> intrusive than the original patch for this bug if you decide to
>>>>> go with it for now.
>>>>>
>>>>>>
>>>>>> It would also be possible to improve things by doing the 
>>>>>> conversion in type_initializer_zero_p before considering its 
>>>>>> zeroness, but that would again be duplicating work that we're 
>>>>>> already doing elsewhere.
>>>>>
>>>>> I agree that it's not worth the trouble given the long-term fix is
>>>>> in process_init_constructor_array.
>>>>>
>>>>> Attached is the updated patch with the process_init_constructor_array
>>>>> changes, retested on x86_64-linux.
>>>>
>>>>> +      if (!trunc_zero || !type_initializer_zero_p (eltype, 
>>>>> ce->value))
>>>>> +    last_nonzero = i;
>>>>
>>>> I think we can remove type_initializer_zero_p as well, and use 
>>>> initializer_zerop here.
>>>>
>>>>> +      if (last_nonzero < i - 1)
>>>>> +       {
>>>>> +         vec_safe_truncate (v, last_nonzero + 1);
>>>>
>>>> This looks like you will never truncate to length 0, which seems 
>>>> like a problem with last_nonzero being both unsigned and an index; 
>>>> perhaps it should be something like num_to_keep?
>>>
>>> This whole block appears to serve no real purpose.  It trims trailing
>>> zeros only from arithmetic types, but the trimming only matters for
>>> pointers to members and that's done later.  I've removed it.
>>
>> Why doesn't it matter for arithmetic types?  Because mangling omits 
>> trailing initializer_zerop elements, so the mangling is the same 
>> either way, and comparing class-type template arguments is based on 
>> mangling?
>>
>> In that case, why don't we do the same for pointers to members?
> 
> Your little patch seems to work for the problems we're fixing (and
> even fixes a subset of pr94568 that I mentioned), which is great.
> But it fails two tests:
> 
> !  FAIL: g++.dg/abi/mangle71.C (3: +3)
> !  FAIL: g++.dg/abi/mangle72.C (10: +10)
> 
> I don't think the mangling of these things is specified yet so maybe
> that's okay (I mostly guessed when I wrote those tests, and could
> have very well guessed wrong).  Here's one difference in mangle71.C:
> 
>    void f0__ (X<B{{ 0 }}>) { }
> 
> mangles as
> 
>    _Z4f0__1XIXtl1BtlA3_1AtlS1_Lc0EEtlS1_Lc1EEEEEE
> 
> by trunk but as
> 
>    _Z4f0__1XIXtl1BtlA3_1AtlS1_EtlS1_Lc1EEEEEE
> 
> with your patch.  I'd say the former is better but I'm not sure.

I agree.

> The changes in mangle72.C, for example for
> 
>    void g__ (Y<B{{ }}>) { }
>    void g0_ (Y<B{{ 0 }}>) { }
>    void g00 (Y<B{{ 0, 0 }}>) { }
> 
> from
> 
>    _Z3g__1YIXtl1BtlA2_M1AA2_iLS3_0EEEEE
>    _Z3g0_1YIXtl1BtlA2_M1AA2_iLS3_0EEEEE
>    _Z3g001YIXtl1BtlA2_M1AA2_iLS3_0EEEEE
> 
> to
> 
>    _Z3g__1YIXtl1BEEE
>    _Z3g0_1YIXtl1BEEE
>    _Z3g001YIXtl1BEEE
> 
> look like improvements, and the one for
> 
>    void g0x (Y<B{{ 0, &A::a }}>) { }
> 
> from
> 
>    _Z3g0x1YIXtl1BtlA2_M1AA2_iLS3_0EEEEE
> 
> to
> 
>    _Z3g0x1YIXtl1BtlA2_M1AA2_iLS3_0ELS3_0EEEEE
> 
> looks like it might actually fix a bug (does it?).  There are
> a few others for member pointers that are similar to the above.

Yes.

> If these mangling changes look okay to you I'm much more comfortable
> with a simple, targeted fix like this than with messing around with
> all of array initialization.

Agreed.

> Assuming the fix is correct, what bothers me is that it implies
> that all these problems have been caused by forgetting to consider
> pointers to members in the fix for pr89833 in r270155, and that all
> this time since then I've been barking up the wrong tree by patching
> up the wrong functions.  I.e., that none of the stripping of
> the trailing initializers outside of mangle.c is or ever was
> necessary.  Why did neither of us realize this a year ago?

Yeah, that is frustrating.

I fixed the mangle71.C issue by not doing any pruning for a non-trivial 
type.  I also fixed another bug whereby the first function with a 
particular pointer to member in its signature (i.e. g0x) got mangled 
properly, but the next did not.  And realized that our mangling of class 
non-type template args still needs a lot of work, but that will wait.

Any comments before I check this in?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 94510.diff
Type: text/x-patch
Size: 26015 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20200421/a3d3a0ed/attachment-0001.bin>


More information about the Gcc-patches mailing list