[PATCH] reject scalar array initialization with nullptr [PR94510]
Jason Merrill
jason@redhat.com
Tue Apr 14 02:43:20 GMT 2020
On 4/12/20 5:49 PM, Martin Sebor wrote:
> On 4/10/20 8:52 AM, Jason Merrill wrote:
>> On 4/9/20 4:23 PM, Martin Sebor wrote:
>>> On 4/9/20 1:32 PM, Jason Merrill wrote:
>>>> On 4/9/20 3:24 PM, Martin Sebor wrote:
>>>>> On 4/9/20 1:03 PM, Jason Merrill wrote:
>>>>>> On 4/8/20 1:23 PM, Martin Sebor wrote:
>>>>>>> On 4/7/20 3:36 PM, Marek Polacek wrote:
>>>>>>>> On Tue, Apr 07, 2020 at 02:46:52PM -0600, Martin Sebor wrote:
>>>>>>>>> On 4/7/20 1:50 PM, Marek Polacek wrote:
>>>>>>>>>> On Tue, Apr 07, 2020 at 12:50:48PM -0600, Martin Sebor via
>>>>>>>>>> Gcc-patches wrote:
>>>>>>>>>>> Among the numerous regressions introduced by the change
>>>>>>>>>>> committed
>>>>>>>>>>> to GCC 9 to allow string literals as template arguments is a
>>>>>>>>>>> failure
>>>>>>>>>>> to recognize the C++ nullptr and GCC's __null constants as
>>>>>>>>>>> pointers.
>>>>>>>>>>> For one, I didn't realize that nullptr, being a null pointer
>>>>>>>>>>> constant,
>>>>>>>>>>> doesn't have a pointer type, and two, I didn't think of
>>>>>>>>>>> __null (which
>>>>>>>>>>> is a special integer constant that NULL sometimes expands to).
>>>>>>>>>>>
>>>>>>>>>>> The attached patch adjusts the special handling of trailing zero
>>>>>>>>>>> initializers in reshape_init_array_1 to recognize both kinds of
>>>>>>>>>>> constants and avoid treating them as zeros of the array integer
>>>>>>>>>>> element type. This restores the expected diagnostics when
>>>>>>>>>>> either
>>>>>>>>>>> constant is used in the initializer list.
>>>>>>>>>>>
>>>>>>>>>>> Martin
>>>>>>>>>>
>>>>>>>>>>> PR c++/94510 - nullptr_t implicitly cast to zero twice in
>>>>>>>>>>> std::array
>>>>>>>>>>>
>>>>>>>>>>> gcc/cp/ChangeLog:
>>>>>>>>>>>
>>>>>>>>>>> PR c++/94510
>>>>>>>>>>> * decl.c (reshape_init_array_1): Exclude mismatches with
>>>>>>>>>>> all kinds
>>>>>>>>>>> of pointers.
>>>>>>>>>>>
>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>
>>>>>>>>>>> PR c++/94510
>>>>>>>>>>> * g++.dg/init/array57.C: New test.
>>>>>>>>>>> * g++.dg/init/array58.C: New test.
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
>>>>>>>>>>> index a127734af69..692c8ed73f4 100644
>>>>>>>>>>> --- a/gcc/cp/decl.c
>>>>>>>>>>> +++ b/gcc/cp/decl.c
>>>>>>>>>>> @@ -6041,9 +6041,14 @@ reshape_init_array_1 (tree elt_type,
>>>>>>>>>>> tree max_index, reshape_iter *d,
>>>>>>>>>>> TREE_CONSTANT (new_init) = false;
>>>>>>>>>>> /* Pointers initialized to strings must be treated
>>>>>>>>>>> as non-zero
>>>>>>>>>>> - even if the string is empty. */
>>>>>>>>>>> + even if the string is empty. Handle all kinds of
>>>>>>>>>>> pointers,
>>>>>>>>>>> + including std::nullptr and GCC's __nullptr, neither of
>>>>>>>>>>> which
>>>>>>>>>>> + has a pointer type. */
>>>>>>>>>>> tree init_type = TREE_TYPE (elt_init);
>>>>>>>>>>> - if (POINTER_TYPE_P (elt_type) != POINTER_TYPE_P
>>>>>>>>>>> (init_type)
>>>>>>>>>>> + bool init_is_ptr = (POINTER_TYPE_P (init_type)
>>>>>>>>>>> + || NULLPTR_TYPE_P (init_type)
>>>>>>>>>>> + || null_node_p (elt_init));
>>>>>>>>>>> + if (POINTER_TYPE_P (elt_type) != init_is_ptr
>>>>>>>>>>> || !type_initializer_zero_p (elt_type, elt_init))
>>>>>>>>>>> last_nonzero = index;
>>>>>>>>>>
>>>>>>>>>> It looks like this still won't handle e.g. pointers to member
>>>>>>>>>> functions,
>>>>>>>>>> e.g.
>>>>>>>>>>
>>>>>>>>>> struct S { };
>>>>>>>>>> int arr[3] = { (void (S::*) ()) 0, 0, 0 };
>>>>>>>>>>
>>>>>>>>>> would still be accepted. You could use TYPE_PTR_OR_PTRMEM_P
>>>>>>>>>> instead of
>>>>>>>>>> POINTER_TYPE_P to catch this case.
>>>>>>>>>
>>>>>>>>> Good catch! That doesn't fail because unlike null data member
>>>>>>>>> pointers
>>>>>>>>> which are represented as -1, member function pointers are
>>>>>>>>> represented
>>>>>>>>> as a zero.
>>>>>>>>>
>>>>>>>>> I had looked for an API that would answer the question: "is this
>>>>>>>>> expression a pointer?" without having to think of all the
>>>>>>>>> different
>>>>>>>>> kinds of them but all I could find was null_node_p(). Is this
>>>>>>>>> a rare,
>>>>>>>>> isolated case that having an API like that wouldn't be worth
>>>>>>>>> having
>>>>>>>>> or should I add one like in the attached update?
>>>>>>>>>
>>>>>>>>> Martin
>>>>>>>>
>>>>>>>>> PR c++/94510 - nullptr_t implicitly cast to zero twice in
>>>>>>>>> std::array
>>>>>>>>>
>>>>>>>>> gcc/cp/ChangeLog:
>>>>>>>>>
>>>>>>>>> PR c++/94510
>>>>>>>>> * decl.c (reshape_init_array_1): Exclude mismatches with
>>>>>>>>> all kinds
>>>>>>>>> of pointers.
>>>>>>>>> * gcc/cp/cp-tree.h (null_pointer_constant_p): New function.
>>>>>>>>
>>>>>>>> (Drop the gcc/cp/.)
>>>>>>>>
>>>>>>>>> +/* Returns true if EXPR is a null pointer constant of any
>>>>>>>>> type. */
>>>>>>>>> +
>>>>>>>>> +inline bool
>>>>>>>>> +null_pointer_constant_p (tree expr)
>>>>>>>>> +{
>>>>>>>>> + STRIP_ANY_LOCATION_WRAPPER (expr);
>>>>>>>>> + if (expr == null_node)
>>>>>>>>> + return true;
>>>>>>>>> + tree type = TREE_TYPE (expr);
>>>>>>>>> + if (NULLPTR_TYPE_P (type))
>>>>>>>>> + return true;
>>>>>>>>> + if (POINTER_TYPE_P (type))
>>>>>>>>> + return integer_zerop (expr);
>>>>>>>>> + return null_member_pointer_value_p (expr);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>
>>>>>>>> We already have a null_ptr_cst_p so it would be sort of
>>>>>>>> confusing to have
>>>>>>>> this as well. But are you really interested in whether it's a
>>>>>>>> null pointer,
>>>>>>>> not just a pointer?
>>>>>>>
>>>>>>> The goal of the code is to detect a mismatch in "pointerness"
>>>>>>> between
>>>>>>> an initializer expression and the type of the initialized
>>>>>>> element, so
>>>>>>> it needs to know if the expression is a pointer (non-nulls pointers
>>>>>>> are detected in type_initializer_zero_p). That means testing a
>>>>>>> number
>>>>>>> of IMO unintuitive conditions:
>>>>>>>
>>>>>>> TYPE_PTR_OR_PTRMEM_P (TREE_TYPE (expr))
>>>>>>> || NULLPTR_TYPE_P (TREE_TYPE (expr))
>>>>>>> || null_node_p (expr)
>>>>>>>
>>>>>>> I don't know if this type of a query is common in the C++ FE but
>>>>>>> unless
>>>>>>> this is an isolated use case then besides fixing the bug I
>>>>>>> thought it
>>>>>>> would be nice to make it easier to get the test above right, or
>>>>>>> at least
>>>>>>> come close to it.
>>>>>>>
>>>>>>> Since null_pointer_constant_p already exists (but isn't suitable
>>>>>>> here
>>>>>>> because it returns true for plain literal zeros)
>>>>>>
>>>>>> Why is that unsuitable? A literal zero is a perfectly good
>>>>>> zero-initializer for a pointer.
>>>>>
>>>>> Right, that's why it's not suitable here. Because a literal zero
>>>>> is also not a pointer.
>>>>>
>>>>> The question the code asks is: "is the initializer expression
>>>>> a pointer (of any kind)?"
>>>>
>>>> Why is that a question we want to ask? What we need here is to know
>>>> whether the initializer expression is equivalent to implicit
>>>> zero-initialization. For initializing a pointer, a literal 0 is
>>>> equivalent, so we don't want to update last_nonzero.
>>>
>>> Yes, but that's not the bug we're fixing. The problem occurs with
>>> an integer array and a pointer initializer:
>>>
>>> int a[2] = { nullptr, 0 };
>>
>> Aha, you're fixing a different bug than the one I was seeing.
>
> What is that one? (I'm not aware of any others in this area.)
>
>>
>>> and with elt_type = TREE_TYPE (a) and init_type TREE_TYPE (nullptr)
>>> the test
>>>
>>> POINTER_TYPE_P (elt_type) != POINTER_TYPE_P (init_type)
>>>
>>> evaluates to false because neither type is a pointer type and
>>>
>>> type_initializer_zero_p (elt_type, elt_init)
>>>
>>> returns true because nullptr is zero, and so last_nonzero doesn't
>>> get set, the element gets trimmed, and the invalid initialization
>>> of int with nullptr isn't diagnosed.
>>>
>>> But I'm not sure if you're questioning the current code, the simple
>>> fix quoted above, or my assertion that null_pointer_constant_p would
>>> not be a suitable function to call to tell if an initializer is
>>> nullptr vs plain zero.
>>>
>>>> Also, why is the pointer check here rather than part of the
>>>> POINTER_TYPE_P handling in type_initializer_zero_p?
>>>
>>> type_initializer_zero_p is implemented in terms of initializer_zerop
>>> with the only difference that empty strings are considered to be zero
>>> only for char arrays and not char pointers.
>>
>> Yeah, but that's the fundamental problem: We're assuming that any zero
>> is suitable for initializing any type except for a few exceptions, and
>> adding more exceptions when we find a new testcase that breaks.
>>
>> Handling this in process_init_constructor_array avoids all these
>> problems by looking at the initializers after they've been converted
>> to the desired type, at which point it's much clearer whether they are
>> zero or not; then we don't need type_initializer_zero_p because the
>> initializer already has the proper type and for zero_init_p types we
>> can just use initializer_zero_p.
>
> I've already expressed my concerns with that change but if you are
> comfortable with it I won't insist on waiting until GCC 11. Your last
> request for that patch was to rework the second loop to avoid changing
> the counter of the previous loop. The attached update does that.
>
> I also added another C++ 2a test to exercise a few more cases with
> pointers to members. With it I ran into what looks like an unrelated
> bug in this area. I opened PR 94568 for it, CC'd you, and xfailed
> the problem case in the new test.
>
>>
>> We do probably want some function that tests whether a particular
>> initializer is equivalent to zero-initialization, which is either
>> initializer_zero_p for zero_init_p types, !expr for pointers to
>> members, and recursing for aggregates. Maybe cp_initializer_zero_p or
>> zero_init_expr_p?
>>
>>> It could be changed to return false for incompatible initializers
>>> like pointers (or even __null) for non-pointer types, even if they
>>> are zero, but that's not what it's designed to do.
>>
>> But that's exactly what we did for 90938. Now you're proposing
>> another small exception, only putting it in the caller instead. I
>> think we'll keep running into these problems until we fix the design
>> issue.
>
> Somehow that felt different. But I don't have a problem with moving
> the pointer check there as well. It shouldn't be too much more
> intrusive than the original patch for this bug if you decide to
> go with it for now.
>
>>
>> It would also be possible to improve things by doing the conversion in
>> type_initializer_zero_p before considering its zeroness, but that
>> would again be duplicating work that we're already doing elsewhere.
>
> I agree that it's not worth the trouble given the long-term fix is
> in process_init_constructor_array.
>
> Attached is the updated patch with the process_init_constructor_array
> changes, retested on x86_64-linux.
> + if (!trunc_zero || !type_initializer_zero_p (eltype, ce->value))
> + last_nonzero = i;
I think we can remove type_initializer_zero_p as well, and use
initializer_zerop here.
> + if (last_nonzero < i - 1)
> + {
> + vec_safe_truncate (v, last_nonzero + 1);
This looks like you will never truncate to length 0, which seems like a
problem with last_nonzero being both unsigned and an index; perhaps it
should be something like num_to_keep?
> + len = i = vec_safe_length (v);
> + }
Nitpick: It seems you don't need to update len or i since you're about
to return.
> - else if (TREE_CODE (next) == CONSTRUCTOR
> - && CONSTRUCTOR_PLACEHOLDER_BOUNDARY (next))
> - {
> - /* As above. */
> - CONSTRUCTOR_PLACEHOLDER_BOUNDARY (next) = 0;
> - CONSTRUCTOR_PLACEHOLDER_BOUNDARY (init) = 1;
> - }
This is from the recent fix for 90996, we want to keep it.
> /* Set to the index of the last initializer element whose value
> (either as a single expression or as a repeating range) to
> append to the CONSTRUCTOR. */
> unsigned HOST_WIDE_INT last_to_append = i;
OK, I was wrong, let's go back to modifying i instead of introducing
this variable.
> if (next)
> {
> picflags |= picflag_from_initializer (next);
> if (initializer_constant_valid_p (next, TREE_TYPE (next))
> == null_pointer_node)
> {
> /* If the last distinct explicit initializer value is
> the same as the implicit initializer NEXT built above,
> include the former in the range built below. */
> if (i && next == (*v)[last_distinct].value)
> last_to_append = last_distinct;
>
> break;
> }
>
> CONSTRUCTOR_APPEND_ELT (v, size_int (i), next);
> last_distinct = i;
> }
> else
> {
> /* Don't bother checking all the other elements and avoid
> appending to the initializer list below. */
> last_distinct = i;
> last_to_append = i + 1;
> break;
> }
> if (last_distinct < last_to_append - 1)
Could this be
if (len > i+1 && next)
instead, and never look at last_distinct again? If next is the same as
the last_distinct value, we will have adjusted i to be last_distinct;
otherwise, we still want to start the range with i.
Jason
More information about the Gcc-patches
mailing list