[PATCH v3] c++: Further tweaks for new-expression and paren-init [PR77841]

Jason Merrill jason@redhat.com
Wed Sep 9 21:43:13 GMT 2020


On 9/9/20 5:35 PM, Marek Polacek wrote:
> On Wed, Sep 09, 2020 at 05:02:24PM -0400, Jason Merrill wrote:
>> On 9/8/20 10:34 PM, Marek Polacek wrote:
>>> On Tue, Sep 08, 2020 at 04:19:42PM -0400, Jason Merrill wrote:
>>>> On 9/8/20 4:06 PM, Marek Polacek wrote:
>>>>> On Mon, Sep 07, 2020 at 11:19:47PM -0400, Jason Merrill wrote:
>>>>>> On 9/6/20 11:34 AM, Marek Polacek wrote:
>>>>>>> @@ -3944,9 +3935,9 @@ build_new (location_t loc, vec<tree, va_gc> **placement, tree type,
>>>>>>>          }
>>>>>>>        /* P1009: Array size deduction in new-expressions.  */
>>>>>>> -  if (TREE_CODE (type) == ARRAY_TYPE
>>>>>>> -      && !TYPE_DOMAIN (type)
>>>>>>> -      && *init)
>>>>>>> +  const bool deduce_array_p = (TREE_CODE (type) == ARRAY_TYPE
>>>>>>> +			       && !TYPE_DOMAIN (type));
>>>>>>> +  if (*init && (deduce_array_p || (nelts && cxx_dialect >= cxx20)))
>>>>>>
>>>>>> Looks like this won't handle new (char[4]), for which we also get an
>>>>>> ARRAY_TYPE.
>>>>>
>>>>> Good catch.  Fixed & paren-init37.C added.
>>>>>
>>>>>>>          {
>>>>>>>            /* This means we have 'new T[]()'.  */
>>>>>>>            if ((*init)->is_empty ())
>>>>>>> @@ -3955,16 +3946,20 @@ build_new (location_t loc, vec<tree, va_gc> **placement, tree type,
>>>>>>>      	  CONSTRUCTOR_IS_DIRECT_INIT (ctor) = true;
>>>>>>>      	  vec_safe_push (*init, ctor);
>>>>>>>      	}
>>>>>>> +      tree array_type = deduce_array_p ? TREE_TYPE (type) : type;
>>>>>>
>>>>>> I'd call this variable elt_type.
>>>>>
>>>>> Right, and it should be inside the block below.
>>>>>
>>>>>>>            tree &elt = (**init)[0];
>>>>>>>            /* The C++20 'new T[](e_0, ..., e_k)' case allowed by P0960.  */
>>>>>>>            if (!DIRECT_LIST_INIT_P (elt) && cxx_dialect >= cxx20)
>>>>>>>      	{
>>>>>>> -	  /* Handle new char[]("foo").  */
>>>>>>> +	  /* Handle new char[]("foo"): turn it into new char[]{"foo"}.  */
>>>>>>>      	  if (vec_safe_length (*init) == 1
>>>>>>> -	      && char_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (type)))
>>>>>>> +	      && char_type_p (TYPE_MAIN_VARIANT (array_type))
>>>>>>>      	      && TREE_CODE (tree_strip_any_location_wrapper (elt))
>>>>>>>      		 == STRING_CST)
>>>>>>> -	    /* Leave it alone: the string should not be wrapped in {}.  */;
>>>>>>> +	    {
>>>>>>> +	      elt = build_constructor_single (init_list_type_node, NULL_TREE, elt);
>>>>>>> +	      CONSTRUCTOR_IS_DIRECT_INIT (elt) = true;
>>>>>>> +	    }
>>>>>>>      	  else
>>>>>>>      	    {
>>>>>>>      	      tree ctor = build_constructor_from_vec (init_list_type_node, *init);
>>>>>>
>>>>>> With this change, doesn't the string special case produce the same result as
>>>>>> the general case?
>>>>>
>>>>> The problem is that reshape_init won't do anything for CONSTRUCTOR_IS_PAREN_INIT.
>>>>
>>>> Ah, yes, that flag is the difference.
>>>>
>>>>> So the reshape_init in build_new_1 wouldn't unwrap the outermost { } around
>>>>> a STRING_CST.
>>>>
>>>>> Perhaps reshape_init should be adjusted to do that unwrapping even when it gets
>>>>> a CONSTRUCTOR_IS_PAREN_INIT CONSTRUCTOR.  But I'm not sure if it should also do
>>>>> the reference_related_p unwrapping in reshape_init_r in that case.
>>>>
>>>> That would make sense to me.
>>>
>>> Done (but only for the outermost CONSTRUCTOR) in the below.  It allowed me to...
>>>
>>>>>>> @@ -3977,9 +3972,15 @@ build_new (location_t loc, vec<tree, va_gc> **placement, tree type,
>>>>>>>      	    }
>>>>>>>      	}
>>>>>>>            /* Otherwise we should have 'new T[]{e_0, ..., e_k}'.  */
>>>>>>> -      if (BRACE_ENCLOSED_INITIALIZER_P (elt))
>>>>>>> -	elt = reshape_init (type, elt, complain);
>>>>>>> -      cp_complete_array_type (&type, elt, /*do_default*/false);
>>>>>>> +      if (deduce_array_p)
>>>>>>> +	{
>>>>>>> +	  /* Don't reshape ELT itself: we want to pass a list-initializer to
>>>>>>> +	     build_new_1, even for STRING_CSTs.  */
>>>>>>> +	  tree e = elt;
>>>>>>> +	  if (BRACE_ENCLOSED_INITIALIZER_P (e))
>>>>>>> +	    e = reshape_init (type, e, complain);
>>>>>>
>>>>>> The comment is unclear; this call does reshape the CONSTRUCTOR ELT points
>>>>>> to, it just doesn't change ELT if the reshape call returns something else.
>>>>>
>>>>> Yea, I've amended the comment.
>>>>>
>>>>>> Why are we reshaping here, anyway?  Won't that lead to undesired brace
>>>>>> elision?
>>>>>
>>>>> We have to reshape before deducing the array, otherwise we could deduce the
>>>>> wrong number of elements when certain braces were omitted.  E.g. in
>>>>>
>>>>>      struct S { int x, y; };
>>>>>      new S[]{1, 2, 3, 4}; // braces elided, is { {1, 2}, {3, 4} }
>>>>
>>>> Ah, right, we also get here for initializers written with actual braces.
>>>>
>>>>> we want S[2], not S[4].  A way to test it would be
>>>>>
>>>>>      struct S { int x, y; };
>>>>>      S *p = new S[]{1, 2, 3, 4};
>>>>>
>>>>>      void* operator new (unsigned long int size)
>>>>>      {
>>>>>          if (size != sizeof (S) * 2)
>>>>> 	__builtin_abort ();
>>>>>          return __builtin_malloc (size);
>>>>>      }
>>>>>
>>>>>      int main () { }
>>>>>
>>>>> I can add that too, if you want.  (It'd be safer if cp_complete_array_type
>>>>> always reshaped but that's not trivial, as the original patch mentions.)
>>>>> ()-init-list wouldn't be reshaped because CONSTRUCTOR_IS_PAREN_INIT is set.
>>>>>
>>>>> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -- >8 --
>>>>> This patch corrects our handling of array new-expression with ()-init:
>>>>>
>>>>>      new int[4](1, 2, 3, 4);
>>>>>
>>>>> should work even with the explicit array bound, and
>>>>>
>>>>>      new char[3]("so_sad");
>>>>>
>>>>> should cause an error, but we weren't giving any.
>>>>>
>>>>> Fixed by handling array new-expressions with ()-init in the same spot
>>>>> where we deduce the array bound in array new-expression.  I'm now
>>>>> always passing STRING_CSTs to build_new_1 wrapped in { } which allowed
>>>>> me to remove the special handling of STRING_CSTs in build_new_1.  And
>>>>> since the DIRECT_LIST_INIT_P block in build_new_1 calls digest_init, we
>>>>> report errors about too short arrays.
>>>>>
>>>>> I took a stab at cp_complete_array_type's "FIXME: this code is duplicated
>>>>> from reshape_init", but calling reshape_init there, I ran into issues
>>>>> with has_designator_problem: when we reshape an already reshaped
>>>>> CONSTRUCTOR again, d.cur.index has been filled, so we think that we
>>>>> have a user-provided designator (though there was no designator in the
>>>>> source code), and report an error.
>>>>>
>>>>> gcc/cp/ChangeLog:
>>>>>
>>>>> 	PR c++/77841
>>>>> 	* init.c (build_new_1): Don't handle string-initializers here.
>>>>> 	(build_new): Handle new-expression with paren-init when the
>>>>> 	array bound is known.  Always pass string constants to build_new_1
>>>>> 	enclosed in braces.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>> 	PR c++/77841
>>>>> 	* g++.old-deja/g++.ext/arrnew2.C: Expect the error only in C++17
>>>>> 	and less.
>>>>> 	* g++.old-deja/g++.robertl/eb58.C: Adjust dg-error.
>>>>> 	* g++.old-deja/g++.robertl/eb63.C: Expect the error only in C++17
>>>>> 	and less.
>>>>> 	* g++.dg/cpp2a/paren-init36.C: New test.
>>>>> 	* g++.dg/cpp2a/paren-init37.C: New test.
>>>>> 	* g++.dg/pr84729.C: Adjust dg-error.
>>>>> ---
>>>>>     gcc/cp/init.c                                 | 41 +++++++++++--------
>>>>>     gcc/testsuite/g++.dg/cpp2a/paren-init36.C     | 14 +++++++
>>>>>     gcc/testsuite/g++.dg/cpp2a/paren-init37.C     | 14 +++++++
>>>>>     gcc/testsuite/g++.dg/pr84729.C                |  2 +-
>>>>>     gcc/testsuite/g++.old-deja/g++.ext/arrnew2.C  |  2 +-
>>>>>     gcc/testsuite/g++.old-deja/g++.robertl/eb58.C |  2 +-
>>>>>     gcc/testsuite/g++.old-deja/g++.robertl/eb63.C |  2 +-
>>>>>     7 files changed, 55 insertions(+), 22 deletions(-)
>>>>>     create mode 100644 gcc/testsuite/g++.dg/cpp2a/paren-init36.C
>>>>>     create mode 100644 gcc/testsuite/g++.dg/cpp2a/paren-init37.C
>>>>>
>>>>> diff --git a/gcc/cp/init.c b/gcc/cp/init.c
>>>>> index 3268ae4ad3f..fe4d49df402 100644
>>>>> --- a/gcc/cp/init.c
>>>>> +++ b/gcc/cp/init.c
>>>>> @@ -3596,15 +3596,6 @@ build_new_1 (vec<tree, va_gc> **placement, tree type, tree nelts,
>>>>>     		  vecinit = digest_init (arraytype, vecinit, complain);
>>>>>     		}
>>>>>     	    }
>>>>> -	  /* This handles code like new char[]{"foo"}.  */
>>>>> -	  else if (len == 1
>>>>> -		   && char_type_p (TYPE_MAIN_VARIANT (type))
>>>>> -		   && TREE_CODE (tree_strip_any_location_wrapper ((**init)[0]))
>>>>> -		      == STRING_CST)
>>>>> -	    {
>>>>> -	      vecinit = (**init)[0];
>>>>> -	      STRIP_ANY_LOCATION_WRAPPER (vecinit);
>>>>> -	    }
>>>>>     	  else if (*init)
>>>>>                 {
>>>>>                   if (complain & tf_error)
>>>>> @@ -3944,9 +3935,8 @@ build_new (location_t loc, vec<tree, va_gc> **placement, tree type,
>>>>>         }
>>>>>       /* P1009: Array size deduction in new-expressions.  */
>>>>> -  if (TREE_CODE (type) == ARRAY_TYPE
>>>>> -      && !TYPE_DOMAIN (type)
>>>>> -      && *init)
>>>>> +  const bool array_p = TREE_CODE (type) == ARRAY_TYPE;
>>>>> +  if (*init && (array_p || (nelts && cxx_dialect >= cxx20)))
>>>>>         {
>>>>>           /* This means we have 'new T[]()'.  */
>>>>>           if ((*init)->is_empty ())
>>>>> @@ -3959,12 +3949,16 @@ build_new (location_t loc, vec<tree, va_gc> **placement, tree type,
>>>>>           /* The C++20 'new T[](e_0, ..., e_k)' case allowed by P0960.  */
>>>>>           if (!DIRECT_LIST_INIT_P (elt) && cxx_dialect >= cxx20)
>>>>>     	{
>>>>> -	  /* Handle new char[]("foo").  */
>>>>> +	  tree elt_type = array_p ? TREE_TYPE (type) : type;
>>>>
>>>> I think this should condition on whether nelts is set, to handle e.g. new
>>>> char[2][2] properly.
>>>
>>> ...remove this code.
>>>
>>> I've added new-array5.C to test multidimensional arrays too.
>>>
>>> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>>>
>>> -- >8 --
>>> This patch corrects our handling of array new-expression with ()-init:
>>>
>>>     new int[4](1, 2, 3, 4);
>>>
>>> should work even with the explicit array bound, and
>>>
>>>     new char[3]("so_sad");
>>>
>>> should cause an error, but we weren't giving any.
>>>
>>> Fixed by handling array new-expressions with ()-init in the same spot
>>> where we deduce the array bound in array new-expression.  I'm now
>>> always passing STRING_CSTs to build_new_1 wrapped in { } which allowed
>>> me to remove the special handling of STRING_CSTs in build_new_1.  And
>>> since the DIRECT_LIST_INIT_P block in build_new_1 calls digest_init, we
>>> report errors about too short arrays. reshape_init now does the {"foo"}
>>> -> "foo" transformation even for CONSTRUCTOR_IS_PAREN_INIT, so no need
>>> to do it in build_new.
>>>
>>> I took a stab at cp_complete_array_type's "FIXME: this code is duplicated
>>> from reshape_init", but calling reshape_init there, I ran into issues
>>> with has_designator_problem: when we reshape an already reshaped
>>> CONSTRUCTOR again, d.cur.index has been filled, so we think that we
>>> have a user-provided designator (though there was no designator in the
>>> source code), and report an error.
>>>
>>> gcc/cp/ChangeLog:
>>>
>>> 	PR c++/77841
>>> 	* decl.c (reshape_init): If we're initializing a char array from
>>> 	a string-literal that is enclosed in braces, unwrap it.
>>> 	* init.c (build_new_1): Don't handle string-initializers here.
>>> 	(build_new): Handle new-expression with paren-init when the
>>> 	array bound is known.  Always pass string constants to build_new_1
>>> 	enclosed in braces.  Don't handle string-initializers in any
>>> 	special way.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 	PR c++/77841
>>> 	* g++.old-deja/g++.ext/arrnew2.C: Expect the error only in C++17
>>> 	and less.
>>> 	* g++.old-deja/g++.robertl/eb58.C: Adjust dg-error.
>>> 	* g++.old-deja/g++.robertl/eb63.C: Expect the error only in C++17
>>> 	and less.
>>> 	* g++.dg/cpp2a/new-array5.C: New test.
>>> 	* g++.dg/cpp2a/paren-init36.C: New test.
>>> 	* g++.dg/cpp2a/paren-init37.C: New test.
>>> 	* g++.dg/pr84729.C: Adjust dg-error.
>>> ---
>>>    gcc/cp/decl.c                                 | 12 ++++-
>>>    gcc/cp/init.c                                 | 54 ++++++++-----------
>>>    gcc/testsuite/g++.dg/cpp2a/new-array5.C       | 15 ++++++
>>>    gcc/testsuite/g++.dg/cpp2a/paren-init36.C     | 14 +++++
>>>    gcc/testsuite/g++.dg/cpp2a/paren-init37.C     | 14 +++++
>>>    gcc/testsuite/g++.dg/pr84729.C                |  2 +-
>>>    gcc/testsuite/g++.old-deja/g++.ext/arrnew2.C  |  2 +-
>>>    gcc/testsuite/g++.old-deja/g++.robertl/eb58.C |  2 +-
>>>    gcc/testsuite/g++.old-deja/g++.robertl/eb63.C |  2 +-
>>>    9 files changed, 81 insertions(+), 36 deletions(-)
>>>    create mode 100644 gcc/testsuite/g++.dg/cpp2a/new-array5.C
>>>    create mode 100644 gcc/testsuite/g++.dg/cpp2a/paren-init36.C
>>>    create mode 100644 gcc/testsuite/g++.dg/cpp2a/paren-init37.C
>>>
>>> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
>>> index 31d68745844..1287ce1efd1 100644
>>> --- a/gcc/cp/decl.c
>>> +++ b/gcc/cp/decl.c
>>> @@ -6599,7 +6599,17 @@ reshape_init (tree type, tree init, tsubst_flags_t complain)
>>>      /* Brace elision is not performed for a CONSTRUCTOR representing
>>>         parenthesized aggregate initialization.  */
>>>      if (CONSTRUCTOR_IS_PAREN_INIT (init))
>>> -    return init;
>>> +    {
>>> +      tree elt = (*v)[0].value;
>>> +      /* If we're initializing a char array from a string-literal that is
>>> +	 enclosed in braces, unwrap it here.  */
>>> +      if (TREE_CODE (type) == ARRAY_TYPE
>>> +	  && vec_safe_length (v) == 1
>>> +	  && char_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (type)))
>>> +	  && TREE_CODE (tree_strip_any_location_wrapper (elt)) == STRING_CST)
>>> +	return elt;
>>> +      return init;
>>
>> You decided against unwrapping a reference_related_p initializer here?
> 
> Yeah, it doesn't seem to be needed: e.g. in
> 
>    struct X { int i; };
>    struct Y : X { };
>    int main ()
>    {
>      Y y;
>      y.i = 42;
>      X x2(y);
>      __builtin_printf ("%d\n", x2.i);
>    }
> 
> we don't reshape_init at all, and build_aggr_init creates an INIT_EXPR
>    x2 = y.D.2087
> which is the same code we generate with e.g. GCC 8, or if we use x2{y}.
> 
> Also, P0960 says that any existing meaning of A(b) should not change.

The patch is OK.

Jason



More information about the Gcc-patches mailing list