[PATCH] libstdc++: Ensure active union member is correctly set

Nathaniel Shead nathanieloshead@gmail.com
Fri Sep 29 16:29:06 GMT 2023


On Fri, Sep 29, 2023 at 04:06:33PM +0100, Jonathan Wakely wrote:
> On Fri, 29 Sept 2023 at 10:32, Jonathan Wakely <jwakely@redhat.com> wrote:
> > > Thanks for the comments, here's an updated version of the patch.
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu.
> >
> > Great, I'll get this committed today - thanks!
> 
> That's done now.
> 

Thanks!

> > >
> > > I'll note that there are some existing calls to `_M_use_local_data()`
> > > already used only for their side effects without a cast to void, e.g.
> > >
> > >       /**
> > >        *  @brief  Default constructor creates an empty string.
> > >        */
> > >       _GLIBCXX20_CONSTEXPR
> > >       basic_string()
> > >       _GLIBCXX_NOEXCEPT_IF(is_nothrow_default_constructible<_Alloc>::value)
> > >       : _M_dataplus(_M_local_data())
> > >       {
> > >         _M_use_local_data();
> > >         _M_set_length(0);
> > >       }
> > >
> > > I haven't updated these, but should this be changed for consistency?
> >
> > Yes, good idea. I can do that.
> 
> I started to do that, and decided it made more sense to split out the
> constexpr loop from _M_use_local_data() into a separate function,
> _M_init_local_buf(). Then we can use that for all the places where we
> don't care about the return value. That avoids the overhead of using
> pointer_traits::pointer_to when we don't need the return value (which
> is admittedly only going to be an issue for -O0 code, but I think it's
> cleaner this way anyway).
> 
> Please see the attached patch and let me know what you think.

I agree, and it also looks clearer to me what is happening.

> 
> > Thanks again for fixing these. I think this might fix some bug reports
> > about clang rejecting our std::string in constant expressions, so I'll
> > check those.
> 
> Your patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110900
> (so we should backport it to gcc-13 and gcc-12 too).

> commit 2668979d3206ff6c039ac0165aae29377a15666c
> Author: Jonathan Wakely <jwakely@redhat.com>
> Date:   Fri Sep 29 12:12:22 2023
> 
>     libstdc++: Split std::basic_string::_M_use_local_data into two functions
>     
>     This splits out the activate-the-union-member-for-constexpr logic from
>     _M_use_local_data, so that it can be used separately in cases that don't
>     need to use std::pointer_traits<pointer>::pointer_to to obtain the
>     return value.
>     
>     This leaves only three uses of _M_use_local_data() which are all the
>     same form:
>     
>       __s._M_data(_M_use_local_data());
>       __s._M_set_length(0);
>     
>     We could remove _M_use_local_data() and change those three places to use
>     a new _M_reset() function that does:
>     
>       _M_init_local_buf();
>       _M_data(_M_local_data());
>       _M_set_length(0);
>     
>     This is left for a future change.
>     
>     libstdc++-v3/ChangeLog:
>     
>             * include/bits/basic_string.h (_M_init_local_buf()): New
>             function.
>             (_M_use_local_data()): Use _M_init_local_buf.
>             (basic_string(), basic_string(const Alloc&))
>             (basic_string(basic_string&&))
>             (basic_string(basic_string&&, const Alloc&)): Use
>             _M_init_local_buf instead of _M_use_local_data().
>             * include/bits/basic_string.tcc (swap(basic_string&))
>             (_M_construct(InIter, InIter, forward_iterator_tag))
>             (_M_construct(size_type, CharT), reserve()): Likewise.
>             (_M_construct(InIter, InIter, input_iterator_tag)): Remove call
>             to _M_use_local_data() and initialize the local buffer directly.
> 
> diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
> index 4f94cd967cf..18a19b8dcbc 100644
> --- a/libstdc++-v3/include/bits/basic_string.h
> +++ b/libstdc++-v3/include/bits/basic_string.h
> @@ -353,13 +353,23 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
>        // Ensure that _M_local_buf is the active member of the union.
>        __attribute__((__always_inline__))
>        _GLIBCXX14_CONSTEXPR
> -      pointer
> -      _M_use_local_data() _GLIBCXX_NOEXCEPT
> +      void
> +      _M_init_local_buf() _GLIBCXX_NOEXCEPT
>        {
>  #if __cpp_lib_is_constant_evaluated
>  	if (std::is_constant_evaluated())
>  	  for (size_type __i = 0; __i <= _S_local_capacity; ++__i)
>  	    _M_local_buf[__i] = _CharT();
> +#endif
> +      }
> +
> +      __attribute__((__always_inline__))
> +      _GLIBCXX14_CONSTEXPR
> +      pointer
> +      _M_use_local_data() _GLIBCXX_NOEXCEPT
> +      {
> +#if __glibcxx_is_constant_evaluated
> +	_M_init_local_buf();
>  #endif
>  	return _M_local_data();
>        }

What's the difference between __cpp_lib_is_constant_evaluated and
__glibcxx_is_constant_evaluated? Should these lines be using the same
macro here?

> @@ -522,7 +532,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
>        _GLIBCXX_NOEXCEPT_IF(is_nothrow_default_constructible<_Alloc>::value)
>        : _M_dataplus(_M_local_data())
>        {
> -	_M_use_local_data();
> +	_M_init_local_buf();
>  	_M_set_length(0);
>        }
>  
> @@ -534,7 +544,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
>        basic_string(const _Alloc& __a) _GLIBCXX_NOEXCEPT
>        : _M_dataplus(_M_local_data(), __a)
>        {
> -	_M_use_local_data();
> +	_M_init_local_buf();
>  	_M_set_length(0);
>        }
>  
> @@ -678,7 +688,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
>        {
>  	if (__str._M_is_local())
>  	  {
> -	    (void)_M_use_local_data();
> +	    _M_init_local_buf();
>  	    traits_type::copy(_M_local_buf, __str._M_local_buf,
>  			      __str.length() + 1);
>  	  }
> @@ -718,7 +728,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
>        {
>  	if (__str._M_is_local())
>  	  {
> -	    (void)_M_use_local_data();
> +	    _M_init_local_buf();
>  	    traits_type::copy(_M_local_buf, __str._M_local_buf,
>  			      __str.length() + 1);
>  	    _M_length(__str.length());
> diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
> index 4bc98f2aea7..052eeb9e846 100644
> --- a/libstdc++-v3/include/bits/basic_string.tcc
> +++ b/libstdc++-v3/include/bits/basic_string.tcc
> @@ -79,7 +79,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  	      }
>  	    else if (__s.length())
>  	      {
> -		(void)_M_use_local_data();
> +		_M_init_local_buf();
>  		traits_type::copy(_M_local_buf, __s._M_local_buf,
>  				  __s.length() + 1);
>  		_M_length(__s.length());
> @@ -88,7 +88,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  	      }
>  	    else if (length())
>  	      {
> -		(void)__s._M_use_local_data();
> +		__s._M_init_local_buf();
>  		traits_type::copy(__s._M_local_buf, _M_local_buf,
>  				  length() + 1);
>  		__s._M_length(length());
> @@ -99,7 +99,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  	else
>  	  {
>  	    const size_type __tmp_capacity = __s._M_allocated_capacity;
> -	    (void)__s._M_use_local_data();
> +	    __s._M_init_local_buf();
>  	    traits_type::copy(__s._M_local_buf, _M_local_buf,
>  			      length() + 1);
>  	    _M_data(__s._M_data());
> @@ -111,7 +111,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  	  const size_type __tmp_capacity = _M_allocated_capacity;
>  	  if (__s._M_is_local())
>  	    {
> -	      (void)_M_use_local_data();
> +	      _M_init_local_buf();
>  	      traits_type::copy(_M_local_buf, __s._M_local_buf,
>  				__s.length() + 1);
>  	      __s._M_data(_M_data());
> @@ -174,14 +174,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  	size_type __len = 0;
>  	size_type __capacity = size_type(_S_local_capacity);
>  
> -	pointer __p = _M_use_local_data();
> -
>  	while (__beg != __end && __len < __capacity)
>  	  {
> -	    __p[__len++] = *__beg;
> +	    _M_local_buf[__len++] = *__beg;
>  	    ++__beg;
>  	  }
>  
> +#if __glibcxx_is_constant_evaluated
> +	if (std::is_constant_evaluated())
> +	  for (size_type __i = __len; __i <= __capacity; ++__i)
> +	    _M_local_buf[__i] = _CharT();
> +#endif
> +

I wonder if maybe this should still be a call to `_M_init_local_buf()`
above, where the `_M_use_local_data()` used to be? That way the logic
stays in one place, and I don't imagine the compile time savings of not
immediately overwriting the first __len characters would be significant.

>  	struct _Guard
>  	{
>  	  _GLIBCXX20_CONSTEXPR
> @@ -230,7 +234,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  	    _M_capacity(__dnew);
>  	  }
>  	else
> -	  _M_use_local_data();
> +	  _M_init_local_buf();
>  
>  	// Check for out_of_range and length_error exceptions.
>  	struct _Guard
> @@ -263,7 +267,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  	  _M_capacity(__n);
>  	}
>        else
> -	_M_use_local_data();
> +	_M_init_local_buf();
>  
>        if (__n)
>  	this->_S_assign(_M_data(), __n, __c);
> @@ -372,7 +376,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  
>        if (__length <= size_type(_S_local_capacity))
>  	{
> -	  this->_S_copy(_M_use_local_data(), _M_data(), __length + 1);
> +	  _M_init_local_buf();
> +	  this->_S_copy(_M_local_buf, _M_data(), __length + 1);
>  	  _M_destroy(__capacity);
>  	  _M_data(_M_local_data());
>  	}

But looks good to me; I tried applying this on top of my frontend patch
and it looks like everything still works correctly.



More information about the Gcc-patches mailing list