Bug 93059 - char and char8_t does not talk with each other with memcpy. std::copy std::copy_n, std::fill, std::fill_n, std::uninitialized_copy std::uninitialized_copy_n, std::fill, std::uninitialized_fill_n fails to convert to memxxx functions
Summary: char and char8_t does not talk with each other with memcpy. std::copy std::co...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 10.0
: P3 normal
Target Milestone: 15.0
Assignee: Jonathan Wakely
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2019-12-23 22:10 UTC by fdlbxtqi
Modified: 2025-11-30 06:15 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-02-26 00:00:00


Attachments
An untested patch (3.32 KB, patch)
2019-12-29 14:53 UTC, fdlbxtqi
Details | Diff
forgot to_address (482 bytes, patch)
2019-12-29 15:08 UTC, fdlbxtqi
Details | Diff
Testsuite (244.93 KB, application/x-7z-compressed)
2019-12-30 19:12 UTC, fdlbxtqi
Details
Here is my stl_algobase.h after patch. You can try it directly. (11.11 KB, text/plain)
2019-12-30 19:23 UTC, fdlbxtqi
Details
copy_backward bug fixed for the last patch (3.23 KB, patch)
2019-12-31 06:03 UTC, fdlbxtqi
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description fdlbxtqi 2019-12-23 22:10:53 UTC
https://godbolt.org/z/SPktTz

All these functions should generate exactly the same assembly but they do not. GCC does not treat char and char8_t the same because libstdc++ does not do this check. I did my native manually fix and it works. (does not do mul_overflow something)

//g++ -S copy.cc -Ofast -std=c++2a
#include<cstring>
#include<algorithm>
#include<array>
#include<concepts>
#include<iterator>

auto copy_char8_t_array(char* out,std::array<char8_t,2> const& bits)
{
	return std::copy_n(bits.data(),bits.size(),out);
}


auto memcpy_char8_t_array(char* out,std::array<char8_t,2> const& bits)
{
	std::memcpy(out,bits.data(),bits.size());
	return bits.size();
}


auto copy_char_array(char* out,std::array<char,2> const& bits)
{
	return std::copy_n(bits.data(),bits.size(),out);
}


auto memcpy_char_array(char* out,std::array<char,2> const& bits)
{
	std::memcpy(out,bits.data(),bits.size());
	return bits.size();
}

auto copy_char_array_chars(char8_t* out,std::array<char8_t,2> const& bits)
{
	return std::copy_n(bits.data(),bits.size(),out);
}


auto memcpy_char_array_array_chars(char8_t* out,std::array<char8_t,2> const& bits)
{
	std::memcpy(out,bits.data(),bits.size());
	return bits.size();
}


auto copy_char_array(char8_t* out,std::array<char,2> const& bits)
{
	return std::copy_n(bits.data(),bits.size(),out);
}


auto memcpy_char_array(char8_t* out,std::array<char,2> const& bits)
{
	std::memcpy(out,bits.data(),bits.size());
	return bits.size();
}
template<std::input_iterator input_iter,std::input_iterator output_iter>
inline constexpr output_iter my_copy_n(input_iter first,std::size_t count,output_iter result)
{
	if constexpr(std::contiguous_iterator<input_iter>&&
		std::contiguous_iterator<output_iter>&&
		std::is_trivially_copyable_v<typename std::iterator_traits<input_iter>::value_type>&&
		std::is_trivially_copyable_v<typename std::iterator_traits<output_iter>::value_type>)
	{
		if constexpr(sizeof(std::is_trivially_copyable_v<typename std::iterator_traits<input_iter>::value_type>)
		==sizeof(std::is_trivially_copyable_v<typename std::iterator_traits<output_iter>::value_type>))
		{
			memcpy(std::to_address(result),std::to_address(first),
				sizeof(typename std::iterator_traits<input_iter>::value_type)*count);
			return result+count;
		}
	}
	return std::copy_n(first,count,result);
}

auto my_copy_char_array(char8_t* out,std::array<char,2> const& bits)
{
	return my_copy_n(bits.data(),bits.size(),out);
}
auto my_copy_char_array(char* out,std::array<char8_t,2> const& bits)
{
	return my_copy_n(bits.data(),bits.size(),out);
}


auto uninit_copy_char_array(char8_t* out,std::array<char,2> const& bits)
{
	return std::uninitialized_copy_n(bits.data(),bits.size(),out);
}




_Z29memcpy_char_array_array_charsPDuRKSt5arrayIDuLm2EE:
        movzwl  (%rsi), %eax
        movw    %ax, (%rdi)
        movl    $2, %eax
        ret
std::copy_n generates more assembly than it should
_Z15copy_char_arrayPDuRKSt5arrayIcLm2EE:
        movzbl  (%rsi), %eax
        movb    %al, (%rdi)
        movzbl  1(%rsi), %eax
        movb    %al, 1(%rdi)
        leaq    2(%rdi), %rax
        ret
Comment 1 fdlbxtqi 2019-12-24 00:28:37 UTC
I am going to rewrite these functions by C++20 concepts + if constexpr for C++20 for you, Jwakely. I do not believe these enable-if/ overloading functions would not be a problem.
Comment 2 fdlbxtqi 2019-12-24 03:05:30 UTC
Also find a bug of __memmove

  /*
   * A constexpr wrapper for __builtin_memmove.
   * @param __num The number of elements of type _Tp (not bytes).
   */
  template<bool _IsMove, typename _Tp>
    _GLIBCXX14_CONSTEXPR
    inline void*
    __memmove(_Tp* __dst, const _Tp* __src, size_t __num)
    {
#ifdef __cpp_lib_is_constant_evaluated
      if (std::is_constant_evaluated())
	{
	  for(; __num > 0; --__num)
	    {
	      if constexpr (_IsMove)
		*__dst = std::move(*__src);
	      else
		*__dst = *__src;
	      ++__src;
	      ++__dst;
	    }
	  return __dst;
	}
      else
#endif
	return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
      return __dst;
    }

The last 2nd line return __dst is wrong. It should not exist.
Comment 3 fdlbxtqi 2019-12-24 04:13:04 UTC
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/stl_algobase.h

I have found out the problem.

1. libstdc++ does not use memmove for different trivially copyable objects. It only uses it for the same type which is clearly incorrect.

The performance lost is HUGE in my benchmarks for losing at least 50% of performance in some critical path.

https://godbolt.org/z/ouNiHn

For vector, the performance lost is even worse. It generates 96 lines of assembly for different types since it does not call memmove.

2. memmove should not be used for different types unless the source and dest have char*, char const* and std::byte* for trivially copyable types. It should call memcpy because of the strict-aliasing rule. However, I do not know whether it is possible to detect the strict-aliasing context for libstdc++ in GCC. It should add some magic here to make it faster.
Comment 4 fdlbxtqi 2019-12-24 04:53:24 UTC
A demo fix would be like this i think:

template<std::input_iterator input_iter,std::input_iterator output_iter>
inline constexpr output_iter my_copy_n(input_iter first,std::size_t count,output_iter result)
{
    using input_value_type = typename std::iterator_traits<input_iter>::value_type;
    using output_value_type = typename std::iterator_traits<output_iter>::value_type;
	if constexpr
	(std::contiguous_iterator<input_iter>&&
	std::contiguous_iterator<output_iter>&&
	std::is_trivially_copyable_v<input_value_type>&&
	std::is_trivially_copyable_v<output_value_type>&&
	(std::same_as<input_value_type,output_value_type>||(std::integral<input_value_type>&&
	std::integral<output_value_type>&&sizeof(std::is_trivially_copyable_v<input_value_type>)==sizeof(std::is_trivially_copyable_v<output_value_type>))))
	{
        if (std::is_constant_evaluated())
        {
            for(;count--;)
            {
                *result=*first;
                ++first;
                ++result;
            }
        }
        else
            __builtin_memmove(std::to_address(result),std::to_address(first),
                sizeof(typename std::iterator_traits<input_iter>::value_type)*count);
		return result+count;
	}
	return std::copy_n(first,count,result);
}
Comment 5 Marc Glisse 2019-12-24 09:54:12 UTC
We could indeed relax a bit the "same type" condition. We could also make sure that __restrict appears somewhere in the call chain when using copy or uninitialized_*, which lets the compiler merge the 2 loads/stores.

(In reply to fdlbxtqi from comment #2)
> The last 2nd line return __dst is wrong. It should not exist.

Write a patch, test it, and send it to the mailing list?

(In reply to fdlbxtqi from comment #3)
> For vector, the performance lost is even worse. It generates 96 lines of
> assembly for different types since it does not call memmove.

What operation are you doing on vector? None of your testcases seem to use it.

> clearly incorrect

Please distinguish between what is wrong (generated code crashes, or returns 3 instead of 2), and what is suboptimal.
Comment 6 fdlbxtqi 2019-12-24 09:57:01 UTC
> What operation are you doing on vector? None of your testcases seem to use
> it.


void copy_char_vector_with_iter(std::vector<char8_t>::iterator out,std::vector<char> const& bits)
{
	std::copy_n(bits.begin(),bits.size(),out);
}

https://godbolt.org/z/_yA_Ls
See the assembly

> > clearly incorrect
> 
> Please distinguish between what is wrong (generated code crashes, or returns
> 3 instead of 2), and what is suboptimal.

Suppose #ifdef __cpp_lib_is_constant_evaluated is not defined (for C++17)

It becomes:
        return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
      return __dst;

?????
Comment 7 Marc Glisse 2019-12-24 10:01:36 UTC
(In reply to fdlbxtqi from comment #6)
> > > clearly incorrect
> > 
> > Please distinguish between what is wrong (generated code crashes, or returns
> > 3 instead of 2), and what is suboptimal.
> 
> Suppose #ifdef __cpp_lib_is_constant_evaluated is not defined (for C++17)
> 
> It becomes:
>         return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
>       return __dst;
> 
> ?????

I should have kept the whole quote:

> 1. libstdc++ does not use memmove for different trivially copyable objects. It only uses it for the same type which is clearly incorrect.
Comment 8 Marc Glisse 2019-12-24 10:40:04 UTC
(In reply to fdlbxtqi from comment #6)
> void copy_char_vector_with_iter(std::vector<char8_t>::iterator
> out,std::vector<char> const& bits)
> {
> 	std::copy_n(bits.begin(),bits.size(),out);
> }
> 
> https://godbolt.org/z/_yA_Ls
> See the assembly

Thanks, I just filed PR 93063 about a subpart that the compiler should be able to do without help from the library.
Comment 9 fdlbxtqi 2019-12-24 23:00:57 UTC
(In reply to Marc Glisse from comment #8)
> (In reply to fdlbxtqi from comment #6)
> > void copy_char_vector_with_iter(std::vector<char8_t>::iterator
> > out,std::vector<char> const& bits)
> > {
> > 	std::copy_n(bits.begin(),bits.size(),out);
> > }
> > 
> > https://godbolt.org/z/_yA_Ls
> > See the assembly
> 
> Thanks, I just filed PR 93063 about a subpart that the compiler should be
> able to do without help from the library.

I have tried the same benchmark on clang. It has the same issue. However, GCC's optimization of this would inspire Clang I believe.
Comment 10 fdlbxtqi 2019-12-24 23:01:42 UTC
(In reply to fdlbxtqi from comment #9)
> (In reply to Marc Glisse from comment #8)
> > (In reply to fdlbxtqi from comment #6)
> > > void copy_char_vector_with_iter(std::vector<char8_t>::iterator
> > > out,std::vector<char> const& bits)
> > > {
> > > 	std::copy_n(bits.begin(),bits.size(),out);
> > > }
> > > 
> > > https://godbolt.org/z/_yA_Ls
> > > See the assembly
> > 
> > Thanks, I just filed PR 93063 about a subpart that the compiler should be
> > able to do without help from the library.
> 
> I have tried the same benchmark on clang. It has the same issue. However,
> GCC's optimization of this would inspire Clang I believe.

and libc++
Comment 11 fdlbxtqi 2019-12-29 03:32:56 UTC
(In reply to Marc Glisse from comment #8)
> (In reply to fdlbxtqi from comment #6)
> > void copy_char_vector_with_iter(std::vector<char8_t>::iterator
> > out,std::vector<char> const& bits)
> > {
> > 	std::copy_n(bits.begin(),bits.size(),out);
> > }
> > 
> > https://godbolt.org/z/_yA_Ls
> > See the assembly
> 
> Thanks, I just filed PR 93063 about a subpart that the compiler should be
> able to do without help from the library.

TBH. I would rather see the library does the optimization instead of the compiler. I do not trust the compiler can always optimize this stuff.
Comment 12 fdlbxtqi 2019-12-29 04:37:13 UTC
(In reply to Marc Glisse from comment #8)
> (In reply to fdlbxtqi from comment #6)
> > void copy_char_vector_with_iter(std::vector<char8_t>::iterator
> > out,std::vector<char> const& bits)
> > {
> > 	std::copy_n(bits.begin(),bits.size(),out);
> > }
> > 
> > https://godbolt.org/z/_yA_Ls
> > See the assembly
> 
> Thanks, I just filed PR 93063 about a subpart that the compiler should be
> able to do without help from the library.

I think the implementation is still problematic since C++ 20 adds the contiguous_iterator concept. The current implementation looks not to allow custom 
 contiguous iterators types to be optimized as memmove.

It needs certain kinds of revamping IMO.
Comment 13 Marc Glisse 2019-12-29 08:22:16 UTC
(In reply to fdlbxtqi from comment #11)
> TBH. I would rather see the library does the optimization instead of the
> compiler. I do not trust the compiler can always optimize this stuff.

If we have both, that looks even safer ;-)
Comment 14 fdlbxtqi 2019-12-29 08:30:23 UTC
I think It is worth the effort to rewrite these functions since they are so fundamental to the performance of entire C++. What I am worry about is that whether revamping these functions would be a new ABI breaking.

If it won’t, I would like to contribute my implementation.

Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10

From: glisse at gcc dot gnu.org<mailto:gcc-bugzilla@gcc.gnu.org>
Sent: Sunday, December 29, 2019 03:22
To: euloanty@live.com<mailto:euloanty@live.com>
Subject: [Bug libstdc++/93059] char and char8_t does not talk with each other with memcpy. std::copy std::copy_n, std::fill, std::fill_n, std::uninitialized_copy std::uninitialized_copy_n, std::fill, std::uninitialized_fill_n fails to convert to memxxx functions

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93059

--- Comment #13 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to fdlbxtqi from comment #11)
> TBH. I would rather see the library does the optimization instead of the
> compiler. I do not trust the compiler can always optimize this stuff.

If we have both, that looks even safer ;-)

--
You are receiving this mail because:
You are on the CC list for the bug.
You reported the bug.
Comment 15 fdlbxtqi 2019-12-29 08:33:58 UTC
(In reply to fdlbxtqi from comment #14)
> I think It is worth the effort to rewrite these functions since they are so
> fundamental to the performance of entire C++. What I am worry about is that
> whether revamping these functions would be a new ABI breaking.
> 
> If it won’t, I would like to contribute my implementation.
> 
> Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10
> 
> From: glisse at gcc dot gnu.org<mailto:gcc-bugzilla@gcc.gnu.org>
> Sent: Sunday, December 29, 2019 03:22
> To: euloanty@live.com<mailto:euloanty@live.com>
> Subject: [Bug libstdc++/93059] char and char8_t does not talk with each
> other with memcpy. std::copy std::copy_n, std::fill, std::fill_n,
> std::uninitialized_copy std::uninitialized_copy_n, std::fill,
> std::uninitialized_fill_n fails to convert to memxxx functions
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93059
> 
> --- Comment #13 from Marc Glisse <glisse at gcc dot gnu.org> ---
> (In reply to fdlbxtqi from comment #11)
> > TBH. I would rather see the library does the optimization instead of the
> > compiler. I do not trust the compiler can always optimize this stuff.
> 
> If we have both, that looks even safer ;-)
> 
> --
> You are receiving this mail because:
> You are on the CC list for the bug.
> You reported the bug.

I think It is worth the effort to rewrite these functions since they are so fundamental to the performance of entire C++. Revamping them would create 100% of performance benefits in a lot of important benchmarks. What I am worried about is that whether revamping these functions would be a new wave of ABI breaking.
Comment 16 fdlbxtqi 2019-12-29 08:44:13 UTC
(In reply to Marc Glisse from comment #13)
> (In reply to fdlbxtqi from comment #11)
> > TBH. I would rather see the library does the optimization instead of the
> > compiler. I do not trust the compiler can always optimize this stuff.
> 
> If we have both, that looks even safer ;-)

Concepts are no doubt a huge deal breaker.
Comment 17 Marc Glisse 2019-12-29 10:21:05 UTC
(In reply to fdlbxtqi from comment #15)
> What I am worried about is that whether revamping these functions would be a new wave of ABI breaking.

I don't foresee any ABI issue here. Do make sure your code doesn't break with -std=c++03, and run the testsuite before submitting it.
Comment 18 fdlbxtqi 2019-12-29 12:54:28 UTC
(In reply to Marc Glisse from comment #17)
> (In reply to fdlbxtqi from comment #15)
> > What I am worried about is that whether revamping these functions would be a new wave of ABI breaking.
> 
> I don't foresee any ABI issue here. Do make sure your code doesn't break
> with -std=c++03, and run the testsuite before submitting it.

Also, I have found another bug??? in both libstdc++ and libc++????

memmove is not allowed for using volatile. Only MSVC passes the test.

libstdc++&libc++ are wrong.

https://godbolt.org/z/FoTCbk

I will fix this as well.
Comment 19 fdlbxtqi 2019-12-29 14:53:52 UTC
Created attachment 47559 [details]
An untested patch

From 1dfd714e1f29e229d69a0c7f6f84bf05dd4ee85d Mon Sep 17 00:00:00 2001
From: expnkx <unlvsur@live.com>
Date: Sun, 29 Dec 2019 09:49:19 -0500
Subject: [PATCH] Untested patch fix volatile bug of std::copyXXX and
 std::uninitialized_copyXXX Support custom contiguous_iterator Fix undefined
 behavior of &*end constexpr std::fill Greatly improve performance of
 std::copyXXX and std::uninitialized_copyXX for different types and apply
 pipeline optimization to reduce branch misprediction
Comment 20 fdlbxtqi 2019-12-29 15:06:40 UTC
Comment on attachment 47559 [details]
An untested patch

>From 1dfd714e1f29e229d69a0c7f6f84bf05dd4ee85d Mon Sep 17 00:00:00 2001
>From: expnkx <unlvsur@live.com>
>Date: Sun, 29 Dec 2019 09:49:19 -0500
>Subject: [PATCH] Untested patch fix volatile bug of std::copyXXX and
> std::uninitialized_copyXXX Support custom contiguous_iterator Fix undefined
> behavior of &*end constexpr std::fill Greatly improve performance of
> std::copyXXX and std::uninitialized_copyXX for different types and apply
> pipeline optimization to reduce branch misprediction
>
>---
> libstdc++-v3/include/bits/stl_algobase.h | 170 ++++++++++++++++++++---
> 1 file changed, 148 insertions(+), 22 deletions(-)
>
>diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
>index 40d056ae8d5..01672a8f232 100644
>--- a/libstdc++-v3/include/bits/stl_algobase.h
>+++ b/libstdc++-v3/include/bits/stl_algobase.h
>@@ -1,6 +1,6 @@
> // Core algorithmic facilities -*- C++ -*-
> 
>-// Copyright (C) 2001-2019 Free Software Foundation, Inc.
>+// Copyright (C) 2001-2020 Free Software Foundation, Inc.
> //
> // This file is part of the GNU ISO C++ Library.  This library is free
> // software; you can redistribute it and/or modify it under the
>@@ -84,10 +84,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>    * A constexpr wrapper for __builtin_memmove.
>    * @param __num The number of elements of type _Tp (not bytes).
>    */
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove, typename _Tp>
>+  _GLIBCXX14_CONSTEXPR
>+    inline void volatile*
>+    __memmove(_Tp volatile* __dst, _Tp volatile const* __src, size_t __num)
>+    {
>+	  for(; __num > 0; --__num)
>+	    {
>+#if __cplusplus >= 201103L
>+	      if constexpr (_IsMove)
>+		*__dst = std::move(*__src);
>+	      else
>+#endif
>+		*__dst = *__src;
>+	      ++__src;
>+	      ++__dst;
>+	    }
>+	  return __dst;
>+    }
>+#endif
>+  template<bool _IsMove, typename _Tp,typename _Tp_src>
>+/*
>+#ifdef __cpp_lib_concepts
>+    requires (is_trivially_copyable_v<_Tp>&&!is_volatile_v<_Tp>
>+      &&is_trivially_copyable_v<_Tp_src>&&!is_volatile_v<_Tp_src>
>+      &&(same_as<_Tp,_Tp_src>||(sizeof(_Tp)==sizeof(_Tp_src)&&integral<_Tp>&&integral<_Tp_src>)))
>+#endif
>+*/
>     _GLIBCXX14_CONSTEXPR
>     inline void*
>-    __memmove(_Tp* __dst, const _Tp* __src, size_t __num)
>+    __memmove(_Tp* __dst, _Tp_src const* __src, size_t __num)
>     {
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
>@@ -106,9 +133,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>       else
> #endif
> 	return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
>-      return __dst;
>     }
>-
>   /*
>    * A constexpr wrapper for __builtin_memcmp.
>    * @param __num The number of elements of type _Tp (not bytes).
>@@ -431,7 +456,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 	}
>     };
> #endif
>-
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove>
>     struct __copy_move<_IsMove, true, random_access_iterator_tag>
>     {
>@@ -448,12 +473,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 	  static_assert( __assignable::type::value, "type is not assignable" );
> #endif
> 	  const ptrdiff_t _Num = __last - __first;
>-	  if (_Num)
>-	    std::__memmove<_IsMove>(__result, __first, _Num);
>+	  if (!_Num)
>+  	  return __result + _Num;
>+//since before C++20, we do not have [[likely]] attribute. We need to do it manually
>+    std::__memmove<_IsMove>(__result, __first, _Num);
> 	  return __result + _Num;
> 	}
>     };
>-
>+#endif
>   // Helpers for streambuf iterators (either istream or ostream).
>   // NB: avoid including <iosfwd>, relatively large.
>   template<typename _CharT>
>@@ -491,11 +518,41 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>       typedef typename iterator_traits<_II>::value_type _ValueTypeI;
>       typedef typename iterator_traits<_OI>::value_type _ValueTypeO;
>       typedef typename iterator_traits<_II>::iterator_category _Category;
>+#ifdef __cpp_lib_concepts
>+      if constexpr(
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__first)>>)&&
>+          is_trivially_copyable_v<_ValueTypeI>)&&
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__result)>>)&&
>+          is_trivially_copyable_v<_ValueTypeO>)&&
>+          contiguous_iterator<_II>&&
>+          contiguous_iterator<_OI>&&
>+          (same_as<_ValueTypeI,_ValueTypeO>
>+          ||(integral<_ValueTypeI>&&integral<_ValueTypeO>&&
>+              sizeof(_ValueTypeI)==sizeof(_ValueTypeO))))
>+      {
>+        using __assignable = conditional<_IsMove,
>+                 is_move_assignable<_OI>,
>+                is_copy_assignable<_OI>>;
>+        static_assert( __assignable::type::value, "result type is not assignable" );
>+        ptrdiff_t const _Num = __last - __first;
>+        if (_Num)
>+#ifdef __has_cpp_attribute(likely)
>+//This if statement is by default bad since it affects CPU pipeline.
>+//needs likely attribute or the branch misprediction penalty (100% misprediction rate probably) is HUGE
>+        [[likely]]
>+#endif
>+          std::__memmove<_IsMove>(to_address(__result), to_address(__first), _Num);
>+        return __result + _Num;
>+      }
>+      else
>+      return std::__copy_move<_IsMove, false,
>+#else
>       const bool __simple = (__is_trivially_copyable(_ValueTypeI)
> 			     && __is_pointer<_II>::__value
> 			     && __is_pointer<_OI>::__value
> 			     && __are_same<_ValueTypeI, _ValueTypeO>::__value);
>       return std::__copy_move<_IsMove, __simple,
>+#endif
> 			      _Category>::__copy_m(__first, __last, __result);
>     }
> 
>@@ -581,6 +638,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>    *  Note that the end of the output range is permitted to be contained
>    *  within [first,last).
>   */
>+
>   template<typename _II, typename _OI>
>     _GLIBCXX20_CONSTEXPR
>     inline _OI
>@@ -595,7 +653,6 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>       return std::__copy_move_a<__is_move_iterator<_II>::__value>
> 	     (std::__miter_base(__first), std::__miter_base(__last), __result);
>     }
>-
> #if __cplusplus >= 201103L
>   /**
>    *  @brief Moves the range [first,last) into result.
>@@ -698,6 +755,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     };
> #endif
> 
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove>
>     struct __copy_move_backward<_IsMove, true, random_access_iterator_tag>
>     {
>@@ -714,11 +772,13 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	  static_assert( __assignable::type::value, "type is not assignable" );
> #endif
> 	  const ptrdiff_t _Num = __last - __first;
>-	  if (_Num)
>-	    std::__memmove<_IsMove>(__result - _Num, __first, _Num);
>+	  if (!_Num)
>+      return __result - _Num;
>+    std::__memmove<_IsMove>(__result - _Num, __first, _Num);
> 	  return __result - _Num;
> 	}
>     };
>+#endif
> 
>   template<bool _IsMove, typename _BI1, typename _BI2>
>     _GLIBCXX20_CONSTEXPR
>@@ -728,21 +788,56 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>       typedef typename iterator_traits<_BI1>::value_type _ValueType1;
>       typedef typename iterator_traits<_BI2>::value_type _ValueType2;
>       typedef typename iterator_traits<_BI1>::iterator_category _Category;
>-      const bool __simple = (__is_trivially_copyable(_ValueType1)
>-			     && __is_pointer<_BI1>::__value
>-			     && __is_pointer<_BI2>::__value
>-			     && __are_same<_ValueType1, _ValueType2>::__value);
>-
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
>-	return std::__copy_move_backward<true, false,
>+      	return std::__copy_move_backward<true, false,
> 			      _Category>::__copy_move_b(__first, __last,
> 							__result);
>+      else
>+      {
> #endif
>+#if __cpp_lib_concepts 
>+      if constexpr(
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__first)>>)&&
>+          is_trivially_copyable_v<_ValueType1>)&&
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__result)>>)&&
>+          is_trivially_copyable_v<_ValueType2>)&&
>+          contiguous_iterator<_BI1>&&
>+          contiguous_iterator<_BI2>&&
>+          (same_as<_ValueType1,_ValueType2>
>+          ||(integral<_ValueType1>&&integral<_ValueType2>&&
>+              sizeof(_ValueType1)==sizeof(_ValueType2))))
>+      {
>+        using __assignable = conditional<_IsMove,
>+                is_move_assignable<_ValueType2>,
>+                is_copy_assignable<_ValueType2>>;
>+        // trivial types can have deleted assignment
>+        static_assert( __assignable::type::value, "type is not assignable" );
>+        ptrdiff_t const _Num = __last - __first;
>+        if (_Num)
>+          #ifdef __has_cpp_attribute(likely)
>+            [[likely]]
>+          #endif
>+          std::__memmove<_IsMove>(to_address(__result) - _Num, to_address(__first), _Num);
>+        return __result - _Num; 
>+      }
>+#else
>+      const bool __simple = (
>+#if __cplusplus >= 201103L
>+        (!is_volatile<typename std::remove_reference<decltype(*__first)>::type>::value)&&
>+#endif
>+        __is_trivially_copyable(_ValueType1)
>+			     && __is_pointer<_BI1>::__value
>+			     && __is_pointer<_BI2>::__value
>+			     && __are_same<_ValueType1, _ValueType2>::__value);
>       return std::__copy_move_backward<_IsMove, __simple,
> 				       _Category>::__copy_move_b(__first,
> 								 __last,
> 								 __result);
>+#endif
>+#ifdef __cpp_lib_is_constant_evaluated
>+      }
>+#endif
>     }
> 
>   template<bool _IsMove, typename _BI1, typename _BI2>
>@@ -908,16 +1003,33 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     }
> 
>   // Specialization: for char types we can use memset.
>+
>   template<typename _Tp>
>+    _GLIBCXX20_CONSTEXPR
>     inline typename
>     __gnu_cxx::__enable_if<__is_byte<_Tp>::__value, void>::__type
>     __fill_a1(_Tp* __first, _Tp* __last, const _Tp& __c)
>     {
>-      const _Tp __tmp = __c;
>+      _Tp const __tmp = __c;
>+#ifdef __cpp_lib_is_constant_evaluated
>+    if (std::is_constant_evaluated())
>+    {
>+      for (; __first != __last; ++__first)
>+	*__first = __tmp;
>+    }
>+    else
>+    {
>+#endif
>       if (const size_t __len = __last - __first)
>+#ifdef __has_cpp_attribute(likely)
>+        [[likely]]
>+#endif
> 	__builtin_memset(__first, static_cast<unsigned char>(__tmp), __len);
>+#ifdef __cpp_lib_is_constant_evaluated
>     }
>-
>+#endif
>+    }
>+#ifndef __cpp_lib_concepts
>   template<typename _Ite, typename _Cont, typename _Tp>
>     _GLIBCXX20_CONSTEXPR
>     inline void
>@@ -925,7 +1037,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	      ::__gnu_cxx::__normal_iterator<_Ite, _Cont> __last,
> 	      const _Tp& __value)
>     { std::__fill_a1(__first.base(), __last.base(), __value); }
>-
>+#endif
>   template<typename _Tp, typename _VTp>
>     void
>     __fill_a1(const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Tp&, _Tp*>&,
>@@ -936,7 +1048,14 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     _GLIBCXX20_CONSTEXPR
>     inline void
>     __fill_a(_FIte __first, _FIte __last, const _Tp& __value)
>-    { std::__fill_a1(__first, __last, __value); }
>+    {
>+#ifdef __cpp_lib_concepts
>+      if constexpr(contiguous_iterator<_FIte>)
>+        std::__fill_a1(to_address(__first), to_address(__last), __value);
>+      else
>+#endif
>+        std::__fill_a1(__first, __last, __value);
>+    }
> 
>   template<typename _Ite, typename _Seq, typename _Cat, typename _Tp>
>     void
>@@ -1306,6 +1425,9 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	  const size_t __len1 = __last1 - __first1;
> 	  const size_t __len2 = __last2 - __first2;
> 	  if (const size_t __len = std::min(__len1, __len2))
>+#ifdef __has_cpp_attribute(likely)
>+        [[likely]]
>+#endif
> 	    if (int __result = std::__memcmp(__first1, __first2, __len))
> 	      return __result < 0;
> 	  return __len1 < __len2;
>@@ -1733,7 +1855,11 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
> 		if (__len)
> 		  {
> 		    const auto __c
>-		      = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;
>+		      = __builtin_memcmp(to_address(__first1), to_address(__first2), __len) <=> 0;
>+//&*__first1 is not correct and would crash program. Must use to_address()
>+//since a lot of debugging iterator would not be allowed to dereference __first2.
>+//It is undefined behavior to derefernce sentinal iterators
>+//For example, VC's implementation
> 		    if (__c != 0)
> 		      return __c;
> 		  }
>-- 
>2.24.0.windows.2
>
Comment 21 fdlbxtqi 2019-12-29 15:06:45 UTC
Comment on attachment 47559 [details]
An untested patch

>From 1dfd714e1f29e229d69a0c7f6f84bf05dd4ee85d Mon Sep 17 00:00:00 2001
>From: expnkx <unlvsur@live.com>
>Date: Sun, 29 Dec 2019 09:49:19 -0500
>Subject: [PATCH] Untested patch fix volatile bug of std::copyXXX and
> std::uninitialized_copyXXX Support custom contiguous_iterator Fix undefined
> behavior of &*end constexpr std::fill Greatly improve performance of
> std::copyXXX and std::uninitialized_copyXX for different types and apply
> pipeline optimization to reduce branch misprediction
>
>---
> libstdc++-v3/include/bits/stl_algobase.h | 170 ++++++++++++++++++++---
> 1 file changed, 148 insertions(+), 22 deletions(-)
>
>diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
>index 40d056ae8d5..01672a8f232 100644
>--- a/libstdc++-v3/include/bits/stl_algobase.h
>+++ b/libstdc++-v3/include/bits/stl_algobase.h
>@@ -1,6 +1,6 @@
> // Core algorithmic facilities -*- C++ -*-
> 
>-// Copyright (C) 2001-2019 Free Software Foundation, Inc.
>+// Copyright (C) 2001-2020 Free Software Foundation, Inc.
> //
> // This file is part of the GNU ISO C++ Library.  This library is free
> // software; you can redistribute it and/or modify it under the
>@@ -84,10 +84,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>    * A constexpr wrapper for __builtin_memmove.
>    * @param __num The number of elements of type _Tp (not bytes).
>    */
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove, typename _Tp>
>+  _GLIBCXX14_CONSTEXPR
>+    inline void volatile*
>+    __memmove(_Tp volatile* __dst, _Tp volatile const* __src, size_t __num)
>+    {
>+	  for(; __num > 0; --__num)
>+	    {
>+#if __cplusplus >= 201103L
>+	      if constexpr (_IsMove)
>+		*__dst = std::move(*__src);
>+	      else
>+#endif
>+		*__dst = *__src;
>+	      ++__src;
>+	      ++__dst;
>+	    }
>+	  return __dst;
>+    }
>+#endif
>+  template<bool _IsMove, typename _Tp,typename _Tp_src>
>+/*
>+#ifdef __cpp_lib_concepts
>+    requires (is_trivially_copyable_v<_Tp>&&!is_volatile_v<_Tp>
>+      &&is_trivially_copyable_v<_Tp_src>&&!is_volatile_v<_Tp_src>
>+      &&(same_as<_Tp,_Tp_src>||(sizeof(_Tp)==sizeof(_Tp_src)&&integral<_Tp>&&integral<_Tp_src>)))
>+#endif
>+*/
>     _GLIBCXX14_CONSTEXPR
>     inline void*
>-    __memmove(_Tp* __dst, const _Tp* __src, size_t __num)
>+    __memmove(_Tp* __dst, _Tp_src const* __src, size_t __num)
>     {
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
>@@ -106,9 +133,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>       else
> #endif
> 	return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
>-      return __dst;
>     }
>-
>   /*
>    * A constexpr wrapper for __builtin_memcmp.
>    * @param __num The number of elements of type _Tp (not bytes).
>@@ -431,7 +456,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 	}
>     };
> #endif
>-
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove>
>     struct __copy_move<_IsMove, true, random_access_iterator_tag>
>     {
>@@ -448,12 +473,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 	  static_assert( __assignable::type::value, "type is not assignable" );
> #endif
> 	  const ptrdiff_t _Num = __last - __first;
>-	  if (_Num)
>-	    std::__memmove<_IsMove>(__result, __first, _Num);
>+	  if (!_Num)
>+  	  return __result + _Num;
>+//since before C++20, we do not have [[likely]] attribute. We need to do it manually
>+    std::__memmove<_IsMove>(__result, __first, _Num);
> 	  return __result + _Num;
> 	}
>     };
>-
>+#endif
>   // Helpers for streambuf iterators (either istream or ostream).
>   // NB: avoid including <iosfwd>, relatively large.
>   template<typename _CharT>
>@@ -491,11 +518,41 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>       typedef typename iterator_traits<_II>::value_type _ValueTypeI;
>       typedef typename iterator_traits<_OI>::value_type _ValueTypeO;
>       typedef typename iterator_traits<_II>::iterator_category _Category;
>+#ifdef __cpp_lib_concepts
>+      if constexpr(
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__first)>>)&&
>+          is_trivially_copyable_v<_ValueTypeI>)&&
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__result)>>)&&
>+          is_trivially_copyable_v<_ValueTypeO>)&&
>+          contiguous_iterator<_II>&&
>+          contiguous_iterator<_OI>&&
>+          (same_as<_ValueTypeI,_ValueTypeO>
>+          ||(integral<_ValueTypeI>&&integral<_ValueTypeO>&&
>+              sizeof(_ValueTypeI)==sizeof(_ValueTypeO))))
>+      {
>+        using __assignable = conditional<_IsMove,
>+                 is_move_assignable<_OI>,
>+                is_copy_assignable<_OI>>;
>+        static_assert( __assignable::type::value, "result type is not assignable" );
>+        ptrdiff_t const _Num = __last - __first;
>+        if (_Num)
>+#ifdef __has_cpp_attribute(likely)
>+//This if statement is by default bad since it affects CPU pipeline.
>+//needs likely attribute or the branch misprediction penalty (100% misprediction rate probably) is HUGE
>+        [[likely]]
>+#endif
>+          std::__memmove<_IsMove>(to_address(__result), to_address(__first), _Num);
>+        return __result + _Num;
>+      }
>+      else
>+      return std::__copy_move<_IsMove, false,
>+#else
>       const bool __simple = (__is_trivially_copyable(_ValueTypeI)
> 			     && __is_pointer<_II>::__value
> 			     && __is_pointer<_OI>::__value
> 			     && __are_same<_ValueTypeI, _ValueTypeO>::__value);
>       return std::__copy_move<_IsMove, __simple,
>+#endif
> 			      _Category>::__copy_m(__first, __last, __result);
>     }
> 
>@@ -581,6 +638,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>    *  Note that the end of the output range is permitted to be contained
>    *  within [first,last).
>   */
>+
>   template<typename _II, typename _OI>
>     _GLIBCXX20_CONSTEXPR
>     inline _OI
>@@ -595,7 +653,6 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>       return std::__copy_move_a<__is_move_iterator<_II>::__value>
> 	     (std::__miter_base(__first), std::__miter_base(__last), __result);
>     }
>-
> #if __cplusplus >= 201103L
>   /**
>    *  @brief Moves the range [first,last) into result.
>@@ -698,6 +755,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     };
> #endif
> 
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove>
>     struct __copy_move_backward<_IsMove, true, random_access_iterator_tag>
>     {
>@@ -714,11 +772,13 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	  static_assert( __assignable::type::value, "type is not assignable" );
> #endif
> 	  const ptrdiff_t _Num = __last - __first;
>-	  if (_Num)
>-	    std::__memmove<_IsMove>(__result - _Num, __first, _Num);
>+	  if (!_Num)
>+      return __result - _Num;
>+    std::__memmove<_IsMove>(__result - _Num, __first, _Num);
> 	  return __result - _Num;
> 	}
>     };
>+#endif
> 
>   template<bool _IsMove, typename _BI1, typename _BI2>
>     _GLIBCXX20_CONSTEXPR
>@@ -728,21 +788,56 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>       typedef typename iterator_traits<_BI1>::value_type _ValueType1;
>       typedef typename iterator_traits<_BI2>::value_type _ValueType2;
>       typedef typename iterator_traits<_BI1>::iterator_category _Category;
>-      const bool __simple = (__is_trivially_copyable(_ValueType1)
>-			     && __is_pointer<_BI1>::__value
>-			     && __is_pointer<_BI2>::__value
>-			     && __are_same<_ValueType1, _ValueType2>::__value);
>-
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
>-	return std::__copy_move_backward<true, false,
>+      	return std::__copy_move_backward<true, false,
> 			      _Category>::__copy_move_b(__first, __last,
> 							__result);
>+      else
>+      {
> #endif
>+#if __cpp_lib_concepts 
>+      if constexpr(
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__first)>>)&&
>+          is_trivially_copyable_v<_ValueType1>)&&
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__result)>>)&&
>+          is_trivially_copyable_v<_ValueType2>)&&
>+          contiguous_iterator<_BI1>&&
>+          contiguous_iterator<_BI2>&&
>+          (same_as<_ValueType1,_ValueType2>
>+          ||(integral<_ValueType1>&&integral<_ValueType2>&&
>+              sizeof(_ValueType1)==sizeof(_ValueType2))))
>+      {
>+        using __assignable = conditional<_IsMove,
>+                is_move_assignable<_ValueType2>,
>+                is_copy_assignable<_ValueType2>>;
>+        // trivial types can have deleted assignment
>+        static_assert( __assignable::type::value, "type is not assignable" );
>+        ptrdiff_t const _Num = __last - __first;
>+        if (_Num)
>+          #ifdef __has_cpp_attribute(likely)
>+            [[likely]]
>+          #endif
>+          std::__memmove<_IsMove>(to_address(__result) - _Num, to_address(__first), _Num);
>+        return __result - _Num; 
>+      }
>+#else
>+      const bool __simple = (
>+#if __cplusplus >= 201103L
>+        (!is_volatile<typename std::remove_reference<decltype(*__first)>::type>::value)&&
>+#endif
>+        __is_trivially_copyable(_ValueType1)
>+			     && __is_pointer<_BI1>::__value
>+			     && __is_pointer<_BI2>::__value
>+			     && __are_same<_ValueType1, _ValueType2>::__value);
>       return std::__copy_move_backward<_IsMove, __simple,
> 				       _Category>::__copy_move_b(__first,
> 								 __last,
> 								 __result);
>+#endif
>+#ifdef __cpp_lib_is_constant_evaluated
>+      }
>+#endif
>     }
> 
>   template<bool _IsMove, typename _BI1, typename _BI2>
>@@ -908,16 +1003,33 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     }
> 
>   // Specialization: for char types we can use memset.
>+
>   template<typename _Tp>
>+    _GLIBCXX20_CONSTEXPR
>     inline typename
>     __gnu_cxx::__enable_if<__is_byte<_Tp>::__value, void>::__type
>     __fill_a1(_Tp* __first, _Tp* __last, const _Tp& __c)
>     {
>-      const _Tp __tmp = __c;
>+      _Tp const __tmp = __c;
>+#ifdef __cpp_lib_is_constant_evaluated
>+    if (std::is_constant_evaluated())
>+    {
>+      for (; __first != __last; ++__first)
>+	*__first = __tmp;
>+    }
>+    else
>+    {
>+#endif
>       if (const size_t __len = __last - __first)
>+#ifdef __has_cpp_attribute(likely)
>+        [[likely]]
>+#endif
> 	__builtin_memset(__first, static_cast<unsigned char>(__tmp), __len);
>+#ifdef __cpp_lib_is_constant_evaluated
>     }
>-
>+#endif
>+    }
>+#ifndef __cpp_lib_concepts
>   template<typename _Ite, typename _Cont, typename _Tp>
>     _GLIBCXX20_CONSTEXPR
>     inline void
>@@ -925,7 +1037,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	      ::__gnu_cxx::__normal_iterator<_Ite, _Cont> __last,
> 	      const _Tp& __value)
>     { std::__fill_a1(__first.base(), __last.base(), __value); }
>-
>+#endif
>   template<typename _Tp, typename _VTp>
>     void
>     __fill_a1(const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Tp&, _Tp*>&,
>@@ -936,7 +1048,14 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     _GLIBCXX20_CONSTEXPR
>     inline void
>     __fill_a(_FIte __first, _FIte __last, const _Tp& __value)
>-    { std::__fill_a1(__first, __last, __value); }
>+    {
>+#ifdef __cpp_lib_concepts
>+      if constexpr(contiguous_iterator<_FIte>)
>+        std::__fill_a1(to_address(__first), to_address(__last), __value);
>+      else
>+#endif
>+        std::__fill_a1(__first, __last, __value);
>+    }
> 
>   template<typename _Ite, typename _Seq, typename _Cat, typename _Tp>
>     void
>@@ -1306,6 +1425,9 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	  const size_t __len1 = __last1 - __first1;
> 	  const size_t __len2 = __last2 - __first2;
> 	  if (const size_t __len = std::min(__len1, __len2))
>+#ifdef __has_cpp_attribute(likely)
>+        [[likely]]
>+#endif
> 	    if (int __result = std::__memcmp(__first1, __first2, __len))
> 	      return __result < 0;
> 	  return __len1 < __len2;
>@@ -1733,7 +1855,11 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
> 		if (__len)
> 		  {
> 		    const auto __c
>-		      = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;
>+		      = __builtin_memcmp(to_address(__first1), to_address(__first2), __len) <=> 0;
>+//&*__first1 is not correct and would crash program. Must use to_address()
>+//since a lot of debugging iterator would not be allowed to dereference __first2.
>+//It is undefined behavior to derefernce sentinal iterators
>+//For example, VC's implementation
> 		    if (__c != 0)
> 		      return __c;
> 		  }
>-- 
>2.24.0.windows.2
>
Comment 22 fdlbxtqi 2019-12-29 15:06:48 UTC
Comment on attachment 47559 [details]
An untested patch

>From 1dfd714e1f29e229d69a0c7f6f84bf05dd4ee85d Mon Sep 17 00:00:00 2001
>From: expnkx <unlvsur@live.com>
>Date: Sun, 29 Dec 2019 09:49:19 -0500
>Subject: [PATCH] Untested patch fix volatile bug of std::copyXXX and
> std::uninitialized_copyXXX Support custom contiguous_iterator Fix undefined
> behavior of &*end constexpr std::fill Greatly improve performance of
> std::copyXXX and std::uninitialized_copyXX for different types and apply
> pipeline optimization to reduce branch misprediction
>
>---
> libstdc++-v3/include/bits/stl_algobase.h | 170 ++++++++++++++++++++---
> 1 file changed, 148 insertions(+), 22 deletions(-)
>
>diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
>index 40d056ae8d5..01672a8f232 100644
>--- a/libstdc++-v3/include/bits/stl_algobase.h
>+++ b/libstdc++-v3/include/bits/stl_algobase.h
>@@ -1,6 +1,6 @@
> // Core algorithmic facilities -*- C++ -*-
> 
>-// Copyright (C) 2001-2019 Free Software Foundation, Inc.
>+// Copyright (C) 2001-2020 Free Software Foundation, Inc.
> //
> // This file is part of the GNU ISO C++ Library.  This library is free
> // software; you can redistribute it and/or modify it under the
>@@ -84,10 +84,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>    * A constexpr wrapper for __builtin_memmove.
>    * @param __num The number of elements of type _Tp (not bytes).
>    */
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove, typename _Tp>
>+  _GLIBCXX14_CONSTEXPR
>+    inline void volatile*
>+    __memmove(_Tp volatile* __dst, _Tp volatile const* __src, size_t __num)
>+    {
>+	  for(; __num > 0; --__num)
>+	    {
>+#if __cplusplus >= 201103L
>+	      if constexpr (_IsMove)
>+		*__dst = std::move(*__src);
>+	      else
>+#endif
>+		*__dst = *__src;
>+	      ++__src;
>+	      ++__dst;
>+	    }
>+	  return __dst;
>+    }
>+#endif
>+  template<bool _IsMove, typename _Tp,typename _Tp_src>
>+/*
>+#ifdef __cpp_lib_concepts
>+    requires (is_trivially_copyable_v<_Tp>&&!is_volatile_v<_Tp>
>+      &&is_trivially_copyable_v<_Tp_src>&&!is_volatile_v<_Tp_src>
>+      &&(same_as<_Tp,_Tp_src>||(sizeof(_Tp)==sizeof(_Tp_src)&&integral<_Tp>&&integral<_Tp_src>)))
>+#endif
>+*/
>     _GLIBCXX14_CONSTEXPR
>     inline void*
>-    __memmove(_Tp* __dst, const _Tp* __src, size_t __num)
>+    __memmove(_Tp* __dst, _Tp_src const* __src, size_t __num)
>     {
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
>@@ -106,9 +133,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>       else
> #endif
> 	return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
>-      return __dst;
>     }
>-
>   /*
>    * A constexpr wrapper for __builtin_memcmp.
>    * @param __num The number of elements of type _Tp (not bytes).
>@@ -431,7 +456,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 	}
>     };
> #endif
>-
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove>
>     struct __copy_move<_IsMove, true, random_access_iterator_tag>
>     {
>@@ -448,12 +473,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 	  static_assert( __assignable::type::value, "type is not assignable" );
> #endif
> 	  const ptrdiff_t _Num = __last - __first;
>-	  if (_Num)
>-	    std::__memmove<_IsMove>(__result, __first, _Num);
>+	  if (!_Num)
>+  	  return __result + _Num;
>+//since before C++20, we do not have [[likely]] attribute. We need to do it manually
>+    std::__memmove<_IsMove>(__result, __first, _Num);
> 	  return __result + _Num;
> 	}
>     };
>-
>+#endif
>   // Helpers for streambuf iterators (either istream or ostream).
>   // NB: avoid including <iosfwd>, relatively large.
>   template<typename _CharT>
>@@ -491,11 +518,41 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>       typedef typename iterator_traits<_II>::value_type _ValueTypeI;
>       typedef typename iterator_traits<_OI>::value_type _ValueTypeO;
>       typedef typename iterator_traits<_II>::iterator_category _Category;
>+#ifdef __cpp_lib_concepts
>+      if constexpr(
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__first)>>)&&
>+          is_trivially_copyable_v<_ValueTypeI>)&&
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__result)>>)&&
>+          is_trivially_copyable_v<_ValueTypeO>)&&
>+          contiguous_iterator<_II>&&
>+          contiguous_iterator<_OI>&&
>+          (same_as<_ValueTypeI,_ValueTypeO>
>+          ||(integral<_ValueTypeI>&&integral<_ValueTypeO>&&
>+              sizeof(_ValueTypeI)==sizeof(_ValueTypeO))))
>+      {
>+        using __assignable = conditional<_IsMove,
>+                 is_move_assignable<_OI>,
>+                is_copy_assignable<_OI>>;
>+        static_assert( __assignable::type::value, "result type is not assignable" );
>+        ptrdiff_t const _Num = __last - __first;
>+        if (_Num)
>+#ifdef __has_cpp_attribute(likely)
>+//This if statement is by default bad since it affects CPU pipeline.
>+//needs likely attribute or the branch misprediction penalty (100% misprediction rate probably) is HUGE
>+        [[likely]]
>+#endif
>+          std::__memmove<_IsMove>(to_address(__result), to_address(__first), _Num);
>+        return __result + _Num;
>+      }
>+      else
>+      return std::__copy_move<_IsMove, false,
>+#else
>       const bool __simple = (__is_trivially_copyable(_ValueTypeI)
> 			     && __is_pointer<_II>::__value
> 			     && __is_pointer<_OI>::__value
> 			     && __are_same<_ValueTypeI, _ValueTypeO>::__value);
>       return std::__copy_move<_IsMove, __simple,
>+#endif
> 			      _Category>::__copy_m(__first, __last, __result);
>     }
> 
>@@ -581,6 +638,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>    *  Note that the end of the output range is permitted to be contained
>    *  within [first,last).
>   */
>+
>   template<typename _II, typename _OI>
>     _GLIBCXX20_CONSTEXPR
>     inline _OI
>@@ -595,7 +653,6 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>       return std::__copy_move_a<__is_move_iterator<_II>::__value>
> 	     (std::__miter_base(__first), std::__miter_base(__last), __result);
>     }
>-
> #if __cplusplus >= 201103L
>   /**
>    *  @brief Moves the range [first,last) into result.
>@@ -698,6 +755,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     };
> #endif
> 
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove>
>     struct __copy_move_backward<_IsMove, true, random_access_iterator_tag>
>     {
>@@ -714,11 +772,13 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	  static_assert( __assignable::type::value, "type is not assignable" );
> #endif
> 	  const ptrdiff_t _Num = __last - __first;
>-	  if (_Num)
>-	    std::__memmove<_IsMove>(__result - _Num, __first, _Num);
>+	  if (!_Num)
>+      return __result - _Num;
>+    std::__memmove<_IsMove>(__result - _Num, __first, _Num);
> 	  return __result - _Num;
> 	}
>     };
>+#endif
> 
>   template<bool _IsMove, typename _BI1, typename _BI2>
>     _GLIBCXX20_CONSTEXPR
>@@ -728,21 +788,56 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>       typedef typename iterator_traits<_BI1>::value_type _ValueType1;
>       typedef typename iterator_traits<_BI2>::value_type _ValueType2;
>       typedef typename iterator_traits<_BI1>::iterator_category _Category;
>-      const bool __simple = (__is_trivially_copyable(_ValueType1)
>-			     && __is_pointer<_BI1>::__value
>-			     && __is_pointer<_BI2>::__value
>-			     && __are_same<_ValueType1, _ValueType2>::__value);
>-
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
>-	return std::__copy_move_backward<true, false,
>+      	return std::__copy_move_backward<true, false,
> 			      _Category>::__copy_move_b(__first, __last,
> 							__result);
>+      else
>+      {
> #endif
>+#if __cpp_lib_concepts 
>+      if constexpr(
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__first)>>)&&
>+          is_trivially_copyable_v<_ValueType1>)&&
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__result)>>)&&
>+          is_trivially_copyable_v<_ValueType2>)&&
>+          contiguous_iterator<_BI1>&&
>+          contiguous_iterator<_BI2>&&
>+          (same_as<_ValueType1,_ValueType2>
>+          ||(integral<_ValueType1>&&integral<_ValueType2>&&
>+              sizeof(_ValueType1)==sizeof(_ValueType2))))
>+      {
>+        using __assignable = conditional<_IsMove,
>+                is_move_assignable<_ValueType2>,
>+                is_copy_assignable<_ValueType2>>;
>+        // trivial types can have deleted assignment
>+        static_assert( __assignable::type::value, "type is not assignable" );
>+        ptrdiff_t const _Num = __last - __first;
>+        if (_Num)
>+          #ifdef __has_cpp_attribute(likely)
>+            [[likely]]
>+          #endif
>+          std::__memmove<_IsMove>(to_address(__result) - _Num, to_address(__first), _Num);
>+        return __result - _Num; 
>+      }
>+#else
>+      const bool __simple = (
>+#if __cplusplus >= 201103L
>+        (!is_volatile<typename std::remove_reference<decltype(*__first)>::type>::value)&&
>+#endif
>+        __is_trivially_copyable(_ValueType1)
>+			     && __is_pointer<_BI1>::__value
>+			     && __is_pointer<_BI2>::__value
>+			     && __are_same<_ValueType1, _ValueType2>::__value);
>       return std::__copy_move_backward<_IsMove, __simple,
> 				       _Category>::__copy_move_b(__first,
> 								 __last,
> 								 __result);
>+#endif
>+#ifdef __cpp_lib_is_constant_evaluated
>+      }
>+#endif
>     }
> 
>   template<bool _IsMove, typename _BI1, typename _BI2>
>@@ -908,16 +1003,33 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     }
> 
>   // Specialization: for char types we can use memset.
>+
>   template<typename _Tp>
>+    _GLIBCXX20_CONSTEXPR
>     inline typename
>     __gnu_cxx::__enable_if<__is_byte<_Tp>::__value, void>::__type
>     __fill_a1(_Tp* __first, _Tp* __last, const _Tp& __c)
>     {
>-      const _Tp __tmp = __c;
>+      _Tp const __tmp = __c;
>+#ifdef __cpp_lib_is_constant_evaluated
>+    if (std::is_constant_evaluated())
>+    {
>+      for (; __first != __last; ++__first)
>+	*__first = __tmp;
>+    }
>+    else
>+    {
>+#endif
>       if (const size_t __len = __last - __first)
>+#ifdef __has_cpp_attribute(likely)
>+        [[likely]]
>+#endif
> 	__builtin_memset(__first, static_cast<unsigned char>(__tmp), __len);
>+#ifdef __cpp_lib_is_constant_evaluated
>     }
>-
>+#endif
>+    }
>+#ifndef __cpp_lib_concepts
>   template<typename _Ite, typename _Cont, typename _Tp>
>     _GLIBCXX20_CONSTEXPR
>     inline void
>@@ -925,7 +1037,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	      ::__gnu_cxx::__normal_iterator<_Ite, _Cont> __last,
> 	      const _Tp& __value)
>     { std::__fill_a1(__first.base(), __last.base(), __value); }
>-
>+#endif
>   template<typename _Tp, typename _VTp>
>     void
>     __fill_a1(const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Tp&, _Tp*>&,
>@@ -936,7 +1048,14 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     _GLIBCXX20_CONSTEXPR
>     inline void
>     __fill_a(_FIte __first, _FIte __last, const _Tp& __value)
>-    { std::__fill_a1(__first, __last, __value); }
>+    {
>+#ifdef __cpp_lib_concepts
>+      if constexpr(contiguous_iterator<_FIte>)
>+        std::__fill_a1(to_address(__first), to_address(__last), __value);
>+      else
>+#endif
>+        std::__fill_a1(__first, __last, __value);
>+    }
> 
>   template<typename _Ite, typename _Seq, typename _Cat, typename _Tp>
>     void
>@@ -1306,6 +1425,9 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	  const size_t __len1 = __last1 - __first1;
> 	  const size_t __len2 = __last2 - __first2;
> 	  if (const size_t __len = std::min(__len1, __len2))
>+#ifdef __has_cpp_attribute(likely)
>+        [[likely]]
>+#endif
> 	    if (int __result = std::__memcmp(__first1, __first2, __len))
> 	      return __result < 0;
> 	  return __len1 < __len2;
>@@ -1733,7 +1855,11 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
> 		if (__len)
> 		  {
> 		    const auto __c
>-		      = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;
>+		      = __builtin_memcmp(to_address(__first1), to_address(__first2), __len) <=> 0;
>+//&*__first1 is not correct and would crash program. Must use to_address()
>+//since a lot of debugging iterator would not be allowed to dereference __first2.
>+//It is undefined behavior to derefernce sentinal iterators
>+//For example, VC's implementation
> 		    if (__c != 0)
> 		      return __c;
> 		  }
>-- 
>2.24.0.windows.2
>
Comment 23 fdlbxtqi 2019-12-29 15:06:58 UTC
Comment on attachment 47559 [details]
An untested patch

>From 1dfd714e1f29e229d69a0c7f6f84bf05dd4ee85d Mon Sep 17 00:00:00 2001
>From: expnkx <unlvsur@live.com>
>Date: Sun, 29 Dec 2019 09:49:19 -0500
>Subject: [PATCH] Untested patch fix volatile bug of std::copyXXX and
> std::uninitialized_copyXXX Support custom contiguous_iterator Fix undefined
> behavior of &*end constexpr std::fill Greatly improve performance of
> std::copyXXX and std::uninitialized_copyXX for different types and apply
> pipeline optimization to reduce branch misprediction
>
>---
> libstdc++-v3/include/bits/stl_algobase.h | 170 ++++++++++++++++++++---
> 1 file changed, 148 insertions(+), 22 deletions(-)
>
>diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
>index 40d056ae8d5..01672a8f232 100644
>--- a/libstdc++-v3/include/bits/stl_algobase.h
>+++ b/libstdc++-v3/include/bits/stl_algobase.h
>@@ -1,6 +1,6 @@
> // Core algorithmic facilities -*- C++ -*-
> 
>-// Copyright (C) 2001-2019 Free Software Foundation, Inc.
>+// Copyright (C) 2001-2020 Free Software Foundation, Inc.
> //
> // This file is part of the GNU ISO C++ Library.  This library is free
> // software; you can redistribute it and/or modify it under the
>@@ -84,10 +84,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>    * A constexpr wrapper for __builtin_memmove.
>    * @param __num The number of elements of type _Tp (not bytes).
>    */
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove, typename _Tp>
>+  _GLIBCXX14_CONSTEXPR
>+    inline void volatile*
>+    __memmove(_Tp volatile* __dst, _Tp volatile const* __src, size_t __num)
>+    {
>+	  for(; __num > 0; --__num)
>+	    {
>+#if __cplusplus >= 201103L
>+	      if constexpr (_IsMove)
>+		*__dst = std::move(*__src);
>+	      else
>+#endif
>+		*__dst = *__src;
>+	      ++__src;
>+	      ++__dst;
>+	    }
>+	  return __dst;
>+    }
>+#endif
>+  template<bool _IsMove, typename _Tp,typename _Tp_src>
>+/*
>+#ifdef __cpp_lib_concepts
>+    requires (is_trivially_copyable_v<_Tp>&&!is_volatile_v<_Tp>
>+      &&is_trivially_copyable_v<_Tp_src>&&!is_volatile_v<_Tp_src>
>+      &&(same_as<_Tp,_Tp_src>||(sizeof(_Tp)==sizeof(_Tp_src)&&integral<_Tp>&&integral<_Tp_src>)))
>+#endif
>+*/
>     _GLIBCXX14_CONSTEXPR
>     inline void*
>-    __memmove(_Tp* __dst, const _Tp* __src, size_t __num)
>+    __memmove(_Tp* __dst, _Tp_src const* __src, size_t __num)
>     {
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
>@@ -106,9 +133,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>       else
> #endif
> 	return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
>-      return __dst;
>     }
>-
>   /*
>    * A constexpr wrapper for __builtin_memcmp.
>    * @param __num The number of elements of type _Tp (not bytes).
>@@ -431,7 +456,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 	}
>     };
> #endif
>-
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove>
>     struct __copy_move<_IsMove, true, random_access_iterator_tag>
>     {
>@@ -448,12 +473,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 	  static_assert( __assignable::type::value, "type is not assignable" );
> #endif
> 	  const ptrdiff_t _Num = __last - __first;
>-	  if (_Num)
>-	    std::__memmove<_IsMove>(__result, __first, _Num);
>+	  if (!_Num)
>+  	  return __result + _Num;
>+//since before C++20, we do not have [[likely]] attribute. We need to do it manually
>+    std::__memmove<_IsMove>(__result, __first, _Num);
> 	  return __result + _Num;
> 	}
>     };
>-
>+#endif
>   // Helpers for streambuf iterators (either istream or ostream).
>   // NB: avoid including <iosfwd>, relatively large.
>   template<typename _CharT>
>@@ -491,11 +518,41 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>       typedef typename iterator_traits<_II>::value_type _ValueTypeI;
>       typedef typename iterator_traits<_OI>::value_type _ValueTypeO;
>       typedef typename iterator_traits<_II>::iterator_category _Category;
>+#ifdef __cpp_lib_concepts
>+      if constexpr(
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__first)>>)&&
>+          is_trivially_copyable_v<_ValueTypeI>)&&
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__result)>>)&&
>+          is_trivially_copyable_v<_ValueTypeO>)&&
>+          contiguous_iterator<_II>&&
>+          contiguous_iterator<_OI>&&
>+          (same_as<_ValueTypeI,_ValueTypeO>
>+          ||(integral<_ValueTypeI>&&integral<_ValueTypeO>&&
>+              sizeof(_ValueTypeI)==sizeof(_ValueTypeO))))
>+      {
>+        using __assignable = conditional<_IsMove,
>+                 is_move_assignable<_OI>,
>+                is_copy_assignable<_OI>>;
>+        static_assert( __assignable::type::value, "result type is not assignable" );
>+        ptrdiff_t const _Num = __last - __first;
>+        if (_Num)
>+#ifdef __has_cpp_attribute(likely)
>+//This if statement is by default bad since it affects CPU pipeline.
>+//needs likely attribute or the branch misprediction penalty (100% misprediction rate probably) is HUGE
>+        [[likely]]
>+#endif
>+          std::__memmove<_IsMove>(to_address(__result), to_address(__first), _Num);
>+        return __result + _Num;
>+      }
>+      else
>+      return std::__copy_move<_IsMove, false,
>+#else
>       const bool __simple = (__is_trivially_copyable(_ValueTypeI)
> 			     && __is_pointer<_II>::__value
> 			     && __is_pointer<_OI>::__value
> 			     && __are_same<_ValueTypeI, _ValueTypeO>::__value);
>       return std::__copy_move<_IsMove, __simple,
>+#endif
> 			      _Category>::__copy_m(__first, __last, __result);
>     }
> 
>@@ -581,6 +638,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>    *  Note that the end of the output range is permitted to be contained
>    *  within [first,last).
>   */
>+
>   template<typename _II, typename _OI>
>     _GLIBCXX20_CONSTEXPR
>     inline _OI
>@@ -595,7 +653,6 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>       return std::__copy_move_a<__is_move_iterator<_II>::__value>
> 	     (std::__miter_base(__first), std::__miter_base(__last), __result);
>     }
>-
> #if __cplusplus >= 201103L
>   /**
>    *  @brief Moves the range [first,last) into result.
>@@ -698,6 +755,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     };
> #endif
> 
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove>
>     struct __copy_move_backward<_IsMove, true, random_access_iterator_tag>
>     {
>@@ -714,11 +772,13 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	  static_assert( __assignable::type::value, "type is not assignable" );
> #endif
> 	  const ptrdiff_t _Num = __last - __first;
>-	  if (_Num)
>-	    std::__memmove<_IsMove>(__result - _Num, __first, _Num);
>+	  if (!_Num)
>+      return __result - _Num;
>+    std::__memmove<_IsMove>(__result - _Num, __first, _Num);
> 	  return __result - _Num;
> 	}
>     };
>+#endif
> 
>   template<bool _IsMove, typename _BI1, typename _BI2>
>     _GLIBCXX20_CONSTEXPR
>@@ -728,21 +788,56 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>       typedef typename iterator_traits<_BI1>::value_type _ValueType1;
>       typedef typename iterator_traits<_BI2>::value_type _ValueType2;
>       typedef typename iterator_traits<_BI1>::iterator_category _Category;
>-      const bool __simple = (__is_trivially_copyable(_ValueType1)
>-			     && __is_pointer<_BI1>::__value
>-			     && __is_pointer<_BI2>::__value
>-			     && __are_same<_ValueType1, _ValueType2>::__value);
>-
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
>-	return std::__copy_move_backward<true, false,
>+      	return std::__copy_move_backward<true, false,
> 			      _Category>::__copy_move_b(__first, __last,
> 							__result);
>+      else
>+      {
> #endif
>+#if __cpp_lib_concepts 
>+      if constexpr(
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__first)>>)&&
>+          is_trivially_copyable_v<_ValueType1>)&&
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__result)>>)&&
>+          is_trivially_copyable_v<_ValueType2>)&&
>+          contiguous_iterator<_BI1>&&
>+          contiguous_iterator<_BI2>&&
>+          (same_as<_ValueType1,_ValueType2>
>+          ||(integral<_ValueType1>&&integral<_ValueType2>&&
>+              sizeof(_ValueType1)==sizeof(_ValueType2))))
>+      {
>+        using __assignable = conditional<_IsMove,
>+                is_move_assignable<_ValueType2>,
>+                is_copy_assignable<_ValueType2>>;
>+        // trivial types can have deleted assignment
>+        static_assert( __assignable::type::value, "type is not assignable" );
>+        ptrdiff_t const _Num = __last - __first;
>+        if (_Num)
>+          #ifdef __has_cpp_attribute(likely)
>+            [[likely]]
>+          #endif
>+          std::__memmove<_IsMove>(to_address(__result) - _Num, to_address(__first), _Num);
>+        return __result - _Num; 
>+      }
>+#else
>+      const bool __simple = (
>+#if __cplusplus >= 201103L
>+        (!is_volatile<typename std::remove_reference<decltype(*__first)>::type>::value)&&
>+#endif
>+        __is_trivially_copyable(_ValueType1)
>+			     && __is_pointer<_BI1>::__value
>+			     && __is_pointer<_BI2>::__value
>+			     && __are_same<_ValueType1, _ValueType2>::__value);
>       return std::__copy_move_backward<_IsMove, __simple,
> 				       _Category>::__copy_move_b(__first,
> 								 __last,
> 								 __result);
>+#endif
>+#ifdef __cpp_lib_is_constant_evaluated
>+      }
>+#endif
>     }
> 
>   template<bool _IsMove, typename _BI1, typename _BI2>
>@@ -908,16 +1003,33 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     }
> 
>   // Specialization: for char types we can use memset.
>+
>   template<typename _Tp>
>+    _GLIBCXX20_CONSTEXPR
>     inline typename
>     __gnu_cxx::__enable_if<__is_byte<_Tp>::__value, void>::__type
>     __fill_a1(_Tp* __first, _Tp* __last, const _Tp& __c)
>     {
>-      const _Tp __tmp = __c;
>+      _Tp const __tmp = __c;
>+#ifdef __cpp_lib_is_constant_evaluated
>+    if (std::is_constant_evaluated())
>+    {
>+      for (; __first != __last; ++__first)
>+	*__first = __tmp;
>+    }
>+    else
>+    {
>+#endif
>       if (const size_t __len = __last - __first)
>+#ifdef __has_cpp_attribute(likely)
>+        [[likely]]
>+#endif
> 	__builtin_memset(__first, static_cast<unsigned char>(__tmp), __len);
>+#ifdef __cpp_lib_is_constant_evaluated
>     }
>-
>+#endif
>+    }
>+#ifndef __cpp_lib_concepts
>   template<typename _Ite, typename _Cont, typename _Tp>
>     _GLIBCXX20_CONSTEXPR
>     inline void
>@@ -925,7 +1037,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	      ::__gnu_cxx::__normal_iterator<_Ite, _Cont> __last,
> 	      const _Tp& __value)
>     { std::__fill_a1(__first.base(), __last.base(), __value); }
>-
>+#endif
>   template<typename _Tp, typename _VTp>
>     void
>     __fill_a1(const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Tp&, _Tp*>&,
>@@ -936,7 +1048,14 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     _GLIBCXX20_CONSTEXPR
>     inline void
>     __fill_a(_FIte __first, _FIte __last, const _Tp& __value)
>-    { std::__fill_a1(__first, __last, __value); }
>+    {
>+#ifdef __cpp_lib_concepts
>+      if constexpr(contiguous_iterator<_FIte>)
>+        std::__fill_a1(to_address(__first), to_address(__last), __value);
>+      else
>+#endif
>+        std::__fill_a1(__first, __last, __value);
>+    }
> 
>   template<typename _Ite, typename _Seq, typename _Cat, typename _Tp>
>     void
>@@ -1306,6 +1425,9 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	  const size_t __len1 = __last1 - __first1;
> 	  const size_t __len2 = __last2 - __first2;
> 	  if (const size_t __len = std::min(__len1, __len2))
>+#ifdef __has_cpp_attribute(likely)
>+        [[likely]]
>+#endif
> 	    if (int __result = std::__memcmp(__first1, __first2, __len))
> 	      return __result < 0;
> 	  return __len1 < __len2;
>@@ -1733,7 +1855,11 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
> 		if (__len)
> 		  {
> 		    const auto __c
>-		      = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;
>+		      = __builtin_memcmp(to_address(__first1), to_address(__first2), __len) <=> 0;
>+//&*__first1 is not correct and would crash program. Must use to_address()
>+//since a lot of debugging iterator would not be allowed to dereference __first2.
>+//It is undefined behavior to derefernce sentinal iterators
>+//For example, VC's implementation
> 		    if (__c != 0)
> 		      return __c;
> 		  }
>-- 
>2.24.0.windows.2
>
Comment 24 fdlbxtqi 2019-12-29 15:07:16 UTC
Comment on attachment 47559 [details]
An untested patch

>From 1dfd714e1f29e229d69a0c7f6f84bf05dd4ee85d Mon Sep 17 00:00:00 2001
>From: expnkx <unlvsur@live.com>
>Date: Sun, 29 Dec 2019 09:49:19 -0500
>Subject: [PATCH] Untested patch fix volatile bug of std::copyXXX and
> std::uninitialized_copyXXX Support custom contiguous_iterator Fix undefined
> behavior of &*end constexpr std::fill Greatly improve performance of
> std::copyXXX and std::uninitialized_copyXX for different types and apply
> pipeline optimization to reduce branch misprediction
>
>---
> libstdc++-v3/include/bits/stl_algobase.h | 170 ++++++++++++++++++++---
> 1 file changed, 148 insertions(+), 22 deletions(-)
>
>diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
>index 40d056ae8d5..01672a8f232 100644
>--- a/libstdc++-v3/include/bits/stl_algobase.h
>+++ b/libstdc++-v3/include/bits/stl_algobase.h
>@@ -1,6 +1,6 @@
> // Core algorithmic facilities -*- C++ -*-
> 
>-// Copyright (C) 2001-2019 Free Software Foundation, Inc.
>+// Copyright (C) 2001-2020 Free Software Foundation, Inc.
> //
> // This file is part of the GNU ISO C++ Library.  This library is free
> // software; you can redistribute it and/or modify it under the
>@@ -84,10 +84,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>    * A constexpr wrapper for __builtin_memmove.
>    * @param __num The number of elements of type _Tp (not bytes).
>    */
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove, typename _Tp>
>+  _GLIBCXX14_CONSTEXPR
>+    inline void volatile*
>+    __memmove(_Tp volatile* __dst, _Tp volatile const* __src, size_t __num)
>+    {
>+	  for(; __num > 0; --__num)
>+	    {
>+#if __cplusplus >= 201103L
>+	      if constexpr (_IsMove)
>+		*__dst = std::move(*__src);
>+	      else
>+#endif
>+		*__dst = *__src;
>+	      ++__src;
>+	      ++__dst;
>+	    }
>+	  return __dst;
>+    }
>+#endif
>+  template<bool _IsMove, typename _Tp,typename _Tp_src>
>+/*
>+#ifdef __cpp_lib_concepts
>+    requires (is_trivially_copyable_v<_Tp>&&!is_volatile_v<_Tp>
>+      &&is_trivially_copyable_v<_Tp_src>&&!is_volatile_v<_Tp_src>
>+      &&(same_as<_Tp,_Tp_src>||(sizeof(_Tp)==sizeof(_Tp_src)&&integral<_Tp>&&integral<_Tp_src>)))
>+#endif
>+*/
>     _GLIBCXX14_CONSTEXPR
>     inline void*
>-    __memmove(_Tp* __dst, const _Tp* __src, size_t __num)
>+    __memmove(_Tp* __dst, _Tp_src const* __src, size_t __num)
>     {
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
>@@ -106,9 +133,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>       else
> #endif
> 	return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
>-      return __dst;
>     }
>-
>   /*
>    * A constexpr wrapper for __builtin_memcmp.
>    * @param __num The number of elements of type _Tp (not bytes).
>@@ -431,7 +456,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 	}
>     };
> #endif
>-
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove>
>     struct __copy_move<_IsMove, true, random_access_iterator_tag>
>     {
>@@ -448,12 +473,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 	  static_assert( __assignable::type::value, "type is not assignable" );
> #endif
> 	  const ptrdiff_t _Num = __last - __first;
>-	  if (_Num)
>-	    std::__memmove<_IsMove>(__result, __first, _Num);
>+	  if (!_Num)
>+  	  return __result + _Num;
>+//since before C++20, we do not have [[likely]] attribute. We need to do it manually
>+    std::__memmove<_IsMove>(__result, __first, _Num);
> 	  return __result + _Num;
> 	}
>     };
>-
>+#endif
>   // Helpers for streambuf iterators (either istream or ostream).
>   // NB: avoid including <iosfwd>, relatively large.
>   template<typename _CharT>
>@@ -491,11 +518,41 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>       typedef typename iterator_traits<_II>::value_type _ValueTypeI;
>       typedef typename iterator_traits<_OI>::value_type _ValueTypeO;
>       typedef typename iterator_traits<_II>::iterator_category _Category;
>+#ifdef __cpp_lib_concepts
>+      if constexpr(
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__first)>>)&&
>+          is_trivially_copyable_v<_ValueTypeI>)&&
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__result)>>)&&
>+          is_trivially_copyable_v<_ValueTypeO>)&&
>+          contiguous_iterator<_II>&&
>+          contiguous_iterator<_OI>&&
>+          (same_as<_ValueTypeI,_ValueTypeO>
>+          ||(integral<_ValueTypeI>&&integral<_ValueTypeO>&&
>+              sizeof(_ValueTypeI)==sizeof(_ValueTypeO))))
>+      {
>+        using __assignable = conditional<_IsMove,
>+                 is_move_assignable<_OI>,
>+                is_copy_assignable<_OI>>;
>+        static_assert( __assignable::type::value, "result type is not assignable" );
>+        ptrdiff_t const _Num = __last - __first;
>+        if (_Num)
>+#ifdef __has_cpp_attribute(likely)
>+//This if statement is by default bad since it affects CPU pipeline.
>+//needs likely attribute or the branch misprediction penalty (100% misprediction rate probably) is HUGE
>+        [[likely]]
>+#endif
>+          std::__memmove<_IsMove>(to_address(__result), to_address(__first), _Num);
>+        return __result + _Num;
>+      }
>+      else
>+      return std::__copy_move<_IsMove, false,
>+#else
>       const bool __simple = (__is_trivially_copyable(_ValueTypeI)
> 			     && __is_pointer<_II>::__value
> 			     && __is_pointer<_OI>::__value
> 			     && __are_same<_ValueTypeI, _ValueTypeO>::__value);
>       return std::__copy_move<_IsMove, __simple,
>+#endif
> 			      _Category>::__copy_m(__first, __last, __result);
>     }
> 
>@@ -581,6 +638,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>    *  Note that the end of the output range is permitted to be contained
>    *  within [first,last).
>   */
>+
>   template<typename _II, typename _OI>
>     _GLIBCXX20_CONSTEXPR
>     inline _OI
>@@ -595,7 +653,6 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>       return std::__copy_move_a<__is_move_iterator<_II>::__value>
> 	     (std::__miter_base(__first), std::__miter_base(__last), __result);
>     }
>-
> #if __cplusplus >= 201103L
>   /**
>    *  @brief Moves the range [first,last) into result.
>@@ -698,6 +755,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     };
> #endif
> 
>+#ifndef __cpp_lib_concepts
>   template<bool _IsMove>
>     struct __copy_move_backward<_IsMove, true, random_access_iterator_tag>
>     {
>@@ -714,11 +772,13 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	  static_assert( __assignable::type::value, "type is not assignable" );
> #endif
> 	  const ptrdiff_t _Num = __last - __first;
>-	  if (_Num)
>-	    std::__memmove<_IsMove>(__result - _Num, __first, _Num);
>+	  if (!_Num)
>+      return __result - _Num;
>+    std::__memmove<_IsMove>(__result - _Num, __first, _Num);
> 	  return __result - _Num;
> 	}
>     };
>+#endif
> 
>   template<bool _IsMove, typename _BI1, typename _BI2>
>     _GLIBCXX20_CONSTEXPR
>@@ -728,21 +788,56 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>       typedef typename iterator_traits<_BI1>::value_type _ValueType1;
>       typedef typename iterator_traits<_BI2>::value_type _ValueType2;
>       typedef typename iterator_traits<_BI1>::iterator_category _Category;
>-      const bool __simple = (__is_trivially_copyable(_ValueType1)
>-			     && __is_pointer<_BI1>::__value
>-			     && __is_pointer<_BI2>::__value
>-			     && __are_same<_ValueType1, _ValueType2>::__value);
>-
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
>-	return std::__copy_move_backward<true, false,
>+      	return std::__copy_move_backward<true, false,
> 			      _Category>::__copy_move_b(__first, __last,
> 							__result);
>+      else
>+      {
> #endif
>+#if __cpp_lib_concepts 
>+      if constexpr(
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__first)>>)&&
>+          is_trivially_copyable_v<_ValueType1>)&&
>+          ((!is_volatile_v<std::remove_reference_t<decltype(*__result)>>)&&
>+          is_trivially_copyable_v<_ValueType2>)&&
>+          contiguous_iterator<_BI1>&&
>+          contiguous_iterator<_BI2>&&
>+          (same_as<_ValueType1,_ValueType2>
>+          ||(integral<_ValueType1>&&integral<_ValueType2>&&
>+              sizeof(_ValueType1)==sizeof(_ValueType2))))
>+      {
>+        using __assignable = conditional<_IsMove,
>+                is_move_assignable<_ValueType2>,
>+                is_copy_assignable<_ValueType2>>;
>+        // trivial types can have deleted assignment
>+        static_assert( __assignable::type::value, "type is not assignable" );
>+        ptrdiff_t const _Num = __last - __first;
>+        if (_Num)
>+          #ifdef __has_cpp_attribute(likely)
>+            [[likely]]
>+          #endif
>+          std::__memmove<_IsMove>(to_address(__result) - _Num, to_address(__first), _Num);
>+        return __result - _Num; 
>+      }
>+#else
>+      const bool __simple = (
>+#if __cplusplus >= 201103L
>+        (!is_volatile<typename std::remove_reference<decltype(*__first)>::type>::value)&&
>+#endif
>+        __is_trivially_copyable(_ValueType1)
>+			     && __is_pointer<_BI1>::__value
>+			     && __is_pointer<_BI2>::__value
>+			     && __are_same<_ValueType1, _ValueType2>::__value);
>       return std::__copy_move_backward<_IsMove, __simple,
> 				       _Category>::__copy_move_b(__first,
> 								 __last,
> 								 __result);
>+#endif
>+#ifdef __cpp_lib_is_constant_evaluated
>+      }
>+#endif
>     }
> 
>   template<bool _IsMove, typename _BI1, typename _BI2>
>@@ -908,16 +1003,33 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     }
> 
>   // Specialization: for char types we can use memset.
>+
>   template<typename _Tp>
>+    _GLIBCXX20_CONSTEXPR
>     inline typename
>     __gnu_cxx::__enable_if<__is_byte<_Tp>::__value, void>::__type
>     __fill_a1(_Tp* __first, _Tp* __last, const _Tp& __c)
>     {
>-      const _Tp __tmp = __c;
>+      _Tp const __tmp = __c;
>+#ifdef __cpp_lib_is_constant_evaluated
>+    if (std::is_constant_evaluated())
>+    {
>+      for (; __first != __last; ++__first)
>+	*__first = __tmp;
>+    }
>+    else
>+    {
>+#endif
>       if (const size_t __len = __last - __first)
>+#ifdef __has_cpp_attribute(likely)
>+        [[likely]]
>+#endif
> 	__builtin_memset(__first, static_cast<unsigned char>(__tmp), __len);
>+#ifdef __cpp_lib_is_constant_evaluated
>     }
>-
>+#endif
>+    }
>+#ifndef __cpp_lib_concepts
>   template<typename _Ite, typename _Cont, typename _Tp>
>     _GLIBCXX20_CONSTEXPR
>     inline void
>@@ -925,7 +1037,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	      ::__gnu_cxx::__normal_iterator<_Ite, _Cont> __last,
> 	      const _Tp& __value)
>     { std::__fill_a1(__first.base(), __last.base(), __value); }
>-
>+#endif
>   template<typename _Tp, typename _VTp>
>     void
>     __fill_a1(const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Tp&, _Tp*>&,
>@@ -936,7 +1048,14 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>     _GLIBCXX20_CONSTEXPR
>     inline void
>     __fill_a(_FIte __first, _FIte __last, const _Tp& __value)
>-    { std::__fill_a1(__first, __last, __value); }
>+    {
>+#ifdef __cpp_lib_concepts
>+      if constexpr(contiguous_iterator<_FIte>)
>+        std::__fill_a1(to_address(__first), to_address(__last), __value);
>+      else
>+#endif
>+        std::__fill_a1(__first, __last, __value);
>+    }
> 
>   template<typename _Ite, typename _Seq, typename _Cat, typename _Tp>
>     void
>@@ -1306,6 +1425,9 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> 	  const size_t __len1 = __last1 - __first1;
> 	  const size_t __len2 = __last2 - __first2;
> 	  if (const size_t __len = std::min(__len1, __len2))
>+#ifdef __has_cpp_attribute(likely)
>+        [[likely]]
>+#endif
> 	    if (int __result = std::__memcmp(__first1, __first2, __len))
> 	      return __result < 0;
> 	  return __len1 < __len2;
>@@ -1733,7 +1855,11 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
> 		if (__len)
> 		  {
> 		    const auto __c
>-		      = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;
>+		      = __builtin_memcmp(to_address(__first1), to_address(__first2), __len) <=> 0;
>+//&*__first1 is not correct and would crash program. Must use to_address()
>+//since a lot of debugging iterator would not be allowed to dereference __first2.
>+//It is undefined behavior to derefernce sentinal iterators
>+//For example, VC's implementation
> 		    if (__c != 0)
> 		      return __c;
> 		  }
>-- 
>2.24.0.windows.2
>
Comment 25 fdlbxtqi 2019-12-29 15:08:35 UTC
Created attachment 47560 [details]
forgot to_address

2nd patch

I am going to run testsuites
Comment 26 Bernd Edlinger 2019-12-29 18:01:44 UTC
(In reply to fdlbxtqi from comment #2)
> Also find a bug of __memmove
> 
>   /*
>    * A constexpr wrapper for __builtin_memmove.
>    * @param __num The number of elements of type _Tp (not bytes).
>    */
>   template<bool _IsMove, typename _Tp>
>     _GLIBCXX14_CONSTEXPR
>     inline void*
>     __memmove(_Tp* __dst, const _Tp* __src, size_t __num)
>     {
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
> 	{
> 	  for(; __num > 0; --__num)
> 	    {
> 	      if constexpr (_IsMove)
> 		*__dst = std::move(*__src);
> 	      else
> 		*__dst = *__src;
> 	      ++__src;
> 	      ++__dst;
> 	    }
> 	  return __dst;
> 	}
>       else
> #endif
> 	return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
>       return __dst;
>     }
> 
> The last 2nd line return __dst is wrong. It should not exist.

Sorry, I don't know what this function is all about.
But to me the code in the ifdef looks totally bogus.
First it returns __dst+__num, while memmove is sopposed
to return __dst, and is is somehow clear that
__dst and __src do not overlap?

because if they do the loop would overwite the __dst buffer before
__src is fully copied?
Comment 27 fdlbxtqi 2019-12-30 02:14:22 UTC
(In reply to Bernd Edlinger from comment #26)
> (In reply to fdlbxtqi from comment #2)
> > Also find a bug of __memmove
> > 
> >   /*
> >    * A constexpr wrapper for __builtin_memmove.
> >    * @param __num The number of elements of type _Tp (not bytes).
> >    */
> >   template<bool _IsMove, typename _Tp>
> >     _GLIBCXX14_CONSTEXPR
> >     inline void*
> >     __memmove(_Tp* __dst, const _Tp* __src, size_t __num)
> >     {
> > #ifdef __cpp_lib_is_constant_evaluated
> >       if (std::is_constant_evaluated())
> > 	{
> > 	  for(; __num > 0; --__num)
> > 	    {
> > 	      if constexpr (_IsMove)
> > 		*__dst = std::move(*__src);
> > 	      else
> > 		*__dst = *__src;
> > 	      ++__src;
> > 	      ++__dst;
> > 	    }
> > 	  return __dst;
> > 	}
> >       else
> > #endif
> > 	return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
> >       return __dst;
> >     }
> > 
> > The last 2nd line return __dst is wrong. It should not exist.
> 
> Sorry, I don't know what this function is all about.
> But to me the code in the ifdef looks totally bogus.
> First it returns __dst+__num, while memmove is sopposed
> to return __dst, and is is somehow clear that
> __dst and __src do not overlap?
> 
> because if they do the loop would overwite the __dst buffer before
> __src is fully copied?

Nope. It is C++20's new feature if(std::is_constant_evaluated()) which allows you to write functions can be used both in compile and runtime.
Comment 28 fdlbxtqi 2019-12-30 19:12:58 UTC
Created attachment 47570 [details]
Testsuite

Testsuite :

cqwrteur@DESKTOP-7H7UHQ9:~/libstdcpp_testsuite$ runtest --tool libstdc++
Using ../gcc/libstdc++-v3/testsuite/lib/libstdc++.exp as tool init file.
Test run by cqwrteur on Mon Dec 30 05:14:00 2019

Schedule of variations:
    unix

Running target unix
Running ../gcc/libstdc++-v3/testsuite/libstdc++-xmethods/xmethods.exp ..                                                     .

                === libstdc++ Summary ===

# of expected passes            12982
# of unexpected failures        6
# of expected failures          92
# of unsupported tests          614


After patch:



cqwrteur@DESKTOP-7H7UHQ9:~/libstdcpp_testsuite$ runtest --tool libstdc++ --srcdir=../gcc/libstdc++-v3/testsuite
Using ../gcc/libstdc++-v3/testsuite/lib/libstdc++.exp as tool init file.
Test run by cqwrteur on Mon Dec 30 09:58:12 2019
Native configuration is x86_64-pc-linux-gnu

                === libstdc++ tests ===

Schedule of variations:
    unix

Running target unix
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using ../gcc/libstdc++-v3/testsuite/config/default.exp as tool-and-target-specific interface file.
Running ../gcc/libstdc++-v3/testsuite/libstdc++-abi/abi.exp ...
Running ../gcc/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp ...
FAIL: 18_support/set_terminate.cc execution test
FAIL: 18_support/set_unexpected.cc execution test
FAIL: 20_util/bind/ref_neg.cc (test for excess errors)
FAIL: 27_io/filesystem/path/concat/92853.cc execution test
FAIL: 27_io/filesystem/path/concat/path.cc execution test
FAIL: experimental/net/internet/resolver/ops/lookup.cc execution test
Running ../gcc/libstdc++-v3/testsuite/libstdc++-prettyprinters/prettyprinters.exp ...
Running ../gcc/libstdc++-v3/testsuite/libstdc++-xmethods/xmethods.exp ...

                === libstdc++ Summary ===

# of expected passes            12982
# of unexpected failures        6
# of expected failures          92
# of unsupported tests          614
cqwrteur@DESKTOP-7H7UHQ9:~/libstdcpp_testsuite$
Comment 29 fdlbxtqi 2019-12-30 19:15:53 UTC
(In reply to Marc Glisse from comment #17)
> (In reply to fdlbxtqi from comment #15)
> > What I am worried about is that whether revamping these functions would be a new wave of ABI breaking.
> 
> I don't foresee any ABI issue here. Do make sure your code doesn't break
> with -std=c++03, and run the testsuite before submitting it.

I think the test suite has passed. However, I am not sure since I have never used dejagnu before.

It looks like these test suites will fail in some cases even without my patch. (filesystem, network ts for example)

I need you to check them.
Comment 30 fdlbxtqi 2019-12-30 19:23:36 UTC
Created attachment 47571 [details]
Here is my stl_algobase.h after patch. You can try it directly.

Here is my stl_algobase.h after patch. You can try it directly.
Comment 31 Bernd Edlinger 2019-12-30 19:25:23 UTC
Yes, you usually need to make a full bootstrap / make check twice
which the same svn revision one with and one without your patch.
You also should make sure that the test case actually is able to fail
before your patch.
You run "$(srcdir)/contrib/test_summary -t" each time and
compare the output.
Comment 32 fdlbxtqi 2019-12-30 22:11:35 UTC
(In reply to Bernd Edlinger from comment #31)
> Yes, you usually need to make a full bootstrap / make check twice
> which the same svn revision one with and one without your patch.
> You also should make sure that the test case actually is able to fail
> before your patch.
> You run "$(srcdir)/contrib/test_summary -t" each time and
> compare the output.

I am going to bootstrap a new GCC version on my friend's new machine installed with CentOS
Comment 33 fdlbxtqi 2019-12-31 06:03:38 UTC
Created attachment 47574 [details]
copy_backward bug fixed for the last patch

going to further run testsuite
Comment 34 Bernd Edlinger 2019-12-31 07:25:31 UTC
(In reply to fdlbxtqi from comment #33)
> Created attachment 47574 [details]
> copy_backward bug fixed for the last patch
> 
> going to further run testsuite

Your test does not contain any test cases.
Comment 35 Jonathan Wakely 2020-02-26 14:40:47 UTC
(In reply to Bernd Edlinger from comment #26)
> (In reply to fdlbxtqi from comment #2)
> > Also find a bug of __memmove
> > 
> >   /*
> >    * A constexpr wrapper for __builtin_memmove.
> >    * @param __num The number of elements of type _Tp (not bytes).
> >    */
> >   template<bool _IsMove, typename _Tp>
> >     _GLIBCXX14_CONSTEXPR
> >     inline void*
> >     __memmove(_Tp* __dst, const _Tp* __src, size_t __num)
> >     {
> > #ifdef __cpp_lib_is_constant_evaluated
> >       if (std::is_constant_evaluated())
> > 	{
> > 	  for(; __num > 0; --__num)
> > 	    {
> > 	      if constexpr (_IsMove)
> > 		*__dst = std::move(*__src);
> > 	      else
> > 		*__dst = *__src;
> > 	      ++__src;
> > 	      ++__dst;
> > 	    }
> > 	  return __dst;
> > 	}
> >       else
> > #endif
> > 	return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
> >       return __dst;
> >     }
> > 
> > The last 2nd line return __dst is wrong. It should not exist.
> 
> Sorry, I don't know what this function is all about.
> But to me the code in the ifdef looks totally bogus.
> First it returns __dst+__num, while memmove is sopposed
> to return __dst, and is is somehow clear that
> __dst and __src do not overlap?

Yes, it was only used by std::copy and std::move and they require that __dst is not in the range [__src, __src+__num). It's allowed for __dst+__num to be in that range, but the function works correctly in that case anyway.

The return value wasn't used.

But I've removed the entire function, so it doesn't matter now.
Comment 36 Jonathan Wakely 2020-02-26 15:00:00 UTC
Patches should be sent to the mailing list, and a copyright assignment (or public domain disclaimer) would be needed for a patch of this size. Otherwise, leave it to me to do once we start the GCC 11 dev phase.

Adding checks for types satisfying contiguous_iterator looks worthwhile, but should be a separate patch.

I think the condition for using memmove should be something like:

  __are_same<_ValueTypeI, _ValueTypeO>::__value
  || (__is_integer<_ValueTypeI>::__value
      && __is_integer<_ValueTypeo>::__value
      && sizeof(_ValueTypeI) == sizeof(_ValueTypeO))
Comment 37 Jonathan Wakely 2020-03-02 16:39:52 UTC
Similarly, the condition for using memcmp in std::equal is too strict:

      typedef typename iterator_traits<_II1>::value_type _ValueType1;
      typedef typename iterator_traits<_II2>::value_type _ValueType2;
      const bool __simple = ((__is_integer<_ValueType1>::__value
			      || __is_pointer<_ValueType1>::__value)
			     && __is_pointer<_II1>::__value
			     && __is_pointer<_II2>::__value
			     && __are_same<_ValueType1, _ValueType2>::__value);

If the types are integers of the same size and same signedness then we can use memcmp, they don't have to be exactly the same type.

And for lexicographical_compare we have:

      typedef typename iterator_traits<_II1>::value_type _ValueType1;
      typedef typename iterator_traits<_II2>::value_type _ValueType2;
      const bool __simple =
	(__is_byte<_ValueType1>::__value && __is_byte<_ValueType2>::__value
	 && !__gnu_cxx::__numeric_traits<_ValueType1>::__is_signed
	 && !__gnu_cxx::__numeric_traits<_ValueType2>::__is_signed
	 && __is_pointer<_II1>::__value
	 && __is_pointer<_II2>::__value);

On big endian targets we could also use memcmp for unsigned integer types larger than one byte.
Comment 38 Jonathan Wakely 2020-03-02 16:49:18 UTC
We could also use memcmp for std::equal when it's using std::equal_to<> or std::equal_to<_ValueType1> or std::equal_to<_ValueType2>, and for std::lexicographical_compare when it's using std::less<> or std::less<_ValueType1> or std::less<_ValueType2> (as long as the existing conditions are also met).

Currently we only use memcmp when those algos are used without a comparison function.
Comment 39 fdlbxtqi 2020-03-03 06:16:55 UTC
(In reply to Jonathan Wakely from comment #38)
> We could also use memcmp for std::equal when it's using std::equal_to<> or
> std::equal_to<_ValueType1> or std::equal_to<_ValueType2>, and for
> std::lexicographical_compare when it's using std::less<> or
> std::less<_ValueType1> or std::less<_ValueType2> (as long as the existing
> conditions are also met).
> 
> Currently we only use memcmp when those algos are used without a comparison
> function.

Hi Jonathan. Another bug I found before and you haven't fixed:
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/stl_algobase.h#L1706

const auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;


You cannot call &*__first, &*__first2 for contiguous_iterator since a lot of implementations of contiguous iterators will check bounds at debugging mode (for example MSVC's STL vector implementation). Yours will break code. __second is not allowed to dereference, that is UB.

Instead, you should call to_address(__first),to_address(__second)
Comment 40 fdlbxtqi 2020-03-03 06:17:36 UTC
to_address(__first),to_address(__second)

to_address(__first1),to_address(__first2)
Comment 41 fdlbxtqi 2020-03-03 06:27:24 UTC
(In reply to Jonathan Wakely from comment #38)
> We could also use memcmp for std::equal when it's using std::equal_to<> or
> std::equal_to<_ValueType1> or std::equal_to<_ValueType2>, and for
> std::lexicographical_compare when it's using std::less<> or
> std::less<_ValueType1> or std::less<_ValueType2> (as long as the existing
> conditions are also met).
> 
> Currently we only use memcmp when those algos are used without a comparison
> function.

And the volatile bugs haven't fixed either. The code works fine in GCC 9.0. Now it failed to work. Even I swap the standard to C++17, it still does not work. Does deprecate volatile work in this case in C++20?? However, even that is the case, C++17 should still work.

#include<array>
auto f(std::array<int volatile,3> arr)
{
    decltype(arr) arr2;
    std::copy(arr.begin(),arr.end(),arr2.begin());
}

https://godbolt.org/z/Tn87r5
Comment 42 Jonathan Wakely 2020-03-03 09:46:10 UTC
(In reply to fdlbxtqi from comment #39)
> Hi Jonathan. Another bug I found before and you haven't fixed:

Have you reported it?

> https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/
> stl_algobase.h#L1706
> 
> const auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;
> 
> 
> You cannot call &*__first, &*__first2 for contiguous_iterator since a lot of
> implementations of contiguous iterators will check bounds at debugging mode
> (for example MSVC's STL vector implementation). Yours will break code.

How? The iterators are only dereferenced when __len is non-zero.

And how would MSVC's vector be used with libstdc++ algorithms?

Do you have a testcase demonstrating an actual problem?

> __second is not allowed to dereference, that is UB.

What is __second?
Comment 43 Jonathan Wakely 2020-03-03 09:57:28 UTC
(In reply to fdlbxtqi from comment #39)
> const auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;

Are you mistakenly reading this as dereferencing &*begin and &*end of an iterator range? Because that's not what it does. It dereferences the begin iterator of two separate non-empty ranges.

There is no past-the-end iterator in this statement.
Comment 44 Jonathan Wakely 2020-03-03 10:13:49 UTC
(In reply to fdlbxtqi from comment #41)
> And the volatile bugs haven't fixed either.

Maybe you shouldn't use a single bug report for multiple, unrelated problems.

I've reported Bug 94013 for that because it's completely unrelated to this bug.
Comment 45 fdlbxtqi 2020-03-03 10:16:04 UTC
(In reply to Jonathan Wakely from comment #43)
> (In reply to fdlbxtqi from comment #39)
> > const auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;
> 
> Are you mistakenly reading this as dereferencing &*begin and &*end of an
> iterator range? Because that's not what it does. It dereferences the begin
> iterator of two separate non-empty ranges.
> 
> There is no past-the-end iterator in this statement.

But I still think &* is not a good idea tbh. Better just call to_address(__first) to avoid potential issues of a contiguous iterator.
Comment 46 GCC Commits 2020-09-02 14:51:00 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:2f983fa69005b603ea1758a013b4134d5b0f24a8

commit r11-2981-g2f983fa69005b603ea1758a013b4134d5b0f24a8
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Sep 2 15:17:24 2020 +0100

    libstdc++: Fix three-way comparison for std::array [PR 96851]
    
    The spaceship operator for std::array uses memcmp when the
    __is_byte<value_type> trait is true, but memcmp isn't usable in
    constexpr contexts. Also, memcmp should only be used for unsigned byte
    types, because it gives the wrong answer for signed chars with negative
    values.
    
    We can simply check std::is_constant_evaluated() so that we don't use
    memcmp during constant evaluation.
    
    To fix the problem of using memcmp for inappropriate types, this patch
    adds new __is_memcmp_ordered and __is_memcmp_ordered_with traits. These
    say whether using memcmp will give the right answer for ordering
    operations such as lexicographical_compare and three-way comparisons.
    The new traits can be used in several places, and can also be used to
    implement my suggestion in PR 93059 comment 37 to use memcmp for
    unsigned integers larger than one byte on big endian targets.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/96851
            * include/bits/cpp_type_traits.h (__is_memcmp_ordered):
            New trait that says if memcmp can be used for ordering.
            (__is_memcmp_ordered_with): Likewise, for two types.
            * include/bits/deque.tcc (__lex_cmp_dit): Use new traits
            instead of __is_byte and __numeric_traits.
            (__lexicographical_compare_aux1): Likewise.
            * include/bits/ranges_algo.h (__lexicographical_compare_fn):
            Likewise.
            * include/bits/stl_algobase.h (__lexicographical_compare_aux1)
            (__is_byte_iter): Likewise.
            * include/std/array (operator<=>): Likewise. Only use memcmp
            when std::is_constant_evaluated() is false.
            * testsuite/23_containers/array/comparison_operators/96851.cc:
            New test.
            * testsuite/23_containers/array/tuple_interface/get_neg.cc:
            Adjust dg-error line numbers.
Comment 47 Jakub Jelinek 2021-04-27 11:38:29 UTC
GCC 11.1 has been released, retargeting bugs to GCC 11.2.
Comment 48 m.cencora 2021-10-06 10:40:11 UTC
Any progress on this?

I have stumbled upon same inefficiencies when writing serialization code that copies raw bytes to unsigned char* raw memory buffer from contiguous containers (string/vector/array) with elements of any 1 byte trivially copyable type (signed char, char, enum, UDT).
Comment 49 Jonathan Wakely 2024-10-11 11:57:43 UTC
Patches posted:
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665055.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665056.html
More to follow.

_Z15copy_char_arrayPDuRKSt5arrayIcLm2EE:
.LFB5167:
	.cfi_startproc
	movzwl	(%rsi), %eax
	movw	%ax, (%rdi)
	leaq	2(%rdi), %rax
	ret
	.cfi_endproc
Comment 50 GCC Commits 2024-10-14 09:39:20 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:308d19c11e119b2c5abf67778dd0ac8a370e5df7

commit r15-4320-g308d19c11e119b2c5abf67778dd0ac8a370e5df7
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Thu Oct 10 13:36:33 2024 +0100

    libstdc++: Enable memcpy optimizations for distinct integral types [PR93059]
    
    Currently we only optimize std::copy, std::copy_n etc. to memmove when
    the source and destination types are the same. This means that we fail
    to optimize copying between distinct 1-byte types, e.g. copying from a
    buffer of unsigned char to a buffer of char8_t or vice versa.
    
    This patch adds more partial specializations of the __memcpyable trait
    so that we allow memcpy between integers of equal widths. This will
    enable memmove for copies between narrow character types and also
    between same-width types like int and unsigned.
    
    Enabling the optimization needs to be based on the width of the integer
    type, not just the size in bytes. This is because some targets define
    non-standard integral types such as __int20 in msp430, which has padding
    bits. It would not be safe to memcpy between e.g. __int20 and int32_t,
    even though sizeof(__int20) == sizeof(int32_t). A new trait is
    introduced to define the width, __memcpyable_integer, and then the
    __memcpyable trait compares the widths.
    
    It's safe to copy between signed and unsigned integers of the same
    width, because GCC only supports two's complement integers.
    
    I initially though it would be useful to define the specialization
    __memcpyable_integer<byte> to enable copying between narrow character
    types and std::byte. But that isn't possible with std::copy, because
    is_assignable<char&, std::byte> is false. Optimized copies using memmove
    will already happen for copying std::byte to std::byte, because
    __memcpyable<T*, T*> is true.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/93059
            * include/bits/cpp_type_traits.h (__memcpyable): Add partial
            specialization for pointers to distinct types.
            (__memcpyable_integer): New trait to control which types can use
            cross-type memcpy optimizations.
Comment 51 GCC Commits 2024-10-14 09:39:25 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:d8ef4471cb9c9f86784b62424a215ea42173bfe1

commit r15-4321-gd8ef4471cb9c9f86784b62424a215ea42173bfe1
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Thu Oct 10 13:36:33 2024 +0100

    libstdc++: Enable memset optimizations for distinct character types [PR93059]
    
    Currently we only optimize std::fill to memset when the source and
    destination types are the same byte-sized type. This means that we fail
    to optimize cases like std::fill(buf. buf+n, 0) because the literal 0 is
    not the same type as the character buffer.
    
    Such cases can safely be optimized to use memset, because assigning an
    int (or other integer) to a narrow character type has the same effects
    as converting the integer to unsigned char then copying it with memset.
    
    This patch enables the optimized code path when the fill character is a
    memcpy-able integer (using the new __memcpyable_integer trait). We still
    need to check is_same<U, T> to enable the memset optimization for
    filling a range of std::byte with a std::byte value, because that isn't
    a memcpyable integer.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/93059
            * include/bits/stl_algobase.h (__fill_a1(T*, T*, const T&)):
            Change template parameters and enable_if condition to allow the
            fill value to be an integer.
Comment 52 Jonathan Wakely 2024-10-14 10:08:23 UTC
Fixed for GCC 15
Comment 53 GCC Commits 2024-10-18 13:50:36 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:3abe751ea86e3472fa2c97bf2014f9f93f569019

commit r15-4473-g3abe751ea86e3472fa2c97bf2014f9f93f569019
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Oct 9 12:55:54 2024 +0100

    libstdc++: Refactor std::uninitialized_{copy,fill,fill_n} algos [PR68350]
    
    This refactors the std::uninitialized_copy, std::uninitialized_fill and
    std::uninitialized_fill_n algorithms to directly perform memcpy/memset
    optimizations instead of dispatching to std::copy/std::fill/std::fill_n.
    
    The reasons for this are:
    
    - Use 'if constexpr' to simplify and optimize compilation throughput, so
      dispatching to specialized class templates is only needed for C++98
      mode.
    - Use memcpy instead of memmove, because the conditions on
      non-overlapping ranges are stronger for std::uninitialized_copy than
      for std::copy. Using memcpy might be a minor optimization.
    - No special case for creating a range of one element, which std::copy
      needs to deal with (see PR libstdc++/108846). The uninitialized algos
      create new objects, which reuses storage and is allowed to clobber
      tail padding.
    - Relax the conditions for using memcpy/memset, because the C++20 rules
      on implicit-lifetime types mean that we can rely on memcpy to begin
      lifetimes of trivially copyable types.  We don't need to require
      trivially default constructible, so don't need to limit the
      optimization to trivial types. See PR 68350 for more details.
    - Remove the dependency on std::copy and std::fill. This should mean
      that stl_uninitialized.h no longer needs to include all of
      stl_algobase.h.  This isn't quite true yet, because we still use
      std::fill in __uninitialized_default and still use std::fill_n in
      __uninitialized_default_n. That will be fixed later.
    
    Several tests need changes to the diagnostics matched by dg-error
    because we no longer use the __constructible() function that had a
    static assert in. Now we just get straightforward errors for attempting
    to use a deleted constructor.
    
    Two tests needed more signficant changes to the actual expected results
    of executing the tests, because they were checking for old behaviour
    which was incorrect according to the standard.
    20_util/specialized_algorithms/uninitialized_copy/64476.cc was expecting
    std::copy to be used for a call to std::uninitialized_copy involving two
    trivially copyable types. That was incorrect behaviour, because a
    non-trivial constructor should have been used, but using std::copy used
    trivial default initialization followed by assignment.
    20_util/specialized_algorithms/uninitialized_fill_n/sizes.cc was testing
    the behaviour with a non-integral Size passed to uninitialized_fill_n,
    but I wrote the test looking at the requirements of uninitialized_copy_n
    which are not the same as uninitialized_fill_n. The former uses --n and
    tests n > 0, but the latter just tests n-- (which will never be false
    for a floating-point value with a fractional part).
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/68350
            PR libstdc++/93059
            * include/bits/stl_uninitialized.h (__check_constructible)
            (_GLIBCXX_USE_ASSIGN_FOR_INIT): Remove.
            [C++98] (__unwrappable_niter): New trait.
            (__uninitialized_copy<true>): Replace use of std::copy.
            (uninitialized_copy): Fix Doxygen comments. Open-code memcpy
            optimization for C++11 and later.
            (__uninitialized_fill<true>): Replace use of std::fill.
            (uninitialized_fill): Fix Doxygen comments. Open-code memset
            optimization for C++11 and later.
            (__uninitialized_fill_n<true>): Replace use of std::fill_n.
            (uninitialized_fill_n): Fix Doxygen comments. Open-code memset
            optimization for C++11 and later.
            * testsuite/20_util/specialized_algorithms/uninitialized_copy/64476.cc:
            Adjust expected behaviour to match what the standard specifies.
            * testsuite/20_util/specialized_algorithms/uninitialized_fill_n/sizes.cc:
            Likewise.
            * testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
            Adjust dg-error directives.
            * testsuite/20_util/specialized_algorithms/uninitialized_copy/89164.cc:
            Likewise.
            * testsuite/20_util/specialized_algorithms/uninitialized_copy_n/89164.cc:
            Likewise.
            * testsuite/20_util/specialized_algorithms/uninitialized_fill/89164.cc:
            Likewise.
            * testsuite/20_util/specialized_algorithms/uninitialized_fill_n/89164.cc:
            Likewise.
            * testsuite/23_containers/vector/cons/89164.cc: Likewise.
            * testsuite/23_containers/vector/cons/89164_c++17.cc: Likewise.
    
    Reviewed-by: Patrick Palka <ppalka@redhat.com>
Comment 54 GCC Commits 2025-02-25 22:35:12 UTC
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:2256e30874af2ef804bb19d2eba40f9c92953beb

commit r15-7706-g2256e30874af2ef804bb19d2eba40f9c92953beb
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Tue Feb 25 18:06:46 2025 +0000

    libstdc++: Fix typo in std::fill SFINAE constraint [PR93059]
    
    The r15-4321-gd8ef4471cb9c9f change incorrectly used __value as the
    member of the __memcpyable_integer trait, but it should have been
    __width. That meant this overload was not being used for _Tp != _Up.
    
    Also return after doing the loop for the consteval case. The missing
    return wasn't causing incorrect behaviour because the consteval loop
    increments the iterator until it equals the end of the range, so the
    memset isn't done.  But it's still better to return and not even try
    to do the memset.
    
    libstdc++-v3/ChangeLog:
    
            PR libstdc++/93059
            * include/bits/stl_algobase.h (__fill_a1): Fix typo in SFINAE
            constraint.