This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] Optimize std::sub_match comparisons using string_view-like type


Avoid creation of unnecessary basic_string objects by using a simplified
string_view type and performing comparisons on that type instead. A
temporary basic_string object is still used when the sub_match's
iterators are not contiguous, in order to get an object that the
__string_view can reference.
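
As a rough illustration of the idea (this is not the code in the patch), a
non-owning view can be as small as a pointer plus a length, with a compare
function that follows the sign convention of basic_string::compare. The
names simple_view and make_view below are hypothetical and exist only for
this sketch; the actual helper in the patch is sub_match::__string_view.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical minimal view: a pointer plus a length, with no ownership.
template <typename CharT, typename Traits = std::char_traits<CharT>>
struct simple_view {
    const CharT* data;
    std::size_t  len;

    // Three-way comparison: negative, zero, or positive, like
    // basic_string::compare.
    int compare(simple_view other) const {
        const std::size_t n = std::min(len, other.len);
        if (n != 0) {
            const int r = Traits::compare(data, other.data, n);
            if (r != 0)
                return r;
        }
        if (len == other.len)
            return 0;
        return len < other.len ? -1 : 1;
    }
};

// View over a contiguous [first, last) range given as pointers.
template <typename CharT>
simple_view<CharT> make_view(const CharT* first, const CharT* last) {
    return simple_view<CharT>{first, static_cast<std::size_t>(last - first)};
}

// View over an existing string; the characters are not copied.
template <typename CharT, typename Traits, typename Alloc>
simple_view<CharT, Traits>
make_view(const std::basic_string<CharT, Traits, Alloc>& s) {
    return simple_view<CharT, Traits>{s.data(), s.size()};
}
```

With these helpers, make_view(p, p + n).compare(make_view(str)) compares a
matched range against a string without constructing a second string.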

	* include/bits/regex.h (sub_match::operator string_type): Call str().
	(sub_match::compare): Use _M_str() instead of str().
	(sub_match::_M_compare): New public function.
	(sub_match::__string_view): New helper type.
	(sub_match::_M_str): New overloaded functions to avoid creating a
	string_type object when not needed.
	(operator==, operator!=, operator<, operator>, operator<=, operator>=):
	Use sub_match::_M_compare instead of creating string_type objects.
	Fix Doxygen comments.
	* include/bits/regex_compiler.h (__has_contiguous_iter): Remove.
	(__is_contiguous_normal_iter): Rename to __is_contiguous_iter and
	simplify.
	(__enable_if_contiguous_iter, __disable_if_contiguous_iter): Use
	__enable_if_t.
	* include/std/type_traits (__enable_if_t): Define for C++11.
	* testsuite/28_regex/sub_match/compare.cc: New.
	* testsuite/util/testsuite_iterators.h (remove_cv): Add transformation
	trait.
	(input_iterator_wrapper): Use remove_cv for value_type argument of
	std::iterator base class.

We could avoid temporary string_type objects even for non-contiguous
iterators by writing a manual "compare" function that uses
char_traits::lt on every element in the iterator ranges. It's not
obvious that would actually be an optimization (and I haven't measured
it). For sub-expression matches that fit in an SSO buffer, creating a
string doesn't allocate, and once we have a std::string or
std::wstring we benefit from highly optimized strncmp or wcsncmp
implementations in glibc. Maybe we could use an adaptive approach
where we create a string when value_type is char or wchar_t,
_BiIter is random-access, and std::distance(first, second) <= SSO size,
and otherwise use a manual loop using char_traits::lt. I don't plan
to work on that, because the common cases are all likely to be using
pointers or basic_string::const_iterator and so are already optimized
by this patch.

Tested powerpc64le-linux, committed to trunk.


Attachment: patch.txt
Description: Text document

