[Bug libstdc++/85824] regex constructor crashes under UTF-8 locale on Solaris SPARC when parsing a simple character class
redi at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Fri May 18 12:06:00 GMT 2018
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85824
Jonathan Wakely <redi at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timshen at gcc dot gnu.org
--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Wanying Luo from comment #0)
> When _M_transform() calls strxfrm() and gets -1 when converting 0x80 under
> the UTF-8 locale on Solaris SPARC, it simply assigns -1 to __res of type
> size_t which creates a very large number. This causes __ret.append(__c,
> __res) to crash. I think it would be nice if the code checks errno and
> issues a better error message than the one above.
N.B. it doesn't just crash, it throws an exception because it can't append
4294967295 bytes to a std::string. Any fix to check errno in
collate<char>::do_transform is still going to involve throwing an exception,
just a slightly different one.
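
To make the failure mode concrete, here is a minimal sketch (not the libstdc++ code) of what happens when a -1 error return is stored in a size_t and then used as an append count. The helper names wrapped_error_value and append_is_rejected are hypothetical. The value 4294967295 quoted above corresponds to a 32-bit size_t; with a 64-bit size_t the wrapped value is larger, but the outcome is the same.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper: what a -1 error return becomes once it is
// stored in an unsigned size_t (it wraps to the maximum value).
std::size_t wrapped_error_value() {
    return static_cast<std::size_t>(-1);
}

// Hypothetical helper: returns true if std::string refuses to append
// `count` characters by throwing std::length_error.
bool append_is_rejected(std::size_t count) {
    std::string ret;
    try {
        ret.append(count, 'x');
    } catch (const std::length_error&) {
        return true;  // count exceeds max_size(), so append throws
    }
    return false;
}
```

Calling append_is_rejected(wrapped_error_value()) should report an exception rather than a crash, which matches the behaviour described above.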
The real problem is that std::regex wants to build a cache of every value from
CHAR_MIN to CHAR_MAX, to decide if it matches the bracket expression "[0-9]".
If calling strxfrm on any 8-bit char value produces an error then we're going
to get an exception. I think something in the regex compiler (maybe in
transform_primary) needs to handle those exceptions, either by deciding that
characters which produce errors do not match, or maybe by disabling the cache.
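
As a rough illustration of the first option (treat characters whose transform fails as non-matching), the sketch below builds a per-char cache for a range such as [0-9] using std::collate<char>::transform. This is not how the libstdc++ regex compiler is written; build_range_cache and its parameters are hypothetical, and catching std::exception is an assumption about which errors should be swallowed.

```cpp
#include <bitset>
#include <climits>
#include <exception>
#include <locale>
#include <string>

// Hypothetical sketch: for every char value, record whether its collation
// key falls between the keys of `lo` and `hi`. A char whose transform
// throws is simply recorded as "does not match".
std::bitset<256> build_range_cache(const std::locale& loc, char lo, char hi) {
    const std::collate<char>& coll = std::use_facet<std::collate<char>>(loc);
    auto key = [&coll](char c) { return coll.transform(&c, &c + 1); };

    const std::string key_lo = key(lo);
    const std::string key_hi = key(hi);

    std::bitset<256> cache;
    for (int i = CHAR_MIN; i <= CHAR_MAX; ++i) {
        const char c = static_cast<char>(i);
        try {
            const std::string k = key(c);
            cache[static_cast<unsigned char>(c)] = (key_lo <= k && k <= key_hi);
        } catch (const std::exception&) {
            // Assumption: a failed transform means "no match"; the bit stays 0.
        }
    }
    return cache;
}
```

With the classic "C" locale, build_range_cache(std::locale::classic(), '0', '9') is expected to mark only the ten digits.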
Tim, I'll take care of checking errno in collate<>::_M_transform but could you
advise what to do about the regex compiler? Maybe:
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -257,7 +257,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
          const __ctype_type& __fctyp(use_facet<__ctype_type>(_M_locale));
          std::vector<char_type> __s(__first, __last);
          __fctyp.tolower(__s.data(), __s.data() + __s.size());
-         return this->transform(__s.data(), __s.data() + __s.size());
+         __try {
+           return this->transform(__s.data(), __s.data() + __s.size());
+         } catch(const std::runtime_error&) {
+           return string_type();
+         }
        }
       /**
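
For the errno check in collate<>::_M_transform mentioned above, a standalone sketch of the idea might look like the following. It is not the libstdc++ change; checked_strxfrm is a hypothetical wrapper, and it assumes the platform reports strxfrm failures through errno (the function has no reserved error return, so errno is cleared before each call and inspected afterwards).

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <limits>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical wrapper: transform `s` with strxfrm and throw a
// std::runtime_error instead of passing a bogus length to the caller.
std::string checked_strxfrm(const char* s) {
    // First call with a zero-sized buffer only queries the needed length.
    errno = 0;
    const std::size_t needed = std::strxfrm(nullptr, s, 0);
    if (errno != 0 || needed == std::numeric_limits<std::size_t>::max())
        throw std::runtime_error("strxfrm failed while measuring");

    std::vector<char> buf(needed + 1);  // room for the terminating NUL
    errno = 0;
    const std::size_t written = std::strxfrm(buf.data(), s, buf.size());
    if (errno != 0 || written >= buf.size())
        throw std::runtime_error("strxfrm failed while transforming");

    return std::string(buf.data(), written);
}
```

As noted earlier, this still ends in an exception for a character such as 0x80 on the affected platform; the difference is only that the exception names the real cause.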