Bug 67015 - "^[a-z0-9][a-z0-9-]*$", std::regex::extended is miscompiled
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++
Version: 5.2.1
Importance: P3 normal
Target Milestone: 4.9.4
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-07-26 08:40 UTC by Matthias Klose
Modified: 2015-08-05 04:39 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail: 4.9.3, 5.2.1, 6.0
Last reconfirmed:


Attachments
regex.cc (446 bytes, text/x-csrc)
2015-07-26 08:40 UTC, Matthias Klose
regex2.cc (431 bytes, text/x-csrc)
2015-07-26 08:41 UTC, Matthias Klose

Description Matthias Klose 2015-07-26 08:40:20 UTC
Created attachment 36055
regex.cc

[forwarded from https://bugs.debian.org/778112]

This regex is failing:

  std::regex("^[a-z0-9][a-z0-9-]*$", std::regex::extended);

however this one works:

  std::regex("^[a-z0-9][-a-z0-9]*$", std::regex::extended); 

I'm unable to reproduce the claim in the original bug report with regex2.cc.
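
The attached regex.cc is not reproduced in this report, so the following is only a minimal sketch of how the two patterns above can be compared. The helper name `matches_extended` is hypothetical; the subject string "test" is taken from the assertion message in the output below. On an affected libstdc++ the first pattern would be expected to throw std::regex_error during construction, which this helper reports as `false`.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not taken from regex.cc): returns true if `pattern`
// compiles as a POSIX extended regex and matches all of `subject`.
// Returns false if constructing the regex throws std::regex_error.
bool matches_extended(const std::string& pattern, const std::string& subject) {
    try {
        std::regex re(pattern, std::regex::extended);
        return std::regex_match(subject, re);
    } catch (const std::regex_error&) {
        return false;
    }
}
```

Calling `matches_extended("^[a-z0-9][a-z0-9-]*$", "test")` and `matches_extended("^[a-z0-9][-a-z0-9]*$", "test")` should give the same result on a conforming implementation, since a '-' at either edge of a bracket expression is a literal.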

$ cat run.sh 
g++-4.8 -static-libstdc++ -std=c++0x regex.cc && ./a.out
g++-4.9 -static-libstdc++ -std=c++0x regex.cc && ./a.out
g++-5 -static-libstdc++ -std=c++0x regex.cc && ./a.out
/usr/lib/gcc-snapshot/bin/g++ -static-libstdc++ -std=c++0x regex.cc && ./a.out
clang++-3.5 -stdlib=libc++ -std=c++0x regex.cc && ./a.out
clang++-3.6 -stdlib=libc++ -std=c++0x regex.cc && ./a.out

g++-4.8 -static-libstdc++ -std=c++0x regex2.cc && ./a.out
g++-4.9 -static-libstdc++ -std=c++0x regex2.cc && ./a.out
g++-5 -static-libstdc++ -std=c++0x regex2.cc && ./a.out
/usr/lib/gcc-snapshot/bin/g++ -static-libstdc++ -std=c++0x regex2.cc && ./a.out
clang++-3.5 -stdlib=libc++ -std=c++0x regex2.cc && ./a.out
clang++-3.6 -stdlib=libc++ -std=c++0x regex2.cc && ./a.out

+ g++-4.8 -static-libstdc++ -std=c++0x regex.cc
+ ./a.out
E0: regex_error
a.out: regex.cc:24: int main(): Assertion `std::regex_match("test", debian_cron_namespace)' failed.
Aborted (core dumped)
+ g++-4.9 -static-libstdc++ -std=c++0x regex.cc
+ ./a.out
E1: regex_error
+ g++-5 -static-libstdc++ -std=c++0x regex.cc
+ ./a.out
E1: regex_error
+ /usr/lib/gcc-snapshot/bin/g++ -static-libstdc++ -std=c++0x regex.cc
+ ./a.out
E1: regex_error
+ clang++-3.5 -stdlib=libc++ -std=c++0x regex.cc
+ ./a.out
+ clang++-3.6 -stdlib=libc++ -std=c++0x regex.cc
+ ./a.out
+ g++-4.8 -static-libstdc++ -std=c++0x regex2.cc
+ ./a.out
E0: regex_error
E1: regex_error
E2: regex_error
+ g++-4.9 -static-libstdc++ -std=c++0x regex2.cc
+ ./a.out
+ g++-5 -static-libstdc++ -std=c++0x regex2.cc
+ ./a.out
+ /usr/lib/gcc-snapshot/bin/g++ -static-libstdc++ -std=c++0x regex2.cc
+ ./a.out
+ clang++-3.5 -stdlib=libc++ -std=c++0x regex2.cc
+ ./a.out
+ clang++-3.6 -stdlib=libc++ -std=c++0x regex2.cc
+ ./a.out
Comment 1 Matthias Klose 2015-07-26 08:41:48 UTC
Created attachment 36056
regex2.cc
Comment 2 Roger Leigh 2015-07-26 10:05:37 UTC
Note that regex2.cc fails with 4.9.2 but not with 4.9.3 or 5.1, so this appears to have been fixed.
Comment 4 Tim Shen 2015-07-29 03:46:08 UTC
Author: timshen
Date: Wed Jul 29 03:45:35 2015
New Revision: 226336

URL: https://gcc.gnu.org/viewcvs?rev=226336&root=gcc&view=rev
Log:
	PR libstdc++/67015
	* include/bits/regex_compiler.h (_Compiler<>::_M_expression_term,
	_BracketMatcher<>::_M_add_collating_element): Change signature
	to make checking the end of bracket expression easier.
	* include/bits/regex_compiler.tcc (_Compiler<>::_M_expression_term):
	Treat '-' as a valid literal if it's at the end of bracket expression.
	* testsuite/28_regex/algorithms/regex_match/cstring_bracket_01.cc:
	New testcases.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/include/bits/regex_compiler.h
    trunk/libstdc++-v3/include/bits/regex_compiler.tcc
    trunk/libstdc++-v3/testsuite/28_regex/algorithms/regex_match/cstring_bracket_01.cc
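
The rule named in the log above ("treat '-' as a valid literal if it's at the end of bracket expression") can be illustrated with a small standalone sketch. This is not the libstdc++ implementation: `expand_bracket_body` is a hypothetical helper that handles only plain characters and `x-y` ranges (no negation, character classes or collating elements), and rejecting a stray '-' in the middle is a simplification chosen for this example.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical helper, not libstdc++ code: expands the text between '[' and
// ']' into the set of characters it would match.
std::set<char> expand_bracket_body(const std::string& body) {
    std::set<char> out;
    for (std::size_t i = 0; i < body.size(); ++i) {
        const char c = body[i];
        const bool at_start = (i == 0);
        const bool at_end = (i + 1 == body.size());
        if (c == '-') {
            // A dash at either edge of the bracket expression is a literal.
            if (at_start || at_end) {
                out.insert('-');
                continue;
            }
            // Simplification for this sketch: reject any other stray dash.
            throw std::invalid_argument("unexpected '-' in bracket expression");
        }
        // "x-y" is a range only if something follows the dash; otherwise the
        // dash is the trailing literal handled above on the next iteration.
        if (i + 2 < body.size() && body[i + 1] == '-') {
            const int lo = static_cast<unsigned char>(c);
            const int hi = static_cast<unsigned char>(body[i + 2]);
            if (hi < lo) throw std::invalid_argument("invalid range");
            for (int ch = lo; ch <= hi; ++ch) out.insert(static_cast<char>(ch));
            i += 2;  // skip the dash and the range end
            continue;
        }
        out.insert(c);
    }
    return out;
}
```

Under these assumptions, `expand_bracket_body("a-z0-9-")` and `expand_bracket_body("-a-z0-9")` produce the same set, which is the equivalence the reporter expected from the two regexes.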
Comment 5 Tim Shen 2015-07-29 04:30:57 UTC
Author: timshen
Date: Wed Jul 29 04:30:25 2015
New Revision: 226337

URL: https://gcc.gnu.org/viewcvs?rev=226337&root=gcc&view=rev
Log:
	Backport from mainline
	2015-07-29  Tim Shen  <timshen@google.com>

	PR libstdc++/67015
	* include/bits/regex_compiler.h (_Compiler<>::_M_expression_term,
	_BracketMatcher<>::_M_add_collating_element): Change signature
	to make checking the end of bracket expression easier.
	* include/bits/regex_compiler.tcc (_Compiler<>::_M_expression_term):
	Treat '-' as a valid literal if it's at the end of bracket expression.
	* testsuite/28_regex/algorithms/regex_match/cstring_bracket_01.cc:
	New testcases.

Modified:
    branches/gcc-5-branch/libstdc++-v3/ChangeLog
    branches/gcc-5-branch/libstdc++-v3/include/bits/regex_compiler.h
    branches/gcc-5-branch/libstdc++-v3/include/bits/regex_compiler.tcc
    branches/gcc-5-branch/libstdc++-v3/testsuite/28_regex/algorithms/regex_match/cstring_bracket_01.cc
Comment 6 Tim Shen 2015-07-29 04:45:15 UTC
Fixed in trunk and gcc5.

Mark as fixed.
Comment 7 Ville Voutilainen 2015-07-29 09:34:43 UTC
I think it would be a good idea to backport this fix to 4.9 as well, that
version will be in use out in the wild for quite some time. I'm not hell-bent
on having such a backport, I just think it would seem reasonable.
Comment 8 Tim Shen 2015-07-29 21:26:37 UTC
(In reply to Ville Voutilainen from comment #7)
> I think it would be a good idea to backport this fix to 4.9 as well, that
> version will be in use out in the wild for quite some time. I'm not hell-bent
> on having such a backport, I just think it would seem reasonable.

Well we've decided not to do so... https://gcc.gnu.org/ml/libstdc++/2015-07/msg00083.html

But to be honest I have no idea of the user side. So I'll leave this to Jonathan. BTW, backporting takes trivial effort.
Comment 9 Ville Voutilainen 2015-07-29 23:58:45 UTC
(In reply to Tim Shen from comment #8)
> Well we've decided not to do so...
> https://gcc.gnu.org/ml/libstdc++/2015-07/msg00083.html

Yes, I read that; that's why I commented on this bug. I'd like to
change Jonathan's mind about it. :)
Comment 10 Alex Turbov 2015-07-31 22:31:33 UTC
A few more links to consider regarding a backport to 4.9:

https://bugs.gentoo.org/show_bug.cgi?id=555648
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=778112
Comment 11 Jonathan Wakely 2015-08-04 08:57:37 UTC
OK, let's backport to 4.9 too.
Comment 12 Tim Shen 2015-08-05 04:39:55 UTC
Author: timshen
Date: Wed Aug  5 04:39:23 2015
New Revision: 226607

URL: https://gcc.gnu.org/viewcvs?rev=226607&root=gcc&view=rev
Log:
	Backported from mainline
	2015-07-29  Tim Shen  <timshen@google.com>

	PR libstdc++/67015
	* include/bits/regex_compiler.h (_Compiler<>::_M_expression_term,
	_BracketMatcher<>::_M_add_collating_element): Change signature
	to make checking the end of bracket expression easier.
	* include/bits/regex_compiler.tcc (_Compiler<>::_M_expression_term):
	Treat '-' as a valid literal if it's at the end of bracket expression.
	* testsuite/28_regex/algorithms/regex_match/cstring_bracket_01.cc:
	New testcases.

Modified:
    branches/gcc-4_9-branch/libstdc++-v3/ChangeLog
    branches/gcc-4_9-branch/libstdc++-v3/include/bits/regex_compiler.h
    branches/gcc-4_9-branch/libstdc++-v3/include/bits/regex_compiler.tcc
    branches/gcc-4_9-branch/libstdc++-v3/testsuite/28_regex/algorithms/regex_match/cstring_bracket_01.cc