Bug 58576 - std::regex_match() reports mismatched braces on a valid regex
Summary: std::regex_match() reports mismatched braces on a valid regex
Status: RESOLVED DUPLICATE of bug 53631
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++
Version: 4.8.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-30 03:06 UTC by Galen G Brownsmith
Modified: 2013-10-01 15:26 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
Archive containing the g++ -v -save-temps compile log, the generated .ii file and the original .cpp with the minimum-to-reproduce test case. (210.10 KB, application/x-gzip)
2013-09-30 03:06 UTC, Galen G Brownsmith
re-uploading my tar.gz -- fixed an independent but potentially distracting typo. (210.09 KB, application/octet-stream)
2013-09-30 03:14 UTC, Galen G Brownsmith

Description Galen G Brownsmith 2013-09-30 03:06:03 UTC
Created attachment 30929
Archive containing the g++ -v -save-temps compile log, the generated .ii file and the original .cpp with the minimum-to-reproduce test case.

I attempted to use a regex to validate qualified hostnames.

However, when I use the regex from this thread ( http://stackoverflow.com/questions/1418423/the-hostname-regex ), with or without replacing [0-9A-Za-z] with [:alnum:] (and with the backslashes properly escaped), the std::regex_match() call throws a regex_error exception whose code() is regex_constants::error_brack.


Using: An unmodified copy of gcc 4.8.1 20130603 from the Fedora 19 primary repository (rpm ver: 4.8.1-1.fc19 )

(I use 4-spaces-per-tab in my source code, which doesn't affect the code itself, but might make the hand-tracing of parens, braces, and brackets that I did in the comments easier to follow.)

(And, yes, I know replacing [0-9A-Za-z] with [:alnum:] isn't a legitimate change WRT domain name validity, unless I force a 'C' locale.  It was just easier to read when hunting down this issue.)
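
For anyone trying to reproduce this, here is a minimal sketch of how the exception and its code() can be captured. The helper try_match and the struct RegexOutcome are hypothetical names, not part of the attached test case, and the hostname regex itself is not reproduced here. On an implementation with a working <regex>, an unterminated bracket expression such as "[abc" should throw, and the stored code can then be compared against regex_constants::error_brack.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the attached test case): compiles `pattern`,
// runs std::regex_match on `subject`, and records any regex_error code.
struct RegexOutcome {
    bool threw;                             // true if std::regex_error was caught
    std::regex_constants::error_type code;  // meaningful only when threw is true
    bool matched;                           // meaningful only when threw is false
};

RegexOutcome try_match(const std::string& pattern, const std::string& subject) {
    RegexOutcome out = {false, std::regex_constants::error_type(), false};
    try {
        std::regex re(pattern);                       // may throw on a bad pattern
        out.matched = std::regex_match(subject, re);  // the call named in this report
    } catch (const std::regex_error& e) {
        out.threw = true;
        out.code = e.code();
    }
    return out;
}
```

Calling try_match with the pattern from the report and a sample hostname, then checking out.threw and out.code, separates "the pattern was rejected" from "the pattern compiled but the input did not match".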
Comment 1 Galen G Brownsmith 2013-09-30 03:14:31 UTC
Created attachment 30930
re-uploading my tar.gz -- fixed an independent but potentially distracting typo.

There was a case where part of the regex read "(?(?:" rather than "(?:(?:".  Fixed that, behavior remains.
Comment 2 Andreas Schwab 2013-09-30 07:40:28 UTC
[:alnum:] only matches the six characters ":almnu".  If you want to match any letter or digit you have to write [[:alnum:]].
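
A small sketch of the difference, assuming a standard library with a working <regex> (so not GCC 4.8). The function names are illustrative only. "[:alnum:]+" is an ordinary bracket expression listing the characters ':', 'a', 'l', 'n', 'u', 'm', while "[[:alnum:]]+" puts the named character class inside a bracket expression.

```cpp
#include <regex>
#include <string>

// Illustrative helper: a plain bracket expression containing only the
// characters ':', 'a', 'l', 'n', 'u', 'm'.
bool matches_char_list(const std::string& s) {
    static const std::regex re("[:alnum:]+");
    return std::regex_match(s, re);
}

// Illustrative helper: the alnum character class, nested inside a
// bracket expression as the grammar requires.
bool matches_alnum_class(const std::string& s) {
    static const std::regex re("[[:alnum:]]+");
    return std::regex_match(s, re);
}
```

With these, a string such as "manual" (built only from the listed letters) satisfies both functions, whereas "host9" satisfies only matches_alnum_class and ":" satisfies only matches_char_list.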
Comment 3 Jonathan Wakely 2013-09-30 09:46:00 UTC
<regex> doesn't work in GCC 4.8; the implementation in that release is incomplete.

Your test works with the GCC trunk, where the work to implement regex is being done.

*** This bug has been marked as a duplicate of bug 53631 ***
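
For code that still has to build with GCC 4.8, one possible workaround is to guard the <regex> use behind a compiler version check and fall back to a hand-written test. This is only a sketch under two assumptions: that the trunk work mentioned above corresponds to the GCC 4.9 release series, and that the compiler version is a good enough proxy for the libstdc++ version (it is not when another compiler is paired with an old libstdc++). The macro HOSTCHECK_HAVE_WORKING_REGEX and the function is_alnum_label are made-up names.

```cpp
#include <cctype>
#include <regex>
#include <string>

// Assumption: GCC releases before 4.9 ship an unusable <regex>.
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 9))
#  define HOSTCHECK_HAVE_WORKING_REGEX 0
#else
#  define HOSTCHECK_HAVE_WORKING_REGEX 1
#endif

// Returns true if `s` is non-empty and consists only of letters and digits.
bool is_alnum_label(const std::string& s) {
#if HOSTCHECK_HAVE_WORKING_REGEX
    static const std::regex re("[[:alnum:]]+");
    return std::regex_match(s, re);
#else
    // Fallback without <regex>: check each character by hand.
    if (s.empty()) return false;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (!std::isalnum(static_cast<unsigned char>(s[i]))) return false;
    return true;
#endif
}
```

Both branches depend on the active locale, which ties in with the remark in the description about forcing the 'C' locale.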
Comment 4 Tim Shen 2013-10-01 15:26:52 UTC
Author: timshen
Date: Tue Oct  1 15:26:50 2013
New Revision: 203067

URL: http://gcc.gnu.org/viewcvs?rev=203067&root=gcc&view=rev
Log:
2013-10-01  Tim Shen  <timshen91@gmail.com>

	PR libstdc++/58576
	* include/bits/regex_automaton.tcc (_NFA<>::_M_eliminate_dummy)
	(_StateSeq<>::_M_clone): Add _S_opcode_subexpr_lookahead branch.
	* testsuite/28_regex/algorithms/regex_match/ecma/char/58576.cc: New.

Added:
    trunk/libstdc++-v3/testsuite/28_regex/algorithms/regex_match/ecma/char/58576.cc
Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/include/bits/regex_automaton.tcc