This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Support for C++0x and C1x u8 string literals and raw string literals
- From: Jakub Jelinek <jakub at redhat dot com>
- To: "Joseph S. Myers" <joseph at codesourcery dot com>
- Cc: Tom Tromey <tromey at redhat dot com>, Jason Merrill <jason at redhat dot com>, gcc-patches at gcc dot gnu dot org, Kris Van Hees <kris dot van dot hees at oracle dot com>, Ulrich Drepper <drepper at redhat dot com>
- Date: Fri, 12 Sep 2008 15:19:51 -0400
- Subject: Re: [PATCH] Support for C++0x and C1x u8 string literals and raw string literals
- References: <20080912132007.GA9666@hs20-bc2-1.build.redhat.com> <Pine.LNX.4.64.0809121541540.17535@digraph.polyomino.org.uk>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Fri, Sep 12, 2008 at 03:56:38PM +0000, Joseph S. Myers wrote:
> > be really pedantic and accept only basic source charset character except
> > the listed 7, rather than say all characters except the listed 7
> > plus maybe disallowing '\0', as this is a new feature I think being
> > pedantic doesn't hurt. In one of the raw string papers floating
> > around there was an example using R"@[...]@" which is not pedantically
> > valid, as @ is not basic source charset character. u8 string
>
> But that example is conditionally valid in C++ only, although not in C,
> because in phase 1 @ will have been converted to a UCN (part of the
> existing C++98 semantics we don't implement). The validity is only
> conditional because there is no requirement to use the same UCN for each
> instance of @.
UCNs aren't valid in d-char-sequence though, only in normal strings and within
r-char-sequence.

raw-string:
    " d-char-sequence(opt) [ r-char-sequence(opt) ] d-char-sequence(opt) "

r-char-sequence:
    r-char
    r-char-sequence r-char

r-char:
    any member of the source character set, except
      (1), a backslash \ followed by a u or U, or
      (2), a right square bracket ] followed by the initial d-char-sequence
           (which may be empty) followed by a double quote ".
    universal-character-name

d-char-sequence:
    d-char
    d-char-sequence d-char

d-char:
    any member of the basic source character set except:
    space, the left square bracket [, the right square bracket ],
    and the control characters representing horizontal tab,
    vertical tab, form feed, and newline.

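
To make the pedantic reading of d-char concrete, here is a rough sketch in C of the check implied by the grammar above. It is only an illustration, not the code from the patch: the function names (is_basic_source_char, is_valid_d_char, is_valid_d_char_sequence) are made up for this example, and it assumes an ASCII-compatible execution environment in which the basic source character set is the one listed in C++03 [lex.charset].

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if C belongs to the basic source character
   set, i.e. space, HT, VT, FF, newline, or one of the 91 graphic
   characters.  Note that '@', '$' and '`' are not members.  */
bool is_basic_source_char(char c)
{
    static const char graphic[] =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789"
        "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";

    if (c == '\0')
        return false;   /* strchr would otherwise match the terminator */
    if (c == ' ' || c == '\t' || c == '\v' || c == '\f' || c == '\n')
        return true;
    return strchr(graphic, c) != NULL;
}

/* Hypothetical helper: the pedantic d-char rule, i.e. a basic source
   character other than the 7 excluded ones.  */
bool is_valid_d_char(char c)
{
    if (!is_basic_source_char(c))
        return false;
    switch (c) {
    case ' ': case '[': case ']':
    case '\t': case '\v': case '\f': case '\n':
        return false;
    default:
        return true;
    }
}

/* Hypothetical helper: a delimiter is a possibly empty run of d-chars.  */
bool is_valid_d_char_sequence(const char *delim)
{
    for (; *delim != '\0'; delim++)
        if (!is_valid_d_char(*delim))
            return false;
    return true;
}
```

With this sketch, is_valid_d_char_sequence("") and is_valid_d_char_sequence("delim") both hold, while is_valid_d_char_sequence("@") does not, which matches the R"@[...]@" example being rejected under the pedantic rule.
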
Jakub