This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Support for C++0x and C1x u8 string literals and raw string literals
On Fri, Sep 12, 2008 at 07:53:04PM +0000, Joseph S. Myers wrote:
> On Fri, 12 Sep 2008, Jakub Jelinek wrote:
>
> > UCNs aren't valid in d-char-sequence though, only in normal strings and within
> > r-char-sequence.
>
> However, backslash is valid in d-char-sequence, and so are all the other
> characters making up UCNs. The way I read N2723 is that in phase 1 each @
> character is converted to either \u0040 or \U00000040, then in phase 3
> that sequence of characters may end up being interpreted as something
> other than a UCN. If a sequence matching UCN syntax is produced by
> deleting backslash-newline, that's undefined in C++ (unlike in C), but
> what's not mentioned as undefined is a UCN from phase 1 not being
> interpreted as a UCN in phase 3 - whether through being in a
> d-char-sequence or for any other reason. Certainly writing \u0040
> directly in a d-char-sequence would appear to be valid.
If it is up to the implementation to choose between \u0040 and \U00000040,
then writing R"@@[]@@"; would be either valid or invalid, depending on which
form the implementation substituted for each @: with \U00000040 the delimiter
becomes 2 x 10 = 20 characters, more than the 16-character limit for a
d-char-sequence, while with \u0040 it is only 2 x 6 = 12 characters and fits.
Jakub
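
The length arithmetic in the reply above can be checked with a small sketch. It assumes, as the message states, a 16-character limit on the d-char-sequence and that every @ in the delimiter is replaced by one UCN spelling. The names expanded_length, fits_limit and the constants are hypothetical helpers for illustration only; nothing here is taken from GCC or libcpp.

```cpp
#include <cstddef>

// Limit on the d-char-sequence length, as cited in the message (assumption
// taken from the thread, not looked up in any implementation).
constexpr std::size_t kMaxDelimiterLength = 16;

// Spelling lengths of the two UCN forms for '@':
// short form = backslash, 'u', 4 hex digits; long form = backslash, 'U', 8 hex digits.
constexpr std::size_t kShortUcnLength = 6;
constexpr std::size_t kLongUcnLength = 10;

// Hypothetical helper: length of a delimiter after every '@' in it has been
// replaced by a UCN whose spelling is ucn_length characters long.
constexpr std::size_t expanded_length(const char* delimiter, std::size_t ucn_length) {
    std::size_t total = 0;
    for (const char* p = delimiter; *p != '\0'; ++p) {
        total += (*p == '@') ? ucn_length : 1;
    }
    return total;
}

// Hypothetical helper: does the expanded delimiter still fit the limit?
constexpr bool fits_limit(const char* delimiter, std::size_t ucn_length) {
    return expanded_length(delimiter, ucn_length) <= kMaxDelimiterLength;
}

// The delimiter "@@" from the example R"@@[]@@":
static_assert(fits_limit("@@", kShortUcnLength), "short form: 12 characters, fits");
static_assert(!fits_limit("@@", kLongUcnLength), "long form: 20 characters, too long");
```

Under these assumptions the same source text is accepted with the short spelling and rejected with the long one, which is the implementation-dependent outcome the reply objects to.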