This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: cpplib: Preliminary implementation of UCNs
- From: "Joseph S. Myers" <jsm28 at cam dot ac dot uk>
- To: Geoff Keating <geoffk at geoffk dot org>
- Cc: Neil Booth <neil at daikokuya dot co dot uk>, gcc-patches at gcc dot gnu dot org
- Date: Sun, 20 Apr 2003 13:26:18 +0100 (BST)
- Subject: Re: cpplib: Preliminary implementation of UCNs
- References: <20030419235913.GV23814@daikokuya.co.uk> <jmd6jhandr.fsf@desire.geoffk.org>
On Sun, 19 Apr 2003, Geoff Keating wrote:
> However, if someone does:
>
> #define STRING(x) #x
>
> printf ("%s %s %s", STRING(\u00c0), STRING(\u00C0), STRING(À));
>
> then by 6.10.3.2 paragraph 2, that should print
>
> \u00c0 \u00C0 À
>
> so some tracking is needed inside cpplib.
The # operator does not escape backslashes that appear outside string and
character constants, so for C this expands to
printf ("%s %s %s", "\u00c0", "\u00C0", "À");
which prints
À À À
(If the arguments had been string literals, as in STRING("\u00c0"), it
would have been implementation-defined whether the backslash was escaped.)
For C++, À would have become \u00c0 or \u00C0 or \U000000c0 or \U000000C0,
but the printed output would be the same. However, I'm not sure whether the
parenthetical remark in [lex.phases] about the use of internal notations
other than the UCN merely restates the as-if rule or means something more.
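An ASCII-only sketch of the escaping rule, using \n as a stand-in for the UCN (whether a UCN's backslash is escaped inside a literal is implementation-defined, so it makes a poor portable example). The helper names outside_literal and inside_literal are made up for illustration:

```c
#define STRING(x) #x

/* Backslash outside any literal: # copies the spelling unchanged, so the
   result is the string literal "\n", i.e. a single newline character. */
static const char *outside_literal(void) { return STRING(\n); }

/* Backslash inside a string literal: # escapes both quotes and the
   backslash, giving "\"\\n\"", i.e. the four characters  " \ n "  */
static const char *inside_literal(void) { return STRING("\n"); }
```

Printing the first string emits a line break; printing the second emits the source spelling of the literal, quotes included.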
printf ("%s\n", STRING("$"));
would appear to be required to print
"$"
for C and
"\u0024"
or
"\U00000024"
for C++. (Some C9X drafts had a model similar to the C++ model of UCNs;
this problem with STRING("$"), and another one, was pointed out during the
public comment period, and the model was changed.)
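The C side of this can be checked with a small sketch, assuming a compiler that accepts $ in string literals (GCC does). The helper name stringified_dollar is hypothetical, and the C++ result is not modelled here:

```c
#define STRING(x) #x

/* In C, # escapes only the two quotes of the argument, so the result is
   the literal "\"$\"": a quote, a dollar sign, and a quote. */
static const char *stringified_dollar(void) { return STRING("$"); }
```

Passing the returned string to printf ("%s\n", ...) gives the quoted dollar sign described above for C.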
--
Joseph S. Myers
jsm28 at cam dot ac dot uk