This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Universal Character Names, v2
- From: martin at v dot loewis dot de (Martin v. Löwis)
- To: Geoff Keating <geoffk at geoffk dot org>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: 01 Dec 2002 11:01:04 +0100
- Subject: Re: Universal Character Names, v2
- References: <200211282334.gASNYdTA004058@mira.informatik.hu-berlin.de><jmel93ikep.fsf@desire.geoffk.org><m33cpjqrji.fsf@mira.informatik.hu-berlin.de><200211301718.gAUHIQn30673@desire.geoffk.org><m3el92r5n8.fsf@mira.informatik.hu-berlin.de><200212010742.gB17g0R31091@desire.geoffk.org>
Geoff Keating <geoffk@geoffk.org> writes:
> It looks to me that a sequence like
>
> \u0660a
>
> should actually be an identifier during preprocessing, despite not
> being a valid identifier in translation phase 7, because you can paste
> it with some other identifier to get something that _is_ valid. I
> think this might be a small defect in the standard; at the least, it's
> not clear.
I see. If you extrapolate from ASCII, \u0660a should probably be
a pp-number (just like 5a); i.e., the production for digit
should allow UCNs that denote digits (in addition to [0-9]).
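
To illustrate the ASCII side of that analogy: 5a is not a valid token in
translation phase 7, but as a pp-number it can still be pasted onto an
identifier to form something that is valid. The following is a minimal
sketch; the names PASTE, x5a and paste_demo are made up for illustration and
do not come from the patch.

```c
/* Hypothetical helper macro, for illustration only. */
#define PASTE(a, b) a ## b

/* 5a is a pp-number; pasting it onto x yields the identifier x5a. */
static int PASTE(x, 5a) = 42;

/* Returns the value of the variable whose name was built by pasting. */
int paste_demo(void)
{
    return x5a;
}
```

Whether \u0660a can be used the same way depends on how the preprocessor
tokenizes it, which is the question under discussion here.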
It then seems that my implementation is more restrictive than it
perhaps should be (but as restrictive as C99 itself); I'd like to
leave the implementation as-is, until a defect in C99 is resolved or a
user requires this as an extension.
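
The difference between the strict reading and the possible extension can be
sketched as follows. This is not the code from the patch: the function names
are hypothetical, only \uXXXX spellings are handled, and the digit table
contains just the range U+0660..U+0669 that the example above falls into.

```c
#include <ctype.h>

/* Hypothetical helper: decode a \uXXXX spelling at s.
   Returns the code point, or -1 if s does not start with one. */
static long ucn_value(const char *s)
{
    long v = 0;
    int i;

    if (s[0] != '\\' || s[1] != 'u')
        return -1;
    for (i = 2; i < 6; i++) {
        unsigned char c = (unsigned char)s[i];
        if (!isxdigit(c))
            return -1;
        v = v * 16 + (isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
    }
    return v;
}

/* Illustrative only: a single digit range, not a complete table. */
static int is_ucn_digit(long cp)
{
    return cp >= 0x0660 && cp <= 0x0669;
}

/* Returns 1 if the spelling s could begin a pp-number.
   With allow_ucn_digits == 0, only [0-9] (optionally after '.') qualifies.
   With allow_ucn_digits != 0, a UCN denoting a digit qualifies as well
   (the assumed extension, not current behaviour). */
int starts_pp_number(const char *s, int allow_ucn_digits)
{
    if (s[0] == '.')
        s++;
    if (isdigit((unsigned char)s[0]))
        return 1;
    return allow_ucn_digits && is_ucn_digit(ucn_value(s));
}
```

Under this sketch, starts_pp_number("\\u0660a", 0) rejects the sequence,
while passing 1 as the second argument accepts it in the same way as 5a.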
Regards,
Martin