This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Implementing Universal Character Names in identifiers
- From: Neil Booth <neil at daikokuya dot co dot uk>
- To: Zack Weinberg <zack at codesourcery dot com>
- Cc: "Martin v. Löwis" <loewis at informatik dot hu-berlin dot de>, gcc-patches at gcc dot gnu dot org, java at gcc dot gnu dot org
- Date: Sun, 10 Nov 2002 18:39:19 +0000
- Subject: Re: Implementing Universal Character Names in identifiers
- References: <200210280715.g9S7FdI2003815@paros.informatik.hu-berlin.de> <20021028075111.GB1273@codesourcery.com> <j4wuo39c6o.fsf@informatik.hu-berlin.de> <20021028183910.GC24090@codesourcery.com>
Zack Weinberg wrote:-
> Ugh. IMO, this is a defect in both standards - they should simply
> reference UAX15§7 and be done with it. It's been around since 1998,
> so they don't really have an excuse for not using it.
>
> I suggest:
>
> - In libiberty, provide interfaces that implement UAX15. On
>   reflection, this should be a new <unicode.h> interface set, not
>   tacked onto <safe-ctype.h>.
>
> - In cpplib, provide routines that validate individual identifiers
>   against the precise lists in C99 and C++98.
Martin,
Are you going to do this part? It would be a good start. We could
do with a function that confirms whether a number is in the ranges
specified by the standards (separating the two if necessary, although
IMO that is pedantry in extremis). We also need something like
ucs_digit_p(), since a UCS digit cannot start an identifier (something
I think you missed in your patch).
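
A minimal sketch of the kind of range check described above, assuming a
sorted table of inclusive code point ranges searched by bisection. Only
ucs_digit_p() is named in this thread; ucs_ident_char_p(),
ucs_can_start_ident_p(), the struct and both tables are hypothetical. The
ranges shown are a small illustrative subset of the sort listed in C99
Annex D, not the full lists, and should be checked against the standards
before any real use.

```c
#include <stddef.h>

/* Inclusive range of UCS code points.  Hypothetical type.  */
struct ucs_range { unsigned int lo, hi; };

/* ILLUSTRATIVE SUBSET ONLY: a few extended identifier ranges, sorted
   by code point.  The real tables must be taken from the standards.  */
static const struct ucs_range ident_ranges[] = {
  { 0x00AA, 0x00AA }, { 0x00BA, 0x00BA }, { 0x00C0, 0x00D6 },
  { 0x00D8, 0x00F6 }, { 0x0660, 0x0669 }, { 0x06F0, 0x06F9 },
  { 0x0966, 0x096F }
};

/* ILLUSTRATIVE SUBSET ONLY: ranges that are UCS digits.  */
static const struct ucs_range digit_ranges[] = {
  { 0x0660, 0x0669 }, { 0x06F0, 0x06F9 }, { 0x0966, 0x096F }
};

/* Return 1 if C lies in one of the N sorted ranges R, else 0.  */
static int
in_ranges (unsigned int c, const struct ucs_range *r, size_t n)
{
  size_t lo = 0, hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (c < r[mid].lo)
        hi = mid;
      else if (c > r[mid].hi)
        lo = mid + 1;
      else
        return 1;
    }
  return 0;
}

/* Return 1 if C is in the (subset) table of extended identifier
   characters.  Basic source characters are not handled here.  */
int
ucs_ident_char_p (unsigned int c)
{
  return in_ranges (c, ident_ranges,
                    sizeof ident_ranges / sizeof ident_ranges[0]);
}

/* Return 1 if C is in the (subset) table of UCS digits.  */
int
ucs_digit_p (unsigned int c)
{
  return in_ranges (c, digit_ranges,
                    sizeof digit_ranges / sizeof digit_ranges[0]);
}

/* A UCS digit may continue an identifier but cannot start one.  */
int
ucs_can_start_ident_p (unsigned int c)
{
  return ucs_ident_char_p (c) && !ucs_digit_p (c);
}
```

If C99 and C++98 were kept separate, the same in_ranges() helper could
take a different table for each standard.
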
I've got something reasonable for the lexer I think; the best thing is
not to force it to use maybe_read_ucs(). I'm still waiting for
assignment issues to be resolved with my new employer, unfortunately.
Neil.