Re: Universal Character Names, v2

Zack Weinberg writes:

> ... which I disagree with.  I am rejecting this patch until you
> implement support for Unicode as she is spoke, which means UAX#15
> including normalization, not whatever nonsense is in the C and C++
> standards.

Can you elaborate why you consider this approach technically

I have just implemented normalization for Python, and I can tell you
that you will need a significant database, and completion of such an
implementation will take me several weeks.
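
To give an idea of the data involved, here is a small sketch using the
unicodedata module of a current Python (an assumption on my part about
what the reader has installed; the characters are just examples). Every
character with a decomposition needs an entry like these, plus a
canonical combining class for reordering marks:

```python
import unicodedata

# Version of the Unicode Character Database bundled with this interpreter.
db_version = unicodedata.unidata_version

# Decomposition mapping for U+00E9 (LATIN SMALL LETTER E WITH ACUTE):
# a space-separated string of hexadecimal code points.
e_acute_decomposition = unicodedata.decomposition("\u00e9")

# Canonical combining class of U+0301 (COMBINING ACUTE ACCENT); a
# nonzero class means the mark takes part in canonical reordering.
acute_combining_class = unicodedata.combining("\u0301")

print(db_version, e_acute_decomposition, acute_combining_class)
```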

Apart from the implementation difficulties, I see the following
problems with this requirement:

1. It is underspecified, as UAX#15 leaves a number of alternatives for
   language designers:
   a) which Unicode version?
   b) which normalization form?

2. It extends the languages, by allowing identifiers which must be
   rejected in a conforming implementation. Can you propose an
   implementation strategy that allows proper implementation of the
   -pedantic option in this case?

3. It restricts the languages, by disallowing identifiers that are
   allowed in the language definition.

4. It modifies the languages, by treating as equal identifiers that
   the language definition treats as distinct.
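
To illustrate points 1b and 4, here is a minimal sketch in Python. The
helper same_identifier is hypothetical, and the spellings are only
examples; it shows how the choice of normalization form decides which
identifiers compare equal, not how a compiler would have to do it:

```python
import unicodedata

composed = "caf\u00e9"      # e-acute as a single code point
decomposed = "cafe\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
ligature = "\ufb01le"       # starts with U+FB01 LATIN SMALL LIGATURE FI
plain = "file"

def same_identifier(a, b, form=None):
    """Compare two spellings, optionally after normalizing both.

    form=None compares raw code point sequences; otherwise form is one
    of "NFC", "NFD", "NFKC", "NFKD".
    """
    if form is not None:
        a = unicodedata.normalize(form, a)
        b = unicodedata.normalize(form, b)
    return a == b

# Without normalization the two spellings are distinct identifiers.
raw_equal = same_identifier(composed, decomposed)

# With NFC they become the same identifier (point 4).
nfc_equal = same_identifier(composed, decomposed, "NFC")

# The form matters (point 1b): the ligature is only folded to "fi"
# by the compatibility forms.
ligature_nfc = same_identifier(ligature, plain, "NFC")
ligature_nfkc = same_identifier(ligature, plain, "NFKC")

print(raw_equal, nfc_equal, ligature_nfc, ligature_nfkc)
```

The results also depend on the Unicode version of the database in use
(point 1a), since later versions add characters and decompositions.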


