Universal Character Names, v2
Thu Nov 28 20:35:00 GMT 2002
"Martin v. Löwis" <firstname.lastname@example.org> writes:
> Specifically, the changes relative to the previous patch are:
> - Update character sets for C99, and C++ DR 131.
> - Support escaped newlines in the middle of a UCN. This is done
> through the addition of the maybe_read_ucs_reader function, which
> uses get_effective_char internally.
> - Support UCNs in numbers. In the internal representation, such
> a number still has the UCN in it, i.e. no conversion to UTF-8
> takes place. Such numbers will only be valid if they are pasted
> with an identifier.
> - Support pasting of names that have UCNs in them. For that,
> cpp_spell_token had to be updated.
> - Check for assembler UTF-8 support, and reject UCNs if no such
> support is available. As a side effect, gcj will automatically
> use UTF-8 mangling where g++ supports UCNs.
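For readers following along, the escaped-newline case is the subtle one, so here is a minimal sketch of it. This is not the patch's maybe_read_ucs_reader and does not use get_effective_char; the names read_ucn, skip_splices and hex_value are made up for illustration. It assumes the only thing to skip is a backslash immediately followed by a newline (no trigraphs), and it does not enforce the standards' restrictions on which code points a UCN may name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: skip any backslash-newline splices at s and
   return the first position that is not part of a splice.  */
static const char *skip_splices(const char *s)
{
    while (s[0] == '\\' && s[1] == '\n')
        s += 2;
    return s;
}

/* Hypothetical helper: value of a hex digit, or -1 if c is not one.  */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Try to read \uXXXX or \UXXXXXXXX at s, allowing backslash-newline
   between any two characters of the UCN.  On success, store the code
   point in *out and return a pointer just past the UCN; otherwise
   return NULL and leave *out untouched.  */
const char *read_ucn(const char *s, uint32_t *out)
{
    uint32_t value = 0;
    int ndigits, i;

    assert(s != NULL && out != NULL);

    s = skip_splices(s);
    if (*s != '\\')
        return NULL;
    s = skip_splices(s + 1);

    if (*s == 'u')
        ndigits = 4;
    else if (*s == 'U')
        ndigits = 8;
    else
        return NULL;
    s++;

    for (i = 0; i < ndigits; i++) {
        int d;
        s = skip_splices(s);
        d = hex_value(*s);
        if (d < 0)
            return NULL;
        value = (value << 4) | (uint32_t) d;
        s++;
    }

    *out = value;
    return s;
}
```

Note that a backslash-newline directly in front of the 'u' is a splice and not the start of a UCN, which is why the sketch skips splices before looking for the introducing backslash.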
Good so far...
> I have considered the following comments, but chose to take a
> different approach:
> - I have not put the test function for characters in libiberty.
> It is quite specific to C and C++, and only ever used in the
This is only true because of your second decision...
> - I have not decided to deviate from the C and C++ standards for
> character tests. Reviewers commented that they dislike the approach
> taken by the standards committees, and that the relevant Unicode
> specification should be taken into account instead. I disagree, as I
> consider the approach of giving explicit lists quite reasonable.
> More importantly, I think that standards conformance should be
> valued quite highly unless specific user demands require ignoring
> or extending the standards; this is not the case in the
> specific issue.
... which I disagree with. I am rejecting this patch until you
implement support for Unicode as she is spoke, which means UAX#15
including normalization, not whatever nonsense is in the C and C++