This is the mail archive of the
mailing list for the GCC project.
Re: Query on UTF-32 encodings for letters
Paul Koning wrote:
Robert> Joseph S. Myers wrote:
>> Proper case folding and caseless matching are locale-dependent.
Robert> That's not true for the Ada 2005 rules, which are locale
Robert> independent and driven only by the 10646 database.
Then that simply means that Ada has either created a locale of its
own, or adopted one specific locale to be the one it uses.
Anglocentrism at work, perhaps?
I don't think that is the case. With the full 10646 database,
every character in the database is properly categorized, and
the whole point of Wide_Wide_Character in Ada is to match the
10646 standard exactly. That is what ISO mandates, so it is
hardly a matter of Anglocentrism (note that any reference to
Unicode as a standard *is* Anglocentric :-). We are driven by
ISO 10646, not Unicode. Luckily these are essentially
completely aligned at this stage.
Note that 10646 draws many distinctions between the characters
of different national scripts. For instance, the Greek upper
case alpha is typographically identical to Latin upper case
A, but they occupy distinct code positions. That means that
the folding rule for every character is part of the
locale-independent database.
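
The Latin A / Greek alpha point can be checked with a minimal sketch.
It assumes Python's standard unicodedata module, which is built from the
Unicode Character Database (aligned with ISO 10646, as noted above); it
is not how GNAT or Ada 2005 implements folding. The names LATIN_A,
GREEK_ALPHA, describe and fold are made up for this illustration.

```python
import unicodedata

# Two characters that look the same in print but sit at different code positions.
LATIN_A = "\u0041"      # LATIN CAPITAL LETTER A
GREEK_ALPHA = "\u0391"  # GREEK CAPITAL LETTER ALPHA


def describe(ch):
    """Return (code position, character name, general category) from the database."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


def fold(ch):
    """Case-fold one character; str.casefold() does not consult the process locale."""
    return ch.casefold()


if __name__ == "__main__":
    for ch in (LATIN_A, GREEK_ALPHA):
        print(describe(ch), "->", describe(fold(ch)))
```

Each character folds to the lower case letter of its own script, and the
result comes from the per-character database entry rather than from any
locale setting.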
Robert> (this character stuff is a bottomless pit :-)
It sure is.