This is the mail archive of the
mailing list for the GCC project.
Re: Query on UTF-32 encodings for letters
Robert Dewar wrote:
I hope Turkish programmers using
Ada will not get completely confused :-)
There are quite a few Turkish people around here. I can
confirm that they know how to work around a missing dotless i and
dotless I when hardware or software doesn't offer them.
It's a very deliberate decision too, since it is inconsistent
with the surrounding characters:
I just noticed that this character is treated specially in
CaseFolding.txt. Seems like a good catalyst character if you want
a complexity reaction :)
0130; F; 0069 0307; # LATIN CAPITAL LETTER I WITH DOT ABOVE
0130; T; 0069; # LATIN CAPITAL LETTER I WITH DOT ABOVE
# The status field is:
# C: common case folding, common mappings shared by both simple and full mappings.
# F: full case folding, mappings that cause strings to grow in length. Multiple characters are separated by spaces.
# S: simple case folding, mappings to single characters where different from F.
# T: special case for uppercase I and dotted uppercase I
# - For non-Turkic languages, this mapping is normally not used.
# - For Turkic languages (tr, az), this mapping can be used instead of the normal mapping for these characters.
# Note that the Turkic mappings do not maintain canonical equivalence without additional processing.
# See the discussions of case mapping in the Unicode Standard for more information.
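
To make the difference between the F and T lines concrete, here is a rough sketch in Python rather than Ada, since Python's built-in str.casefold() already applies the full (C + F) folding. The names fold and TURKIC_FOLDS are made up for illustration, and the second table entry (0049; T; 0131) is not in the excerpt above; it is the companion T line for plain capital I in the same file.

```python
# Hypothetical helper, not part of any library or of GNAT.
# The T mappings replace the normal folding for exactly two characters.
TURKIC_FOLDS = {
    "\u0130": "\u0069",  # 0130; T; 0069 (quoted in the excerpt above)
    "\u0049": "\u0131",  # 0049; T; 0131 (companion entry, not quoted above)
}


def fold(text, turkic=False):
    """Case-fold text, optionally using the Turkic (T) special cases."""
    if not turkic:
        # str.casefold() implements full case folding (status C + F).
        return text.casefold()
    # Folding is context-free, so it can be applied one character at a time.
    return "".join(TURKIC_FOLDS.get(ch) or ch.casefold() for ch in text)


if __name__ == "__main__":
    dotted_capital_i = "\u0130"
    # F mapping: one character folds to two (i + COMBINING DOT ABOVE).
    print([hex(ord(c)) for c in fold(dotted_capital_i)])
    # T mapping: folds to a plain i, so the string does not grow.
    print([hex(ord(c)) for c in fold(dotted_capital_i, turkic=True)])
```

The point of the sketch is only that the F mapping grows the string while the T mapping does not, and that the T mapping has to be chosen explicitly per language; it makes no attempt at the canonical-equivalence processing the note above mentions.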
Bah! Makes me think even more that the whole business of extending case
insensitive letters to wide characters was a mistake. Oh well, I have got
it all implemented now :-)
I for one am looking forward to portable international Ada source code.