This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Query on UTF-32 encodings for letters
Florian Weimer wrote:
>> Yes, and that's fine, both lower case i with dot and lower case i
>> without dot fold upper case to capital I (without dot), and so all three
>> are equivalent in identifiers.
>
> No, this is not the way Turkish case conversion works. Turkish has a
> rule LATIN SMALL LETTER I -> LATIN CAPITAL LETTER I WITH DOT ABOVE
> (U+0130).

Maybe not, but I am implementing Ada, and not Turkish :-)
And the Ada rules map as I quoted. Ours not to reason why ....
I guess the point is this: we know that latin small letter i
must map to latin capital letter I (with no dot) in Ada, because
that is obviously reasonable and we cannot have case conversion in
identifiers be locale dependent. When it comes to the dotless i,
it would indeed be bizarre to map it to a dotted capital I, so the
two end up being mapped the same. That makes sense, given the
requirement that case conversion (or, more basically, program
legality) be locale independent.
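
To make the difference concrete, here is a rough sketch in Python (not Ada, and not how GNAT actually does it). It assumes that Python's locale-independent str.upper() is a fair stand-in for the folding described above; the names fold_identifier, same_identifier and turkish_upper are made up for illustration, and turkish_upper only models the single Turkish rule quoted earlier.

```python
# Illustrative sketch only: hypothetical helper names, not GNAT's code.
DOTLESS_I = "\u0131"           # LATIN SMALL LETTER DOTLESS I
CAPITAL_I_WITH_DOT = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE


def fold_identifier(name: str) -> str:
    """Fold to upper case without consulting any locale."""
    return name.upper()


def same_identifier(a: str, b: str) -> bool:
    """Two spellings name the same identifier if they fold identically."""
    return fold_identifier(a) == fold_identifier(b)


def turkish_upper(name: str) -> str:
    """Assumed Turkish-style rule: small i maps to capital I with dot."""
    return name.replace("i", CAPITAL_I_WITH_DOT).upper()


# Locale-independent folding: dotted i, dotless i and capital I coincide.
print(same_identifier("limit", "l" + DOTLESS_I + "m" + DOTLESS_I + "t"))
print(same_identifier("limit", "LIMIT"))

# Under the Turkish rule the same source text would fold differently,
# which is why identifier folding cannot depend on the locale.
print(turkish_upper("limit") == fold_identifier("limit"))
```

With the locale-independent fold, the first two checks come out true, while the last one comes out false: a compiler that followed the Turkish rule would disagree with other compilers about whether a given program is legal.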