This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Query on UTF-32 encodings for letters


Florian Weimer wrote:

> > Yes, and that's fine, both lower case i with dot and lower case i
> > without dot fold upper case to capital I (without dot), and so all three
> > are equivalent in identifiers.
>
> No, this is not the way Turkish case conversion works. Turkish has a
> rule LATIN SMALL LETTER I -> LATIN CAPITAL LETTER I WITH DOT ABOVE
> (U+0130).

Maybe not, but I am implementing Ada, and not Turkish :-) And the Ada rules map as I quoted. Ours not to reason why ....

I guess the point is that we know that Latin small letter i
must map to Latin capital letter I (with no dot) in Ada, because
obviously that's reasonable and we cannot have case conversion in
identifiers be locale dependent. When it comes to the dotless i,
it would indeed be bizarre to map it to a dotted capital I, so the
two end up being mapped the same. Makes sense, given the requirement
that case conversion (or more basically program legality) be
locale independent.
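
To make the difference concrete, here is a small sketch in Python (not Ada,
and not how GNAT does it). It relies on the fact that Python's str.upper()
applies the Unicode default case mappings regardless of locale. The helper
names fold_upper, same_identifier and turkish_upper are invented for this
illustration, and turkish_upper is only a rough stand-in for the Turkish
rule quoted above.

```python
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE


def fold_upper(identifier: str) -> str:
    """Locale-independent folding: Unicode default upper-case mapping."""
    return identifier.upper()


def same_identifier(a: str, b: str) -> bool:
    """Two spellings name the same identifier if they fold identically."""
    return fold_upper(a) == fold_upper(b)


def turkish_upper(text: str) -> str:
    """Hypothetical Turkish-style mapping: dotted small i goes to U+0130.

    Every other character uses the default mapping, so the dotless
    small i still goes to plain capital I.
    """
    return text.replace("i", DOTTED_CAPITAL_I).upper()


if __name__ == "__main__":
    # Default mapping: i, dotless i and I all fold together.
    print(same_identifier("list", "LIST"))
    print(same_identifier("l" + DOTLESS_SMALL_I + "st", "list"))
    # Turkish-style mapping keeps the dotted and dotless letters apart.
    print(turkish_upper("i") == turkish_upper(DOTLESS_SMALL_I))
```

With the default mapping the three letters i, dotless i and I collapse to
one identifier character, while the Turkish-style mapping would make the
result depend on which convention the compiler happened to use.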




