This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Query on UTF-32 encodings for letters
Paul Koning wrote:
> But that is nowhere near sufficient. The issue is that case folding
> rules are different for different languages/locales that use the SAME
> character set. For example, there are a whole bunch of different
> folding rules for Latin-1.
Well, in practice the folding rules for Latin-1 have been part of the
standard for ten years, so they are not about to change.
It would be interesting to see an example of what you state above.
Certainly people have been using Latin-1 to write Ada in countries
all over the world, and no one has ever found the folding rules
for identifiers to be in any way inconvenient.
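To make "fixed folding rules for Latin-1" concrete, here is a minimal Python sketch of a locale-independent fold based only on the layout of the Latin-1 code page. The function names are hypothetical, and this is not taken from the Ada standard or from any compiler; it only illustrates that the same table can be applied everywhere. It assumes the input is restricted to Latin-1 (code points 0 to 255).

```python
def fold_latin1_char(ch: str) -> str:
    """Fold one Latin-1 character to lower case with a fixed rule.

    Hypothetical helper for illustration; no locale is consulted.
    """
    code = ord(ch)
    if code > 0xFF:
        raise ValueError("not a Latin-1 character: %r" % ch)
    is_ascii_upper = 0x41 <= code <= 0x5A
    # 0xC0..0xDE are the accented capitals, except 0xD7 (multiplication sign).
    is_accented_upper = 0xC0 <= code <= 0xDE and code != 0xD7
    if is_ascii_upper or is_accented_upper:
        return chr(code + 0x20)
    # Everything else, including sharp s (0xDF) and y-diaeresis (0xFF),
    # is left unchanged.
    return ch


def same_identifier(a: str, b: str) -> bool:
    """Compare two identifiers after folding every character."""
    fold = lambda s: "".join(fold_latin1_char(c) for c in s)
    return fold(a) == fold(b)
```

With this rule, upper-case E-acute and lower-case e-acute name the same identifier, while plain E and e-acute do not, whatever the locale of the machine doing the comparison.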
There was a point in the discussion early on when JDI wanted upper
case E and lower case E-acute to match in identif
The decision in Ada is that you do not want the meaning of a program
or its legality to change in a locale-dependent way. This is really
a fundamental starting point. Note that this is a radically different
issue from folding at run-time in a manner that makes sense to an
application program.
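The distinction can be sketched in Python as follows. The first function uses `str.casefold()`, which applies the Unicode case-folding mapping and does not consult the locale. The second is a hypothetical run-time helper (the name and the `language` parameter are assumptions made for this example) that applies the well-known Turkish dotted/dotless i rule. It is meant only to show why a compiler would not want identifier matching to depend on such a setting.

```python
def fold_fixed(name: str) -> str:
    """Locale-independent fold: the same answer on every machine."""
    return name.casefold()


def fold_for_locale(text: str, language: str) -> str:
    """Hypothetical run-time fold that depends on the user's language.

    Only the Turkish special case is modelled here: capital I folds to
    dotless i (U+0131) and capital dotted I (U+0130) folds to i.
    """
    if language == "tr":
        text = text.replace("I", "\u0131").replace("\u0130", "i")
    return text.lower()


# Fixed rule: FILE and file always match.
fixed_match = fold_fixed("FILE") == fold_fixed("file")

# Locale-dependent rule: the match depends on the language setting.
english_match = fold_for_locale("FILE", "en") == fold_for_locale("file", "en")
turkish_match = fold_for_locale("FILE", "tr") == fold_for_locale("file", "tr")
```

Here `fixed_match` and `english_match` are true but `turkish_match` is false, so a program whose identifiers were matched with the second function could be legal under one setting and illegal under another.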
> If 10646 defines a single set of rules, then it's part of the problem,
> not part of the solution.
Well, the 10646 definition provides a framework from which an acceptable
locale-independent set of folding rules can be obtained. Note that acceptable
here means acceptable to at least the ISO P-members. Indeed, when it comes
to such issues in the Ada standard, this is an area where the
non-English-speaking member countries take the lead.