This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Query on UTF-32 encodings for letters


>>>>> "Robert" == Robert Dewar <dewar@adacore.com> writes:

 Robert> Paul Koning wrote:
 >> But that is nowhere near sufficient.  The issue is that case
 >> folding rules are different for different languages/locales that
 >> use the SAME character set.  For example, there are a whole bunch
 >> of different folding rules for Latin-1.

 Robert> Well in practice the folding rules for Latin-1 have been part
 Robert> of the standard for ten years, so they are not about to
 Robert> change.

 Robert> It would be interesting to know an example of what you state
 Robert> above.  

In France, uppercase letters are conventionally written without accents,
but in Quebec they keep them.  (That doesn't affect folding to lowercase,
of course, but it does affect folding to uppercase, and therefore
case-insensitive equality.)
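
To make the equality point concrete, here is a rough Python sketch.  The
function names are made up for illustration, and treating "unaccented
capitals" as "strip all combining marks" is a simplifying assumption, not
a statement of any official French or Quebec rule.

```python
import unicodedata


def strip_accents(s: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def equal_caseless(a: str, b: str) -> bool:
    """Case-insensitive comparison that still treats accents as significant."""
    na = unicodedata.normalize("NFC", a).casefold()
    nb = unicodedata.normalize("NFC", b).casefold()
    return na == nb


def equal_caseless_unaccented(a: str, b: str) -> bool:
    """Case-insensitive comparison that also ignores accents (hypothetical
    stand-in for a convention where capitals are written without accents)."""
    return equal_caseless(strip_accents(a), strip_accents(b))


# "\u00e9cole" is "ecole" with e-acute; "\u00c9COLE" is its accented uppercase.
print(equal_caseless("\u00e9cole", "\u00c9COLE"))        # accented capitals match
print(equal_caseless("\u00e9cole", "ECOLE"))             # unaccented capitals do not
print(equal_caseless_unaccented("\u00e9cole", "ECOLE"))  # unless accents are ignored
```

Which of the two comparisons is "right" depends on the convention the text
was written under, which is the point being made above.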

An example that affects folding to lowercase: in Turkish, I folds to
dotless i (U+0131), and dotted capital I (U+0130) folds to i.  Those two
characters aren't in Latin-1, but they are in the Latin section of
ISO/IEC 10646.
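
A small Python sketch of the difference, for illustration only.  Python's
built-in str.lower() is locale-independent; lower_turkish below is a
hypothetical helper, not a library function, and it only covers the two
Turkish I mappings.

```python
# Map the two characters whose lowercase differs under Turkish rules.
# Applied before str.lower(), which handles everything else.
_TURKISH_I = str.maketrans({
    "I": "\u0131",   # LATIN CAPITAL LETTER I -> LATIN SMALL LETTER DOTLESS I
    "\u0130": "i",   # LATIN CAPITAL LETTER I WITH DOT ABOVE -> LATIN SMALL LETTER I
})


def lower_default(s: str) -> str:
    """Locale-independent lowercase folding."""
    return s.lower()


def lower_turkish(s: str) -> str:
    """Lowercase folding with the Turkish I rules applied first (a sketch)."""
    return s.translate(_TURKISH_I).lower()


print(lower_default("TITLE"))   # title
print(lower_turkish("TITLE"))   # t\u0131tle, i.e. with dotless i
```

The same input string folds to two different results, so any comparison or
lookup built on folding gives locale-dependent answers unless one rule is
fixed in advance.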

 Robert> The decision in Ada is that you do not want the meaning of a
 Robert> program or its legality to change in a locale dependent
 Robert> way. This is really a fundamental starting point. Note that
 Robert> this is a radically different issue from folding at run-time
 Robert> in a manner that makes sense to an application program.

Ok, fair enough, I was thinking more of the runtime case in my
comments.

	paul

