This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Query on UTF-32 encodings for letters


>>>>> "Robert" == Robert Dewar <dewar@adacore.com> writes:

 >> Uppercase letters aren't accented in France
 Robert> That is a (very commonly held) myth.

Interesting.  Learn something new every day.

 >> An example that affects folding to lowercase: I folds to
 >> i-without-dot in Turkish.  Those aren't in Latin-1, but they are
 >> in the Latin section of 10646.

 Robert> Yes, but for Ada, we can consider identifier matching to be
 Robert> only in the mode of folding to upper case, which takes care
 Robert> of the dotless i since this folds to upper case I.

Then take i, which in Turkish upcases to I with dot.  Turkish has i
both with and without dot, and the dot (or its absence) is preserved
when you change case, in either direction.
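
A minimal sketch of the collision, in Python.  The helper names and
the TURKISH_UPPER table are hypothetical, made up for illustration;
this is not how any Ada compiler actually folds identifiers.  It only
shows that locale-independent upcasing merges the two Turkish letters,
while a Turkish-aware mapping keeps them apart.

```python
# Hypothetical Turkish-specific upper-case mapping (an assumption for
# illustration): dotted i -> dotted capital, dotless i -> plain capital I.
TURKISH_UPPER = {
    "i": "\u0130",       # i -> LATIN CAPITAL LETTER I WITH DOT ABOVE
    "\u0131": "I",       # LATIN SMALL LETTER DOTLESS I -> I
}

def upper_default(s: str) -> str:
    """Locale-independent upcasing, as Python's str.upper() does it."""
    return s.upper()

def upper_turkish(s: str) -> str:
    """Apply the Turkish i mappings first, then upcase everything else."""
    return "".join(TURKISH_UPPER.get(ch, ch) for ch in s).upper()

# Under default folding both letters collapse to the same key.
print(upper_default("i") == upper_default("\u0131"))   # both give "I"
# Under the Turkish mapping they stay distinct.
print(upper_turkish("i") == upper_turkish("\u0131"))
```

So "fold to upper case" settles the dotless i only if dotted and
dotless i are meant to name the same identifier.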

Would you map eszett (in German) to ss?  Or to sz?  Or neither?  Modern
usage does the former; 1930-ish usage the latter.
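
To make the choice concrete, a small Python sketch.  The two fold
functions are hypothetical names for the two conventions above, not an
existing API; the example words are my own.  The point is only that
the choice decides which identifiers end up matching.

```python
ESZETT = "\u00df"  # LATIN SMALL LETTER SHARP S

def fold_ss(s: str) -> str:
    """Modern convention (assumed here): eszett matches 'ss'."""
    return s.replace(ESZETT, "ss").upper()

def fold_sz(s: str) -> str:
    """Older convention (assumed here): eszett matches 'sz'."""
    return s.replace(ESZETT, "sz").upper()

word = "Ma" + ESZETT + "e"

# With the ss mapping, two different German words share one key.
print(fold_ss(word) == fold_ss("Masse"))
# With the sz mapping they do not.
print(fold_sz(word) == fold_sz("Masse"))
```

For what it is worth, Python's own str.upper() takes the ss route.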

	    paul

