This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Query on UTF-32 encodings for letters


On Tue, 11 Jan 2005, Robert Dewar wrote:

> Ada 2005 requires full support for all planes of UTF-32
> encoding, including the use of letters in identifiers,
> including also proper upper/lower case equivalence.
> 
> All this information is obtainable from the 10646 standard,
> but it is non-trivial to generate the predicates Is_Letter,
> and the function To_Lower.

Proper case folding and caseless matching are locale-dependent.  Case 
conversion can also depend on context in a word as well as on locale.  In 
Unicode there is titlecase as well as uppercase and lowercase.  I presume 
there is in fact a more precise specification, with appropriate normative 
references, of what exactly is required and whether there is to be 
locale-dependence, at compile time or at runtime.
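
A minimal Python sketch of the points above, assuming CPython 3.3 or later, whose built-in str methods apply the locale-independent Unicode default case mappings. The name lower_turkish is a hypothetical helper introduced only for illustration, and it covers just the dotted/dotless i rule, not a full locale tailoring.

```python
def lower_turkish(text: str) -> str:
    """Hypothetical helper: lowercase with the Turkish dotted/dotless i rule only."""
    # Map the two locale-sensitive capitals first, then use the default mapping.
    return text.replace("I", "\u0131").replace("\u0130", "i").lower()


# Locale: the default mapping of "I" is "i", but Turkish expects dotless U+0131.
default_i = "I".lower()
turkish_i = lower_turkish("I")

# Context: capital sigma lowercases to final sigma (U+03C2) at the end of a
# word and to U+03C3 elsewhere.
sigma_word = "\u03a3\u0391\u03a3".lower()

# Titlecase is distinct from uppercase for digraph characters such as U+01C6.
dz_upper = "\u01c6".upper()  # U+01C4
dz_title = "\u01c6".title()  # U+01C5

# A case mapping can change the string length: sharp s uppercases to "SS".
sharp_s_upper = "\u00df".upper()

# Caseless matching compares case-folded strings rather than lowercased ones.
same_caseless = "Stra\u00dfe".casefold() == "STRASSE".casefold()
```

A single code-point-to-code-point To_Lower table cannot express the sigma, sharp s, or Turkish cases, which is why the exact requirement matters.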

Although the Unicode Character Database includes various tables for case 
mapping, including context-dependent and locale-dependent mappings, I'm not 
sure whether these are normative or informative.  Section 4.2 of the Unicode 
Standard version 4.0 refers to them as normative, while section 5.18 says 
that case itself is normative but the mappings are informative; then again, 
the whole of chapter 5 is not normative.

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
    jsm@polyomino.org.uk (personal mail)
    joseph@codesourcery.com (CodeSourcery mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

