This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Old UTF16 patch
- From: "Joseph S. Myers" <joseph at codesourcery dot com>
- To: Elena Zannoni <elena dot zannoni at oracle dot com>
- Cc: gcc at gcc dot gnu dot org, Tom Tromey <tromey at redhat dot com>
- Date: Fri, 2 Nov 2007 00:08:26 +0000 (UTC)
- Subject: Re: Old UTF16 patch
- References: <472A61CC.2050609@oracle.com>
I haven't followed in detail any WG14 developments relating to TR19769
since its publication. Has WG14 yet given an answer on what should be done
with u'C', where C represents a single character that requires a surrogate
pair in UTF-16 (to name one noted place where the TR underspecifies
things)?
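To make the surrogate-pair question concrete, here is a minimal sketch of
UTF-16 encoding for a single code point. The function name utf16_encode is
made up for illustration and comes from neither the TR nor GCC. It shows only
that any character above U+FFFF needs two 16-bit code units, so it cannot fit
in a single 16-bit character constant; what u'C' should then mean is exactly
the open question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode scalar value as UTF-16.
   Returns the number of 16-bit code units written to out (1 or 2),
   or 0 if cp is not a valid scalar value. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    /* Reject values beyond U+10FFFF and the surrogate range itself. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    /* Basic Multilingual Plane: one code unit is enough. */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Supplementary planes: split the 20-bit offset into a
       high (lead) and a low (trail) surrogate. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));
    return 2;
}
```

For example, U+1D11E (MUSICAL SYMBOL G CLEF) encodes as the pair 0xD834
0xDD1E, whereas U+0041 is the single unit 0x0041.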
I don't think there's much worthwhile in those old patches. Start with
the ISO TR text, produce testcases that cover everything there and the
desired semantics for everything the TR leaves unspecified or
underspecified, and only once the testcases are settled, work out an
implementation for the agreed semantics.
A TR is not a standard, so for C this must be disabled in all strict
conformance modes (note that it affects the rules for lexing and so
changes the semantics of conforming programs); likewise for C++98. The
C++0x draft includes the notation from TR19769, so the feature should be
enabled by default in C++0x (and in so far as the C TR is compatible with
C++0x, both should be followed in both C and C++ when the feature is
enabled).
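As an illustration of the lexing point, here is a small sketch of a program
whose meaning depends on whether the new prefixes are recognized. The
function name literal_size is invented for the example. The underlying
assumption is that, without the feature, u is an ordinary identifier and so
can be a macro, while with the feature u"ab" is a single string-literal token
and the macro is never expanded.

```c
#include <stddef.h>

/* Legal in C99, where u is just an identifier: an empty object-like macro. */
#define u

/* Without the TR notation: u expands to nothing, leaving "ab", a char[3].
   With the TR notation: u"ab" is one token, an array of three 16-bit
   elements, so the size is three times the element size. */
size_t literal_size(void)
{
    return sizeof(u"ab");
}
```

I would expect a compiler in a strict C99 mode (for instance GCC with
-std=c99) to return 3 here, and a compiler that recognizes the prefix to
return 6 on targets with an 8-bit char and a 16-bit element type.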
--
Joseph S. Myers
joseph@codesourcery.com