This is the mail archive of the
mailing list for the GCC project.
Re: gcc compile-time (multibyte issue)
- From: dewar at gnat dot com (Robert Dewar)
- To: dewar at gnat dot com, zack at codesourcery dot com
- Cc: davem at redhat dot com, gcc at gcc dot gnu dot org
- Date: Sun, 19 May 2002 13:55:31 -0400 (EDT)
- Subject: Re: gcc compile-time (multibyte issue)
<<For the record, this or something very similar is what Neil and I have
planned to do all along. We never intended to call mbtowc() for every
character -- in fact, I at least do not intend to use the <wchar.h>
functions at all, because they are not nearly capable enough for GCC's
purposes (in my opinion).>>

By the way, the actual algorithms for interpreting various encoding types
may be useful to look at in GNAT. The methods we currently accept are:

  Hex ESC encoding (simply ESC followed by four hex digits)
  Upper half encoding (used in China on PC's extensively)
  Shift-JIS encoding (the most usual form used in Japan)
  EUC encoding (the alternate form used in Japan)
  UTF-8 (I don't think anyone uses this, but it's there :-)
  Brackets encoding as in ["2345"]

The last method is useful for portable tests in that it allows wide
characters to be input in a form that is entirely in graphic characters
with no upper half or control characters.
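
To make the two simplest methods in the list concrete, here is a minimal sketch in Python of decoding the Hex ESC and brackets forms. It is not GNAT's actual code: the function names are hypothetical, and it assumes exactly four hex digits in both forms (as in the ["2345"] example), with each sequence standing for one 16-bit character code.

```python
import re

# Assumption: exactly four hex digits between the quotes, as in ["2345"].
_BRACKETS = re.compile(r'\["([0-9A-Fa-f]{4})"\]')
# ESC (0x1B) followed by four hex digits, as described in the list.
_HEX_ESC = re.compile(r'\x1b([0-9A-Fa-f]{4})')


def decode_brackets(text: str) -> str:
    """Replace every ["hhhh"] sequence with the character it names."""
    return _BRACKETS.sub(lambda m: chr(int(m.group(1), 16)), text)


def decode_hex_esc(text: str) -> str:
    """Replace every ESC hhhh sequence with the character it names."""
    return _HEX_ESC.sub(lambda m: chr(int(m.group(1), 16)), text)


def encode_brackets(text: str) -> str:
    """Pass printable ASCII through; write everything else as ["hhhh"]."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x20 <= cp <= 0x7E:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append('["%04X"]' % cp)
        else:
            # Codes above 16 bits are outside the four-digit form assumed here.
            raise ValueError("code %X does not fit in four hex digits" % cp)
    return "".join(out)
```

The encoder's output contains only graphic ASCII characters, which is the property that makes the brackets form convenient for portable tests. One limitation of this sketch: ordinary text that already happens to contain something shaped like ["2345"] would not survive an encode/decode round trip unchanged.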