This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: gcc compile-time (multibyte issue)
- From: dewar at gnat dot com (Robert Dewar)
- To: dewar at gnat dot com, zack at codesourcery dot com
- Cc: davem at redhat dot com, gcc at gcc dot gnu dot org
- Date: Sun, 19 May 2002 13:55:31 -0400 (EDT)
- Subject: Re: gcc compile-time (multibyte issue)
<<For the record, this or something very similar is what Neil and I have
planned to do all along. We never intended to call mbtowc() for every
character -- in fact, I at least do not intend to use the <wchar.h>
functions at all, because they are not nearly capable enough for GCC's
purposes (in my opinion).
>>
By the way, the actual algorithms for interpreting various encoding types
may be useful to look at in GNAT. The methods we currently accept are:
  Hex ESC encoding (simply ESC followed by four hex digits)
  Upper half encoding (used extensively in China on PCs)
  Shift-JIS encoding (the most usual form used in Japan)
  EUC encoding (the alternate form used in Japan)
  UTF-8 (I don't think anyone uses this, but it's there :-)
  Brackets encoding as in ["2345"]
The last method is useful for portable tests in that it allows wide
characters to be input in a form that is entirely in graphic characters
with no upper half or control characters.
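
To make the two simplest forms concrete, here is a rough sketch in Python (not GNAT's actual code) of decoding the brackets and hex ESC encodings. The function names are made up for illustration, only the four-hex-digit form shown above is handled, and treating the hex value as a Unicode code point is an assumption.

```python
import re

ESC = "\x1b"

# Brackets encoding: four hex digits inside ["..."], as in ["2345"].
_BRACKETS = re.compile(r'\["([0-9A-Fa-f]{4})"\]')
# Hex ESC encoding: the ESC character followed by four hex digits.
_HEX_ESC = re.compile(ESC + r'([0-9A-Fa-f]{4})')


def _to_char(match):
    # Assumption: the hex digits are the character's code point.
    return chr(int(match.group(1), 16))


def decode_brackets(text):
    """Replace each ["hhhh"] sequence with the character it names."""
    return _BRACKETS.sub(_to_char, text)


def decode_hex_esc(text):
    """Replace each ESC + hhhh sequence with the character it names."""
    return _HEX_ESC.sub(_to_char, text)


def encode_brackets(text):
    """Write every non-ASCII character as ["hhhh"]; ASCII passes through."""
    out = []
    for ch in text:
        code = ord(ch)
        if code > 0xFFFF:
            raise ValueError("this sketch only covers four-hex-digit codes")
        out.append(ch if code < 0x80 else '["%04X"]' % code)
    return "".join(out)
```

For a test case, encode_brackets turns a string containing a wide character into pure ASCII, and decode_brackets recovers the original, which is the portability property described above.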