This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: gcc compile-time (multibyte issue)

From: dewar at gnat dot com (Robert Dewar)
To: dewar at gnat dot com, zack at codesourcery dot com
Cc: davem at redhat dot com, gcc at gcc dot gnu dot org
Date: Sun, 19 May 2002 13:55:31 -0400 (EDT)
Subject: Re: gcc compile-time (multibyte issue)

<<For the record, this or something very similar is what Neil and I have
planned to do all along.  We never intended to call mbtowc() for every
character -- in fact, I at least do not intend to use the <wchar.h>
functions at all, because they are not nearly capable enough for GCC's
purposes (in my opinion).
>>

By the way, the actual algorithms for interpreting various encoding types
may be useful to look at in GNAT. The methods we currently accept are:

Hex ESC encoding (simply ESC followed by four hex digits)
Upper half encoding (used extensively on PCs in China)
Shift-JIS encoding (the most usual form used in Japan)
EUC encoding (the alternate form used in Japan)
UTF-8 (I don't think anyone uses this, but it's there :-)
Brackets encoding as in ["2345"]

The last method is useful for portable tests in that it allows wide
characters to be input in a form that is entirely in graphic characters
with no upper half or control characters.
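As a rough illustration of the two simplest schemes in the list, here is a minimal Python sketch of decoding the Hex ESC and brackets forms. It is not GNAT's actual code. The function names are hypothetical, and it assumes that both forms carry exactly four hex digits (as in the ["2345"] example) and that either letter case is accepted.

```python
import re

ESC = "\x1b"

# Assumption: brackets form is [" + exactly four hex digits + "],
# modelled on the ["2345"] example.
_BRACKETS = re.compile(r'\["([0-9A-Fa-f]{4})"\]')

# Assumption: Hex ESC form is the ESC character + exactly four hex digits.
_HEX_ESC = re.compile(ESC + r'([0-9A-Fa-f]{4})')


def _to_char(match):
    """Turn the captured hex digits into the character with that code."""
    return chr(int(match.group(1), 16))


def decode_brackets(text):
    """Replace every ["hhhh"] sequence with the wide character it names."""
    return _BRACKETS.sub(_to_char, text)


def decode_hex_esc(text):
    """Replace every ESC hhhh sequence with the wide character it names."""
    return _HEX_ESC.sub(_to_char, text)


def encode_brackets(text):
    """Write non-ASCII characters in brackets form, leaving ASCII alone."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            out.append(ch)
        elif code <= 0xFFFF:
            out.append('["%04X"]' % code)
        else:
            # Four hex digits cannot represent this code in the assumed form.
            raise ValueError("code point too large for four hex digits")
    return "".join(out)
```

For instance, decode_brackets('x["2345"]y') gives a three-character string whose middle character has code 16#2345#, and encode_brackets produces output made only of graphic ASCII characters, which is what makes the form convenient for portable tests.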

Follow-Ups:
- Re: gcc compile-time (multibyte issue)
  - From: Joseph S. Myers
