This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Current state of multibyte character support
- From: Matt Hiller <hiller at redhat dot com>
- To: gcc at gcc dot gnu dot org
- Cc: brolley at redhat dot com, <neilb at earthling dot net>
- Date: Sun, 7 Apr 2002 21:47:38 -0700 (PDT)
- Subject: Current state of multibyte character support
Configuring gcc with --enable-c-mbchar allows gcc to process input files
where strings, comments, and #include filenames are encoded with
Shift-JIS, JIS, or EUC-JP instead of plain ASCII.
According to a thread I found on gcc@gcc.gnu.org from December 1998
(http://gcc.gnu.org/ml/gcc/1998-12/msg00171.html), Shift-JIS can cause
compilers trouble because the byte 0x5C (ASCII '\') appears as the second
byte of some two-byte Japanese characters. The work that I've done
recently leads me to say that problems like this are cropping up again,
especially in cpplex.c.
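
To make the failure concrete: in Shift-JIS the character U+8868 is the
byte pair 0x95 0x5C, and 0x5C is also ASCII '\'. The sketch below is
only an illustration, not code from cpplex.c or c-lex.c; the function
names are hypothetical. It contrasts a byte-wise scan for the closing
quote of a string literal with one that skips the trail byte of each
two-byte Shift-JIS character.

```c
#include <stddef.h>

/* Lead bytes of two-byte Shift-JIS characters: 0x81-0x9F and 0xE0-0xFC. */
static int sjis_is_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical byte-wise scanner: s points just past the opening quote.
   Returns the index of the closing '"', or -1 if none is found.
   Every 0x5C byte is treated as the start of an escape sequence. */
static long find_closing_quote_naive(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        if (s[i] == '\\')
            i += 2;             /* skip the backslash and the escaped byte */
        else if (s[i] == '"')
            return (long) i;
        else
            i++;
    }
    return -1;
}

/* Hypothetical Shift-JIS-aware scanner: same contract as above, but a
   lead byte consumes its trail byte, so a 0x5C trail byte is never
   mistaken for an escape. */
static long find_closing_quote_sjis(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        if (sjis_is_lead_byte(s[i]))
            i += 2;             /* skip lead byte and trail byte together */
        else if (s[i] == '\\')
            i += 2;             /* a genuine escape sequence */
        else if (s[i] == '"')
            return (long) i;
        else
            i++;
    }
    return -1;
}
```

For the bytes 0x95 0x5C '"' the byte-wise scanner reads the trail byte
as a backslash that escapes the quote and runs off the end of the
literal, while the Shift-JIS-aware scanner finds the quote at index 2.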
The most recent work on cpplex.c seems to be Neil Booth's. If I read it
correctly (http://gcc.gnu.org/ml/gcc/2000-09/msg00268.html), Neil tried
to do this work in such a way that multibyte support could eventually be
added. Does anyone know how successful that ultimately was, or what
other issues may present themselves?
Thanks much,
Matt
p.s.: I found bugs in c-lex.c:lex_string, but I believe I have them worked
out.