cpplib to-do list

Per Bothner bothner@cygnus.com
Wed May 5 15:52:00 GMT 1999


> I would certainly like to use [full character conversion] in general,
> but there is no portable way to achieve it.

There is a portable way to achieve it:  You program conversions
for the encodings you want to support.  Since we cannot assume any
multi-byte or wide character support in the C libraries, we have
to provide our own.  If people want to use an unsupported conversion,
they have to write a conversion filter.
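
To make "provide our own" concrete, here is a minimal sketch of one
such hand-written conversion, Latin-1 to UTF-8, using nothing beyond
plain C.  The function name and interface are hypothetical and not
taken from cpplib or libgcj; it only illustrates that a conversion of
this kind needs no multi-byte or wide character support from the C
library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert IN_LEN bytes of ISO-8859-1 (Latin-1)
   text to UTF-8.  Returns the number of bytes written to OUT, or
   (size_t) -1 if OUT_SIZE is too small.  */
size_t
latin1_to_utf8 (const unsigned char *in, size_t in_len,
                unsigned char *out, size_t out_size)
{
  size_t i, n = 0;

  assert (in != NULL && out != NULL);
  for (i = 0; i < in_len; i++)
    {
      unsigned char c = in[i];

      if (c < 0x80)
        {
          /* ASCII range: one byte, copied unchanged.  */
          if (n + 1 > out_size)
            return (size_t) -1;
          out[n++] = c;
        }
      else
        {
          /* 0x80..0xFF map to U+0080..U+00FF: two UTF-8 bytes.  */
          if (n + 2 > out_size)
            return (size_t) -1;
          out[n++] = 0xC0 | (c >> 6);
          out[n++] = 0x80 | (c & 0x3F);
        }
    }
  return n;
}
```

A caller would size the output buffer at twice the input length,
which is the worst case for this particular encoding.  Encodings
such as SJIS or EUCJIS need mapping tables rather than bit
arithmetic, but the shape of the interface could stay the same.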

We have a related issue in Java, where you can specify the character
encoding to use for input and output.  I have implemented the
framework in libgcj.  I have also implemented a program jv-convert
which can convert from any supported encoding to any other
supported encoding.  (Currently supported are UTF8, Latin-1,
SJIS, EUCJIS and "JavaSrc" - ASCII with \uXXXX escapes.)

There is in principle nothing preventing gcc
from calling jv-convert before calling cpp, and using jv-convert
to convert the input encoding to UTF8.  There are some practical
issues, in that jv-convert and libgcj are written in a mixture
of Java, C, and C++, plus the fact that libgcj is currently
only supported as a target library, not a host library.

One difference for Java is that "the competition" (Sun) already
does support compiling source files in a large number of input
encodings, so we know this is something we *should* support,
even if we're not going to get to it immediately.

	--Per Bothner
bothner@cygnus.com     http://www.cygnus.com/~bothner




