cpplib to-do list
Per Bothner
bothner@cygnus.com
Wed May 5 15:52:00 GMT 1999
> I would certainly like to use [full character conversion] in general,
> but there is no portable way to achieve it.
There is a portable way to achieve it: you program conversions
for the encodings you want to support. Since we cannot assume any
multi-byte or wide character support in the C libraries, we have
to provide our own. If people want to use an unsupported conversion,
they have to write a conversion filter.
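
To give an idea of what "providing our own" conversion means, here is a
minimal sketch of one such conversion, Latin-1 to UTF-8, written without
any help from the C library's multi-byte or wide character functions.
The function name and interface are hypothetical; nothing like this
exists in cpplib under that name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of cpplib or libgcj: convert LEN bytes
   of Latin-1 (ISO 8859-1) text in IN to UTF-8 in OUT.  OUT must have
   room for 2 * LEN bytes, the worst case.  Returns the number of bytes
   written to OUT.  */
size_t
latin1_to_utf8 (const unsigned char *in, size_t len, unsigned char *out)
{
  size_t n = 0;
  size_t i;

  assert (in != NULL && out != NULL);
  for (i = 0; i < len; i++)
    {
      unsigned char c = in[i];

      if (c < 0x80)
        /* ASCII is encoded as itself.  */
        out[n++] = c;
      else
        {
          /* 0x80..0xFF map to the same code points in Unicode and
             take two bytes: 110xxxxx 10xxxxxx.  */
          out[n++] = (unsigned char) (0xC0 | (c >> 6));
          out[n++] = (unsigned char) (0x80 | (c & 0x3F));
        }
    }
  return n;
}
```

Latin-1 is the easy case because its byte values are the Unicode code
points. SJIS or EUCJIS would need mapping tables, which is where most of
the work in supporting an encoding goes.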
We have a related issue in Java, where you can specify the character
encoding to use for input and output. I have implemented the
framework in libgcj. I have also implemented a program, jv-convert,
which can convert from any supported encoding to any other
supported encoding. (Currently supported are UTF8, Latin-1,
SJIS, EUCJIS and "JavaSrc" - ASCII with \uxxxx escapes.)
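
As an illustration of the "JavaSrc" form, here is a rough sketch in C of
decoding ASCII text with backslash-u escapes into UTF-8. This is not how
jv-convert is implemented; the function names are made up, and the sketch
simplifies the Java rules: it does not treat a doubled backslash
specially, does not accept repeated 'u' characters, and does not combine
surrogate pairs.

```c
#include <assert.h>
#include <stddef.h>

/* Value of hex digit C, or -1 if C is not a hex digit.  */
static int
hex_value (int c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  if (c >= 'A' && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

/* Hypothetical helper: decode LEN bytes of ASCII text in IN, in which a
   backslash, a 'u' and four hex digits stand for one 16-bit character,
   into UTF-8 in OUT.  OUT needs room for LEN bytes, since a six-byte
   escape yields at most three bytes.  Malformed escapes are copied
   through unchanged.  Returns the number of bytes written.  */
size_t
javasrc_to_utf8 (const char *in, size_t len, unsigned char *out)
{
  size_t i = 0, n = 0;

  assert (in != NULL && out != NULL);
  while (i < len)
    {
      unsigned int cp = 0;
      int ok = 0;

      if (in[i] == '\\' && i + 5 < len && in[i + 1] == 'u')
        {
          int k;

          ok = 1;
          for (k = 2; k < 6; k++)
            {
              int h = hex_value ((unsigned char) in[i + k]);

              if (h < 0)
                {
                  ok = 0;
                  break;
                }
              cp = cp * 16 + (unsigned int) h;
            }
        }

      if (!ok)
        {
          /* Ordinary character, or an escape we could not parse.  */
          out[n++] = (unsigned char) in[i++];
          continue;
        }

      i += 6;
      if (cp < 0x80)
        out[n++] = (unsigned char) cp;
      else if (cp < 0x800)
        {
          out[n++] = (unsigned char) (0xC0 | (cp >> 6));
          out[n++] = (unsigned char) (0x80 | (cp & 0x3F));
        }
      else
        {
          out[n++] = (unsigned char) (0xE0 | (cp >> 12));
          out[n++] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
          out[n++] = (unsigned char) (0x80 | (cp & 0x3F));
        }
    }
  return n;
}
```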
There is in principle nothing preventing gcc
from calling jv-convert before calling cpp, and using jv-convert
to convert the input encoding to UTF8. There are some practical
issues, in that jv-convert and libgcj are written in a mixture
of Java, C, and C++, plus the fact that libgcj is currently
only supported as a target library, not a host library.
One difference for Java is that "the competition" (Sun) already
does support compiling source files in a large number of input
encodings, so we know this is something we *should* support,
even if we're not going to get to it immediately.
--Per Bothner
bothner@cygnus.com http://www.cygnus.com/~bothner