This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Input charsets - What's going on?
- From: Paolo Bonzini <bonzini at gnu dot org>
- To: GCC Development <gcc at gcc dot gnu dot org>
- Cc: Eric Christopher <echristo at redhat dot com>
- Date: Wed, 19 May 2004 18:36:09 +0200
- Subject: Input charsets - What's going on?
cppcharset.c's _cpp_default_encoding uses nl_langinfo (CODESET) to find
the default input charset. This use is guarded by a configure script
symbol, HAVE_LANGINFO_CODESET. Unfortunately, in the current GCC sources
HAVE_LANGINFO_CODESET is never defined and is not even in the
auto-host.h template. This means that GCC never attempts any input
charset conversion unless -finput-charset is given.
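For readers who have not seen the pattern, here is a minimal sketch of a guarded codeset lookup. It is not the actual cppcharset.c code: the helper name default_input_charset is hypothetical, and calling setlocale inside the helper is an assumption made so the sketch stands on its own.

```c
#include <stddef.h>
#include <locale.h>
#ifdef HAVE_LANGINFO_CODESET   /* normally defined by the configure script */
# include <langinfo.h>
#endif

/* Hypothetical helper, not the real libcpp function: return the name of
   the locale's codeset, or NULL when it cannot be determined.  */
static const char *
default_input_charset (void)
{
#ifdef HAVE_LANGINFO_CODESET
  const char *cs;

  /* Assumption: select the user's locale here; a real program would
     normally do this once at startup.  */
  setlocale (LC_CTYPE, "");
  cs = nl_langinfo (CODESET);
  return (cs && *cs) ? cs : NULL;
#else
  /* The symbol is not defined, so the lookup is compiled out and the
     caller sees no default charset at all.  */
  return NULL;
#endif
}
```

If HAVE_LANGINFO_CODESET is never defined, only the #else branch is compiled, which matches the behaviour described above: no default charset is found, so no conversion is attempted.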
While moving libcpp to the toplevel, I put the test correctly in the
libcpp configure script so that nl_langinfo (CODESET) is now used to
find out the default input charset. Problem is, this fails miserably.
fold-const.c and java/typeck.c had two non-breaking spaces (Unicode 160),
for which I have already committed fixes, and libgfortran's files have
Tobi's name in them, which includes a u-umlaut character (Unicode 252).
Neither character is valid in the default codeset on my machine, which is
ANSI_X3.4-1968 (plain ASCII), so the bootstrap stops with a conversion
failure.
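To find such characters before they break a build, a byte scan is enough, because any non-ASCII character is stored using at least one byte above 0x7F whatever the file's encoding. The sketch below is only an illustration; the helper name first_non_ascii is hypothetical and is not part of GCC.

```c
#include <stddef.h>

/* Hypothetical helper: return the offset of the first byte that is not
   7-bit ASCII, or -1 if the whole buffer is ASCII.  */
static long
first_non_ascii (const unsigned char *buf, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    if (buf[i] > 0x7F)
      return (long) i;
  return -1;
}
```

Run over a source file's contents, this reports the position of the first byte that an ASCII-only conversion would reject, such as 0xA0 or 0xFC in a Latin-1 encoded file.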
Is the feature broken by design? How should I proceed in the standalone
libcpp?
Paolo