This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Input charsets - What's going on?


cppcharset.c's _cpp_default_encoding uses nl_langinfo (CODESET) to find the default input charset. This usage is guarded by a configure script symbol, HAVE_LANGINFO_CODESET. Unfortunately, in the current GCC sources HAVE_LANGINFO_CODESET is never defined and is not even in the auto-host.h template. This means that GCC never attempts any input charset conversion unless -finput-charset is given.
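For readers unfamiliar with the pattern, here is a minimal sketch of a guarded nl_langinfo (CODESET) lookup. It is not the code from cppcharset.c: the function name default_input_charset is hypothetical, and the macro is defined by hand here only to stand in for what a configure test would normally provide.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Assumption: in a real build this macro comes from the configure script.
   It is set by hand here so that the sketch compiles on POSIX systems. */
#if defined(__unix__) || defined(__APPLE__)
# define HAVE_LANGINFO_CODESET 1
#endif

#ifdef HAVE_LANGINFO_CODESET
# include <langinfo.h>
#endif

/* Hypothetical helper: returns the codeset name of the current locale,
   or NULL when it cannot be determined (macro undefined or empty name).
   The caller is expected to have called setlocale() beforehand. */
static const char *default_input_charset(void)
{
#ifdef HAVE_LANGINFO_CODESET
    const char *cs = nl_langinfo(CODESET);
    if (cs != NULL && *cs != '\0')
        return cs;
#endif
    return NULL;
}
```

When the macro is left undefined, the helper always returns NULL, which mirrors the situation described above: the lookup is compiled out and no default conversion is ever attempted.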

While moving libcpp to the toplevel, I put the test correctly in the libcpp configure script so that nl_langinfo (CODESET) is now used to find out the default input charset. The problem is that this fails miserably. fold-const.c and java/typeck.c had two non-breaking spaces (Unicode 160), for which I have already committed fixes, and libgfortran's files contain Tobi's name, which sports a u-umlaut character (Unicode 252). Both of these break under the default codeset on my machine, which is ANSI_X3.4-1968, and cause the bootstrap to fail because of a conversion failure.
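ANSI_X3.4-1968 is plain 7-bit ASCII, so any byte above 0x7F in a source file cannot be converted from it. The sketch below is one way to locate such bytes; the function name first_non_ascii is hypothetical and is not part of libcpp. It makes no assumption about how the files are actually encoded, since characters 160 and 252 produce bytes above 0x7F in both Latin-1 and UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper: returns the offset of the first byte that is not
   7-bit ASCII, or -1 if the whole buffer is plain ASCII. */
static long first_non_ascii(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] > 0x7F)
            return (long)i;
    }
    return -1;
}
```

Running something like this over the contents of each source file would point at the offending characters before the bootstrap trips over them.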

Is the feature broken by design? How should I proceed in the standalone libcpp?

Paolo

