[Bug preprocessor/92987] New: -finput-charset is only usable with encodings that are supersets of ASCII
lhyatt at gmail dot com
gcc-bugzilla@gcc.gnu.org
Wed Dec 18 15:45:00 GMT 2019
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92987
Bug ID: 92987
Summary: -finput-charset is only usable with encodings that are
supersets of ASCII
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: preprocessor
Assignee: unassigned at gcc dot gnu.org
Reporter: lhyatt at gmail dot com
Target Milestone: ---
-finput-charset accepts any encoding supported by iconv, and UTF-16 and UTF-32
are additionally handled directly by routines in libcpp/charset.c. However,
the option does not seem to be usable in practice unless the chosen encoding
is a superset of ASCII, because it also applies to every header file included
from the source. Even an empty source file implicitly includes
/usr/include/stdc-predef.h, so nothing can be compiled with, say,
-finput-charset=UTF-32LE:
$ echo -n > t.c
$ gcc -S -finput-charset=UTF-32LE t.c
cc1: error: failure to convert UTF-32LE to UTF-8
The error comes while processing stdc-predef.h.
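To illustrate why an ASCII header cannot survive this conversion, here is a
minimal sketch in Python (not GCC's actual code path, which goes through
libcpp/charset.c and iconv). The header line used is an assumed example of
what an ASCII header might begin with, and try_decode is a hypothetical
helper. Read as UTF-32LE, the first four ASCII bytes form a single code unit
well above the Unicode maximum of 0x10FFFF, so the decoder has to reject it:

```python
# Assumed example of the first line of an ASCII-encoded system header.
ascii_header = b"#ifndef _STDC_PREDEF_H\n"


def try_decode(data: bytes, charset: str) -> str:
    """Hypothetical helper: return 'ok' if data decodes, else the error reason."""
    try:
        data.decode(charset)
        return "ok"
    except UnicodeDecodeError as exc:
        return f"error: {exc.reason}"


# Interpret the first four bytes ('#', 'i', 'f', 'n') as one little-endian
# 32-bit code unit, which is what a UTF-32LE decoder does.
first_unit = int.from_bytes(ascii_header[:4], "little")

print(hex(first_unit))                        # 0x6e666923
print(first_unit > 0x10FFFF)                  # True: not a valid code point
print(try_decode(ascii_header, "utf-32-le"))  # an error, reason text varies
print(try_decode(ascii_header, "utf-8"))      # ok
```

The same bytes decode cleanly as UTF-8 because UTF-8 is a superset of ASCII,
which matches the observation that only such encodings are usable here.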
I was about to work on adding support for -finput-charset to the diagnostics
infrastructure (which currently ignores it). However, it seems that this issue
should be dealt with first, since fixing it may entail adding the notion that
different source files have different input encodings. I am not sure what the
desired way to address it would be. Are there use cases where it is desirable
for -finput-charset to apply to the #includes too (I guess systems could exist
where the system headers are not ASCII)? Would it make sense to add a new
option that changes the charset only for source files, and not for the
#includes? Or maybe it should apply to "..." includes but not to <...>
includes, or something like that?
-Lewis