Source character set detection

gcc-help@xargs.com
Mon Jun 28 12:06:00 GMT 2010


When I compile this code:

#include <stdio.h>

int main(void)
{
     char c = 'ü';                      /* ISO-8859-1 0xFC */

     printf("%c\n", c);

     return 0;
}

with gcc 4.3.2 under Linux, with the locale specifying UTF-8 encoding
but the source file encoded as ISO-8859-1, I don't get any
diagnostics, and the printf outputs the single byte 0xFC.  I get the
same results if I compile with

-finput-charset=iso8859-1 -fexec-charset=iso8859-1

or

-finput-charset=utf-8 -fexec-charset=utf-8

My understanding is that gcc should default to UTF-8 source encoding,
and should give a diagnostic when it encounters the illegal UTF-8 start
byte of 0xFC.  I get the expected diagnostic if I compile with

-finput-charset=utf-8 -fexec-charset=iso8859-1

(converting to execution character set: Invalid argument)
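
For reference, here is a minimal sketch of why I expect 0xFC to be rejected as the start of a UTF-8 sequence. It assumes the RFC 3629 definition of UTF-8, where sequences are at most four bytes long. The function name utf8_lead_length is hypothetical, and this is not meant to describe how gcc itself validates its input.

```c
/* Hypothetical helper, not part of gcc: returns the sequence length that a
   UTF-8 lead byte announces under RFC 3629, or 0 if the byte cannot start
   a sequence at all. */
int utf8_lead_length(unsigned char b)
{
    if (b <= 0x7F)
        return 1;                       /* plain ASCII */
    if (b >= 0xC2 && b <= 0xDF)
        return 2;                       /* two-byte sequence */
    if (b >= 0xE0 && b <= 0xEF)
        return 3;                       /* three-byte sequence */
    if (b >= 0xF0 && b <= 0xF4)
        return 4;                       /* four-byte sequence */

    /* 0x80-0xBF are continuation bytes, 0xC0 and 0xC1 could only begin
       overlong encodings, and 0xF5-0xFF are unused in RFC 3629. */
    return 0;
}
```

With this check, utf8_lead_length(0xFC) returns 0. A source file that really was UTF-8 would instead store the same character as the two bytes 0xC3 0xBC, and 0xC3 is accepted as a two-byte lead.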

Is gcc detecting the source character set and switching the execution
character set to match it?  I couldn't find this mentioned in the
documentation.

--
John W. Temples, III


