[Bug fortran/45179] New: Support UTF-8 (and other encodings) in the source file (.f90) for CHARACTER(kind=4)

burnus at gcc dot gnu dot org gcc-bugzilla@gcc.gnu.org
Wed Aug 4 10:16:00 GMT 2010


libcpp allows one to enter non-ASCII characters directly in source files (.f90
etc.); the encoding used can be set with the option:

  -finput-charset=UTF-8

Cf. also: -fexec-charset and -fwide-exec-charset
and http://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html


If one uses   gfortran -cpp -finput-charset=UTF-8 wide.f90
one currently gets the warning:

f951: warning: command line option "-finput-charset=UTF-8" is valid for
C/C++/ObjC/ObjC++ but not for Fortran [enabled by default]


The files scanner.c etc. already support reading wide chars; thus, in
principle, only a few changes should be required.

Caveat: Many people still use kind=1 strings - but with non-ASCII characters;
one should try to make sure that this continues to work. Stuffing the
characters in unchanged, as is done currently, is one option. For Latin1
(ISO 8859-1) characters one can also simply strip off the high bytes and write
only the lowest byte. Using UTF-8 also works - though len() will then report
too many characters, since each non-ASCII character occupies more than one
byte.
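
To make the two kind=1 options concrete, here is a minimal sketch in C (the
language of scanner.c). It is not gfortran code: narrow_latin1 and encode_utf8
are hypothetical helper names, and the input is assumed to be an array of
Unicode code points such as a kind=4 string would hold. The return values show
the length that a byte-counting len() would report in each case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of gfortran): narrow code points to
   Latin1 by dropping the high bytes and keeping only the lowest one.
   Returns the number of bytes written, or (size_t)-1 if a code point
   lies above U+00FF and therefore has no Latin1 representation. */
size_t narrow_latin1(const uint32_t *src, size_t n, unsigned char *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        if (src[i] > 0xFF)
            return (size_t)-1;
        dst[i] = (unsigned char)src[i];
    }
    return n;
}

/* Hypothetical helper: encode code points as UTF-8. dst must have room
   for 4 * n bytes. Returns the number of bytes written. Surrogates and
   values above U+10FFFF are not rejected in this sketch. */
size_t encode_utf8(const uint32_t *src, size_t n, unsigned char *dst)
{
    assert(src != NULL && dst != NULL);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t c = src[i];
        if (c < 0x80) {                 /* 1 byte: plain ASCII */
            dst[out++] = (unsigned char)c;
        } else if (c < 0x800) {         /* 2 bytes */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {       /* 3 bytes */
            dst[out++] = (unsigned char)(0xE0 | (c >> 12));
            dst[out++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        } else {                        /* 4 bytes */
            dst[out++] = (unsigned char)(0xF0 | (c >> 18));
            dst[out++] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
            dst[out++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

For the four code points G, r, U+00FC, n, narrow_latin1 writes 4 bytes while
encode_utf8 writes 5, which is the len() discrepancy mentioned above. A code
point such as U+20AC cannot be narrowed to Latin1 at all, so the sketch
reports failure there instead of silently truncating.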


-- 
           Summary: Support UTF-8 (and other encodings) in the source file
                    (.f90) for CHARACTER(kind=4)
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: burnus at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45179


