This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug fortran/48972] OPEN with Unicode file name
- From: "burnus at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 12 May 2011 14:23:06 +0000
- Subject: [Bug fortran/48972] OPEN with Unicode file name
- Auto-submitted: auto-generated
- References: <bug-48972-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48972
--- Comment #4 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-05-12 13:37:34 UTC ---
(In reply to comment #3)
> Wouldn't a standard-conforming way to support Unicode file names be for
> gfortran to
I am admittedly a bit lost.
> - Specify that the default character set is UTF-8.
What do you mean by that? I know 1-byte and 4-byte character variables, but I
do not see where UTF-8 fits in there. (One can place UTF-8 into
character(kind=1), and it also kind of works OK. But if one wants to use
len(), string manipulation ("change character 3 to ..."), or tabulated I/O, that
will fail. As a quirky workaround, though, one can use UTF-8 file names with
kind=1 character variables, at least under Unix/Linux.)
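To make the len() problem concrete, here is a minimal sketch in C rather than Fortran (my choice for illustration only). It assumes valid UTF-8 input, and the helper name utf8_codepoints is hypothetical. A byte-oriented length, which is what len() reports for a kind=1 variable, counts bytes, while the number of characters is smaller as soon as a non-ASCII character appears:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the code points in a NUL-terminated UTF-8
   string by skipping continuation bytes (bit pattern 10xxxxxx).
   Assumes the input is valid UTF-8; no validation is performed. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;            /* lead byte or ASCII byte starts a character */
    return n;
}
```

For the file name "Grüße" stored as UTF-8 (ü = C3 BC, ß = C3 9F), the byte length is 7 while utf8_codepoints returns 5, so "character 3" and "byte 3" no longer refer to the same thing.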
Regarding the ENCODING= specifier: That's already used for the encoding of the
file content - one shan't use it to also modify the interpretation of the FILE
string.
I still think that the default character encoding should remain 1 byte
(kind=1), which is simply passed as is to "open()". And UCS-4 as FILE= argument
should simply be supported as a vendor extension. One just needs to tell the
library that the string is in UCS-4. This wide string could then be used
directly for Windows' _wopen or converted to UTF-8 for Unix/Linux. (The
conversion routine exists for UCS-4 <-> UTF-8 I/O.)
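As a rough sketch of the Unix/Linux side of that idea, the following C function encodes a UCS-4 string as UTF-8 so that the result could be handed to open(). It is not the existing libgfortran conversion routine. The name ucs4_to_utf8, its signature, and its error convention are assumptions made for illustration. (On Windows, wchar_t is 16 bits wide, so a UCS-4 string would presumably need a UTF-16 conversion before _wopen; that is not shown here.)

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not libgfortran's routine.
   Encodes n UCS-4 code points from src as UTF-8 into dst (capacity cap
   bytes) and NUL-terminates the result. Returns the number of bytes
   written, excluding the NUL, or (size_t)-1 if a code point is invalid
   or the buffer is too small. */
size_t ucs4_to_utf8(const uint32_t *src, size_t n, char *dst, size_t cap)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    if (cap == 0)
        return (size_t)-1;

    for (size_t i = 0; i < n; i++) {
        uint32_t c = src[i];
        size_t need;

        if (c >= 0xD800 && c <= 0xDFFF)
            return (size_t)-1;              /* surrogates are not characters */
        if (c < 0x80)           need = 1;
        else if (c < 0x800)     need = 2;
        else if (c < 0x10000)   need = 3;
        else if (c <= 0x10FFFF) need = 4;
        else return (size_t)-1;             /* beyond the Unicode range */

        if (out + need >= cap)              /* keep one byte for the NUL */
            return (size_t)-1;

        switch (need) {
        case 1:
            dst[out++] = (char)c;
            break;
        case 2:
            dst[out++] = (char)(0xC0 | (c >> 6));
            dst[out++] = (char)(0x80 | (c & 0x3F));
            break;
        case 3:
            dst[out++] = (char)(0xE0 | (c >> 12));
            dst[out++] = (char)(0x80 | ((c >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (c & 0x3F));
            break;
        default:
            dst[out++] = (char)(0xF0 | (c >> 18));
            dst[out++] = (char)(0x80 | ((c >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((c >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (c & 0x3F));
            break;
        }
    }
    dst[out] = '\0';
    return out;
}
```

The library would call something like this only when FILE= is a kind=4 string. The kind=1 path would stay untouched and keep passing its bytes as is.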