This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [patch, fortran] Wide character I/O Part 1
- From: Tobias Burnus <burnus at net-b dot de>
- To: Jerry DeLisle <jvdelisle at verizon dot net>
- Cc: Fortran List <fortran at gcc dot gnu dot org>, gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 29 May 2008 20:55:58 +0200
- Subject: Re: [patch, fortran] Wide character I/O Part 1
- References: <483E0FBE.8090301@verizon.net>
Jerry DeLisle wrote:
> This patch implements wide character I/O with default encoding for
> list-directed formatted and formatted I/O.
Great! I have to admit that I have only tested it loosely.
I have a question: Can writing character(kind=4) with encoding="default"
be implemented such that Latin-1 (ISO-8859-1) characters are not
translated into '?' but come out as 8-bit characters?
In principle, I expected the following to do that, but my tests fail:
+ *p++ = (unsigned int) source[i * kind] > 255 ?
+ '?' : source[i * kind + endian_off];
Test case:
(Note that some format checking also goes wrong.)
! Compile with -fbackslash
character(kind=4,len=20) :: str = 4_'X\xF8öABC'
!print '(3a0)', ':',trim(str),':' ! REJECTED, but valid?
print *, ':',trim(str),':'
end
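To make the question concrete, here is a minimal C sketch of the narrowing I would expect. It is not the code from the patch: narrow_char4 and narrow_string4 are hypothetical names, and the sketch assumes the kind=4 string is available as an array of 32-bit code points. Comparing the whole 32-bit value against 255, rather than a single byte of it, keeps the result independent of byte order and of the signedness of plain char.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: narrow one 4-byte
   character to a single byte.  Code points 0..255 coincide with
   ISO-8859-1, so they pass through unchanged; anything larger has
   no 8-bit representation and becomes '?'.  */
unsigned char
narrow_char4 (uint32_t c)
{
  return c > 255 ? '?' : (unsigned char) c;
}

/* Hypothetical helper: narrow LEN characters from SRC into DST.
   Each element is read as a whole uint32_t, so the range check
   sees the full code point rather than one of its bytes.  */
void
narrow_string4 (unsigned char *dst, const uint32_t *src, size_t len)
{
  assert (dst != NULL && src != NULL);
  for (size_t i = 0; i < len; i++)
    dst[i] = narrow_char4 (src[i]);
}
```

With this sketch, the \xF8 in the test string above would be written as the single byte 0xF8, and only characters above U+00FF would be replaced by '?'.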
> I don't want to let the patch get too big so if I may, I would like to
> commit in parts. Part 2 will take care of anything remaining for
> unformatted wide char I/O and move on to ENCODING="UTF-8".
Great. With the next patch we should have the complete ISO 10646 and
UTF-8 support implemented as described in the Fortran 2003 standard.
(Modulo bugs and enhancements beyond the standard.) I think that your
next patch can then also enable selected_char_kind('ISO 10646') (see
gfc_simplify_selected_char_kind).
> I have regression tested this on x86-64 and ppc64. The patch takes
> care of endianness.
> OK for trunk?
Looks OK. Thanks for the patch.
Tobias