This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug libfortran/35863] [F2003] Implement ENCODING="UTF-8"
- From: "fxcoudert at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 15 Apr 2008 10:45:41 -0000
- Subject: [Bug libfortran/35863] [F2003] Implement ENCODING="UTF-8"
- References: <bug-35863-10743@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #2 from fxcoudert at gcc dot gnu dot org 2008-04-15 10:45 -------
(In reply to comment #0)
> Front end and library are ready to handle this when implemented.
Is the front-end really ready? Is ENCODING="UTF-8" related to UCS-4 support? If it
is, then the front-end is not ready, because it only supports a single character kind.
(In reply to comment #1)
> This could be a bit tricky to get right. OTOH Fortran is fortunate enough that
> there are real strings and not char arrays like in C, so from a user
> perspective it should be pretty transparent.
Well, I'm not too sure it's hard. We are not required to support UTF-8 strings
as a character kind (that would be really hard) but just UCS-4 strings (i.e.
UTF-32), which basically means (as I see it):
- remove the front-end limitation that there is only one character kind
- make a new character kind, as an array of 32-bit integers and a length
- adjust library functions
Then, I/O with UTF-8 encoding just needs UTF-8 <--> UTF-32 conversions, which
is only a few dozen lines of code (unless I'm confused).
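
To give a rough idea of the size of those conversions, here is a minimal sketch
in C of converting a single code point in each direction. The function names
utf8_encode and utf8_decode are hypothetical and are not taken from libgfortran;
the sketch assumes UCS-4 values are held in a uint32_t and rejects surrogates,
overlong forms and values above 0x10FFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one UCS-4 code point as UTF-8.
   Returns the number of bytes written (1-4), or 0 if cp is not encodable. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                               /* beyond the Unicode range */
}

/* Hypothetical helper: decode one UTF-8 sequence from s (n bytes available).
   Returns the number of bytes consumed (1-4), or 0 on malformed input. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    /* Smallest code point that legitimately needs a sequence of each length. */
    static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t len, i;
    uint32_t c;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0)      { len = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
    else
        return 0;                           /* stray continuation or bad lead byte */
    if (n < len)
        return 0;                           /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                       /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_cp[len] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;                           /* overlong, out of range, or surrogate */
    *cp = c;
    return len;
}
```

A real I/O layer would call such helpers in a loop over the record buffer and
would still have to decide how to report malformed input; that part is left
out here.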
--
fxcoudert at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fxcoudert at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35863