This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug libfortran/35863] [F2003] Implement ENCODING="UTF-8"

From: "fxcoudert at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: 15 Apr 2008 10:45:41 -0000
Subject: [Bug libfortran/35863] [F2003] Implement ENCODING="UTF-8"
References: <bug-35863-10743@http.gcc.gnu.org/bugzilla/>
Reply-to: gcc-bugzilla at gcc dot gnu dot org


------- Comment #2 from fxcoudert at gcc dot gnu dot org  2008-04-15 10:45 -------
(In reply to comment #0)
> Front end and library are ready to handle this when implemented.

Is the front-end really ready? Is ENCODING="UTF-8" related to UCS-4 support? If
it is, then the front-end is not ready: it only supports a single character kind.

(In reply to comment #1)
> This could be a bit tricky to get right. OTOH Fortran is fortunate enough that
> there are real strings and not char arrays like in C, so from a user
> perspective it should be pretty transparent.

Well, I'm not so sure it's hard. We are not required to support UTF-8 strings
as a character kind (that would be really hard), just UCS-4 strings (i.e.
UTF-32), which basically means (as I see it):
  - remove the limitation in the front-end that there is only one character kind
  - make a new character kind, as an array of 32-bit integers and a length
  - adjust library functions
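
To make the second item concrete, here is a rough sketch in C of what "an array
of 32-bit integers and a length" could look like. The names ucs4_char,
ucs4_string, ucs4_from_bytes and ucs4_free are hypothetical and are not taken
from libgfortran; the widening helper assumes the default character kind holds
one byte per character whose value equals its code point (true for ASCII and
ISO-8859-1).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names, for illustration only (not libgfortran's). */
typedef uint32_t ucs4_char;

typedef struct {
    ucs4_char *data;  /* one 32-bit element per character */
    size_t     len;   /* length in characters, not bytes  */
} ucs4_string;

/* Widen a one-byte-per-character string to UCS-4.
   Assumes each byte value is already the character's code point.
   Returns 0 on success, -1 if allocation fails. */
int ucs4_from_bytes(ucs4_string *dst, const unsigned char *src, size_t len)
{
    /* Allocate at least one element so a zero-length string still
       gets a valid pointer. */
    dst->data = malloc((len ? len : 1) * sizeof *dst->data);
    if (dst->data == NULL)
        return -1;
    for (size_t i = 0; i < len; i++)
        dst->data[i] = src[i];
    dst->len = len;
    return 0;
}

/* Release the storage and reset the descriptor. */
void ucs4_free(ucs4_string *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

Keeping the length next to the data, rather than relying on a terminator, is
what lets the library functions treat both kinds the same way apart from the
element width.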

Then, I/O with UTF-8 encoding just needs UTF-8 <--> UTF-32 conversions, which
take only a few dozen lines of code (unless I'm confused).
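
As a rough check on that size estimate, a minimal sketch of the two conversions
in C might look like the following. The function names utf8_encode and
utf8_decode are made up for this example and are not libgfortran entry points.
The sketch handles one code point at a time and rejects surrogates, overlong
forms and values above 0x10FFFF; buffering and error reporting in a real I/O
layer would add to it.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one UCS-4 code point as UTF-8 into out[].
   Returns the number of bytes written (1..4), or 0 if cp is not
   a valid code point (surrogate or above 0x10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                      /* surrogates are not characters */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Decode one code point from the first n bytes of s.
   Returns the number of bytes consumed (1..4), or 0 on a truncated
   or malformed sequence; *cp is only written on success. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    /* Smallest code point that legitimately needs each length. */
    static const uint32_t min_for_len[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t len, i;
    uint32_t c;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        len = 2; c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        len = 3; c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        len = 4; c = s[0] & 0x07;
    } else {
        return 0;                      /* stray continuation or bad lead byte */
    }
    if (n < len)
        return 0;                      /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                  /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_for_len[len] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;                      /* overlong, out of range, or surrogate */
    *cp = c;
    return len;
}
```

On output the library would call the encoder once per element of the UCS-4
string; on input it would call the decoder repeatedly, advancing by the
returned byte count.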


-- 

fxcoudert at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fxcoudert at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35863

References:
- [Bug libfortran/35863] New: [F2003] Implement ENCODING="UTF-8"
  - From: jvdelisle at gcc dot gnu dot org
