This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug libfortran/35863] [F2003] Implement ENCODING="UTF-8"



------- Comment #2 from fxcoudert at gcc dot gnu dot org  2008-04-15 10:45 -------
(In reply to comment #0)
> Front end and library are ready to handle this when implemented.

Is the front-end ready? Is ENCODING="UTF-8" related to UCS-4 support? Because if
it is, then the front-end is not ready: it only supports a single character kind.

(In reply to comment #1)
> This could be a bit tricky to get right. OTOH Fortran is fortunate enough that
> there are real strings and not char arrays like in C, so from a user
> perspective it should be pretty transparent.

Well, I'm not too sure it's hard. We are not required to support UTF-8 strings
as a character kind (that would be really hard) but just UCS-4 strings (i.e.
UTF-32), which is basically (as I see it):
  - remove limitations in the front-end that there is only one character kind
  - make a new character kind, as an array of 32-bit integers and a length
  - adjust library functions
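
To make the second and third points concrete, here is a rough sketch in C of
what "an array of 32-bit integers and a length" could look like, together with
one library function adjusted for it. The names ucs4_string and ucs4_compare
are hypothetical and not taken from the GCC sources; the only Fortran rule
assumed is that a comparison treats the shorter string as blank-padded.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UCS-4 string: 32-bit code units plus an explicit length.
   There is no terminating NUL, unlike a C string.  */
typedef struct
{
  const uint32_t *data;
  size_t len;
} ucs4_string;

/* Fortran-style comparison: the shorter operand is treated as if it were
   padded with blanks (U+0020).  Returns -1, 0 or 1.  */
static int
ucs4_compare (ucs4_string a, ucs4_string b)
{
  size_t n = a.len > b.len ? a.len : b.len;

  for (size_t i = 0; i < n; i++)
    {
      uint32_t ca = i < a.len ? a.data[i] : 0x20;
      uint32_t cb = i < b.len ? b.data[i] : 0x20;
      if (ca != cb)
        return ca < cb ? -1 : 1;
    }
  return 0;
}
```

Since each element holds one whole code point, the existing per-character
logic carries over almost unchanged; only the element width differs.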

Then, I/O with UTF-8 encoding just needs UTF-8 <--> UTF-32 conversions, which
is only a few dozen lines of code (unless I'm confused).
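
As a sanity check on the size estimate, a minimal sketch of the two
conversions in C might look like the following. The function names
utf8_encode and utf8_decode are made up for illustration and are not
libgfortran's; the sketch handles one code point at a time and rejects
surrogates, overlong forms and values above U+10FFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-8 into out (room for 4 bytes needed).
   Returns the number of bytes written, or 0 if cp cannot be encoded.  */
static size_t
utf8_encode (uint32_t cp, unsigned char *out)
{
  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp < 0x10000)
    {
      if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;               /* surrogates are not valid code points */
      out[0] = (unsigned char) (0xE0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (cp & 0x3F));
      return 3;
    }
  if (cp <= 0x10FFFF)
    {
      out[0] = (unsigned char) (0xF0 | (cp >> 18));
      out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
      out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[3] = (unsigned char) (0x80 | (cp & 0x3F));
      return 4;
    }
  return 0;
}

/* Decode one code point from in[0..len).  Stores it in *cp and returns the
   number of bytes consumed, or 0 on truncated or malformed input.  */
static size_t
utf8_decode (const unsigned char *in, size_t len, uint32_t *cp)
{
  /* Smallest code point that legitimately needs n bytes, indexed by n.  */
  static const uint32_t min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
  size_t n;
  uint32_t c;

  if (len == 0)
    return 0;
  if (in[0] < 0x80)
    { n = 1; c = in[0]; }
  else if ((in[0] & 0xE0) == 0xC0)
    { n = 2; c = in[0] & 0x1F; }
  else if ((in[0] & 0xF0) == 0xE0)
    { n = 3; c = in[0] & 0x0F; }
  else if ((in[0] & 0xF8) == 0xF0)
    { n = 4; c = in[0] & 0x07; }
  else
    return 0;                   /* stray continuation or invalid lead byte */

  if (len < n)
    return 0;
  for (size_t i = 1; i < n; i++)
    {
      if ((in[i] & 0xC0) != 0x80)
        return 0;               /* not a continuation byte */
      c = (c << 6) | (in[i] & 0x3F);
    }
  if (c < min_cp[n] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
    return 0;                   /* overlong, out of range, or surrogate */
  *cp = c;
  return n;
}
```

A real I/O path would also need buffering across record boundaries and an
error-reporting policy, which this sketch leaves out.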


-- 

fxcoudert at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fxcoudert at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35863

