This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: cpplib to-do list

To: Branko Cibej <branko dot cibej at hermes dot si>
Subject: Re: cpplib to-do list
From: Per Bothner <bothner at cygnus dot com>
Date: Wed, 12 May 1999 13:33:10 -0700
cc: egcs at egcs dot cygnus dot com

> Quite. That's what internationalisation and localisation are for. Are you
> suggesting the compiler should do that instead of the programmer?

Well, yes, the compiler should translate from the source encoding to the
internal encoding.  That is what you have to do for Java.
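
As a rough illustration of that translation step (a Python sketch, not
anything cpplib does today; the helper name to_internal and the choice of
ISO-8859-1 as the source encoding are assumptions made for the example):

```python
def to_internal(source_bytes, source_encoding, internal_encoding="utf-8"):
    """Re-encode source text from its on-disk encoding to the internal one.

    Hypothetical helper for illustration only.
    """
    text = source_bytes.decode(source_encoding)   # bytes -> abstract characters
    return text.encode(internal_encoding)         # characters -> internal bytes

# A source line stored as ISO-8859-1; 0xE9 is e-acute in that encoding.
latin1_source = b'char *s = "caf\xe9";'
internal = to_internal(latin1_source, "iso-8859-1")
```

Everything after this step would see only the internal encoding, whatever
encoding the source file happened to use.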

My point was that there is no reason to assume that the internal
encoding should be the same as the source encoding.  However, it is
one option that we should support.

Note that I am not proposing that we select Unicode/UTF8 as the only
supported internal encoding.  However, it should be the
default, and it is the one we should expend effort on so that it is
well supported by GNU tools and libraries.

> And these conversion must preserve information

Impossible.  Most non-identity conversions that people need to do
lose information.  The one closest to not losing information in
practice is to convert to Unicode (or UTF8).
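
A small sketch of what losing information means here, again in Python and
only as an illustration; the helper round_trips and the sample text are
assumptions, not part of any proposal in this thread:

```python
def round_trips(text, via_encoding):
    """Return True if text survives encoding to via_encoding and back."""
    try:
        return text.encode(via_encoding).decode(via_encoding) == text
    except UnicodeEncodeError:
        # At least one character has no representation in via_encoding.
        return False

# Text that originated in ISO-8859-2 (0xA3 and 0xBC have no ISO-8859-1 equivalent).
latin2_bytes = b"\xa3\xf3d\xbc"
text = latin2_bytes.decode("iso-8859-2")

survives_utf8 = round_trips(text, "utf-8")         # Unicode can hold every character
survives_latin1 = round_trips(text, "iso-8859-1")  # two characters cannot be encoded

# Forcing the conversion anyway replaces the unencodable characters.
forced = text.encode("iso-8859-1", errors="replace")
```

Going through UTF8 gets the original bytes back; going through ISO-8859-1
does not, and the forced conversion cannot be undone.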

> (e.g., ISO2022<->UTF8 does not preserve information).

Does anybody actually use (full) ISO2022?  In such a way that translating
to UTF8 would actually lose useful information?  My understanding is
that in practice people use various variants and subsets that are not
true 2022.  In any case, ISO2022 is not practical as an internal encoding.

	--Per Bothner
bothner@cygnus.com     http://www.cygnus.com/~bothner

References:
- Re: cpplib to-do list
  - From: Branko Cibej
