This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [PATCH] PR18785: Support non-native execution charsets

From: kaih at khms dot westfalen dot de (Kai Henningsen)
To: gcc-patches at gcc dot gnu dot org
Date: 23 Dec 2004 00:12:00 +0200
Subject: Re: [PATCH] PR18785: Support non-native execution charsets
Organization: Organisation? Me?! Are you kidding?
References: <Pine.LNX.4.44.0412220650040.8009-100000@www.eyesopen.com> <Pine.LNX.4.44.0412220650040.8009-100000@www.eyesopen.com> <87hdmeglfk.fsf@codesourcery.com>

zack@codesourcery.com (Zack Weinberg)  wrote on 22.12.04 in <87hdmeglfk.fsf@codesourcery.com>:

>   * The source character set: the encoding used by internal processing
>     in translation phases 1b-4 (1a is the conversion from input to
>     source character set).  This has several major constraints on it:
>
>       - It has to be a proper multibyte character set as C99 defines
>         that term (5.2.1.2p1).  It may NOT have a state-dependent
>         encoding.
>
>       - It has to be isomorphic to ISO 10646 (Unicode) so that \u, \U
>         escapes are meaningful.  (Because of this, the source
>         character set cannot be a single-byte encoding.)
>
>       - All characters within the basic source character set must have
>         the same code points that they do in ...
>
>   * The host character set: that is, the narrow execution character
>     set of the host machine.  At present this is always either ASCII
>     or EBCDIC, and we assume that whichever variant of EBCDIC is in
>     use does not alter the code points corresponding to the basic
>     source character set.

You do realize, I hope, that not all EBCDIC code pages assign consistent
code points to at least {}[] (and probably more)? That makes the "must have
the same code points" requirement rather hard to satisfy.
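
To make the point about EBCDIC code pages concrete, here is a minimal sketch
in Python. It assumes the two EBCDIC codecs cp037 and cp500 from the Python
standard library as stand-ins for "variants of EBCDIC"; the helper name
differing_chars and the choice of code pages are illustrative only and have
nothing to do with the patch itself. It lists the printable characters of the
basic source character set whose byte values differ between the code pages
compared.

```python
import string

# Printable members of the C basic source character set (letters, digits and
# the 29 graphic characters); whitespace and control characters are omitted.
BASIC_SOURCE_CHARS = (
    string.ascii_letters + string.digits + "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~"
)

# Assumption: two EBCDIC codecs shipped with CPython, chosen for illustration.
CODE_PAGES = ("cp037", "cp500")


def differing_chars(code_pages=CODE_PAGES, chars=BASIC_SOURCE_CHARS):
    """Map each character to its per-code-page bytes if they are not all equal."""
    result = {}
    for ch in chars:
        encodings = {cp: ch.encode(cp) for cp in code_pages}
        if len(set(encodings.values())) > 1:
            result[ch] = encodings
    return result


if __name__ == "__main__":
    for ch, encodings in differing_chars().items():
        row = ", ".join(f"{cp}=0x{b.hex()}" for cp, b in encodings.items())
        print(f"{ch!r}: {row}")
```

Which characters are reported depends entirely on the code pages passed in;
other EBCDIC codecs can be supplied through the code_pages argument to widen
the comparison.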

I still believe that rule is utterly misguided. Trying to use UTF-EBCDIC  
really is ALWAYS a mistake.

MfG Kai

Follow-Ups:
- Re: [PATCH] PR18785: Support non-native execution charsets
  - From: Zack Weinberg

References:
- [PATCH] PR18785: Support non-native execution charsets
  - From: Roger Sayle
- Re: [PATCH] PR18785: Support non-native execution charsets
  - From: Zack Weinberg
