This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Selectable execution character set (and a whole bunch of sideeffe
kaih@khms.westfalen.de (Kai Henningsen) writes:
>> >> Because it does exactly what I need: it's isomorphic to Unicode (so
>> >> there is no problem implementing \u escapes) and unibyte characters
>> >> match EBCDIC code points (so I don't have to replace every last
>> >> character constant in cpplib with a number, at great cost to
>> >> readability).
...
>> > However, wouldn't
>> > char c = L'A';
>> > work (in gcc) for getting 0x41 even with a local charset of EBCDIC?
>>
>> No. I believe that historical releases of GCC, configured for
>> i370-ibm-mvs or -oe, would produce the EBCDIC value of 'A' padded to
>> the width of wchar_t and then truncated back to 'char'.
>
> Ugh. That seems incredibly broken.

Arguably so, yes, but I cannot go back in time and change it.

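To make the pad-then-truncate behavior described above concrete, here is a minimal sketch in portable C. It only models what I believe the historical releases did; it is not taken from GCC's sources. The names target_wchar, widen_by_padding, truncate_to_char and model_char_from_wide_A are hypothetical, the 16-bit wide type is an assumption, and the value 0xC1 is the EBCDIC code point for 'A' in the common code pages (for example 037 and 1047).

```c
#include <assert.h>
#include <stdint.h>

/* 'A' is 0xC1 in EBCDIC (e.g. code pages 037 and 1047) and 0x41 in
   ASCII/Unicode. */
enum { EBCDIC_A = 0xC1, UNICODE_A = 0x41 };

/* Hypothetical stand-in for the target's wchar_t; 16 bits is an assumption. */
typedef uint16_t target_wchar;

/* Widen an execution-charset byte to the wide type by zero-padding it,
   with no conversion to any other encoding. */
target_wchar widen_by_padding(unsigned char exec_char)
{
    return (target_wchar)exec_char;
}

/* Convert back to the narrow type by keeping only the low 8 bits. */
unsigned char truncate_to_char(target_wchar wc)
{
    return (unsigned char)(wc & 0xFF);
}

/* Models `char c = L'A';` under the behavior described above: the result
   is the EBCDIC value, not 0x41. */
unsigned char model_char_from_wide_A(void)
{
    return truncate_to_char(widen_by_padding(EBCDIC_A));
}

/* Small self-check of the model. */
void check_model(void)
{
    assert(model_char_from_wide_A() == EBCDIC_A);
    assert(model_char_from_wide_A() != UNICODE_A);
}
```

Under this model, L'A' never passes through Unicode, so the round trip simply returns the EBCDIC byte it started from.
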
>> I do not know what encoding wchar_t is supposed to use on an EBCDIC
>> platform; I inquired of some people who ought to know, but never heard
>> back.
>
> I see no reason not to use Unicode there ... let me look at the '91 SAA
> CPI C Reference (Level 2). Hmm. They're being coy.

I got a message from Ulrich Weigand which pointed me to various
documents at http://publibfp.boulder.ibm.com/cgi-bin/bookmgr/library
(look for "z/OS: C/C++ Programming Guide") with which I can confirm
that wchar_t uses locale-specific 16-bit encodings. I don't see any
evidence that Unicode is used.

Getting all this stuff absolutely right is a lot of work which I am
not particularly motivated to do. What I've done will do
approximately the right thing, provided that the system library
cooperates, and I will worry about the remainder of the issues if and
when they come up.

zw