This is the mail archive of the
libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
Re: [PATCH] Preliminary fix for codecvt_members_unicode_wchar_t
- From: Paolo Carlini <pcarlini at unitus dot it>
- To: Benjamin Kosnik <bkoz at redhat dot com>
- Cc: libstdc++ at gcc dot gnu dot org
- Date: Tue, 26 Mar 2002 00:05:18 +0100
- Subject: Re: [PATCH] Preliminary fix for codecvt_members_unicode_wchar_t
- References: <Pine.SOL.3.91.1020325141122.17325A-100000@taarna.cygnus.com>
Benjamin Kosnik wrote:
[snip]
Thank you very much for these additional details.
Over the next few days I will try to learn more about all of this myself,
starting from these notes.
>So, in summary, it looks like this is the deal, even if this directly
>contradicts my earlier email.
>
>1) UCS4, UCS2 need a byte-order marker (bom) to indicate endianness.
>If there is no bom, then the encodings assume native byte order. This varies
>per machine, as has been found out with the x86/powerpc divergence.
>
Ah! Ok. Now this is much clearer. It also explains the different kind of
problem shown on powerpc/s390 by the wchar_t test vs. the char test.
>2) UCS4-BE, UCS2-BE should not need a bom to indicate endianness, as it
>is explicitly specified.
>
I see.
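
To make points 1) and 2) concrete for myself, here is a minimal sketch of the
rule as described above. It is not the iconv or libstdc++ implementation, and
all names in it (ByteOrder, native_byte_order, read_u32, decode_ucs4,
decode_ucs4_be) are hypothetical, made up only for illustration. It assumes
that plain "UCS4" input honours a leading U+FEFF and otherwise falls back to
the byte order of the host, while "UCS4-BE" input is always read big-endian.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper type, not part of any library.
enum class ByteOrder { Big, Little };

// Report the byte order of the machine running this code.
inline ByteOrder native_byte_order() {
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);  // look at the lowest-addressed byte
    return first == 1 ? ByteOrder::Little : ByteOrder::Big;
}

// Assemble one 32-bit code unit from four bytes in the given order.
inline std::uint32_t read_u32(const unsigned char* p, ByteOrder order) {
    if (order == ByteOrder::Big)
        return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
               (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
    return (std::uint32_t(p[3]) << 24) | (std::uint32_t(p[2]) << 16) |
           (std::uint32_t(p[1]) << 8)  |  std::uint32_t(p[0]);
}

// Plain "UCS4" (assumed behaviour): a leading BOM selects the byte order
// and is consumed; without a BOM the native byte order is used.
inline std::vector<std::uint32_t> decode_ucs4(const unsigned char* data,
                                              std::size_t len) {
    ByteOrder order = native_byte_order();
    std::size_t pos = 0;
    if (len >= 4) {
        if (read_u32(data, ByteOrder::Big) == 0xFEFFu) {
            order = ByteOrder::Big;
            pos = 4;
        } else if (read_u32(data, ByteOrder::Little) == 0xFEFFu) {
            order = ByteOrder::Little;
            pos = 4;
        }
    }
    std::vector<std::uint32_t> out;
    for (; pos + 4 <= len; pos += 4)
        out.push_back(read_u32(data + pos, order));
    return out;
}

// "UCS4-BE": the byte order is fixed by the encoding name, so no BOM is
// needed and the result does not depend on the host.
inline std::vector<std::uint32_t> decode_ucs4_be(const unsigned char* data,
                                                 std::size_t len) {
    std::vector<std::uint32_t> out;
    for (std::size_t pos = 0; pos + 4 <= len; pos += 4)
        out.push_back(read_u32(data + pos, ByteOrder::Big));
    return out;
}
```

With this sketch, the four bytes 00 00 00 41 passed to decode_ucs4 without a
bom come out as U+0041 on a big-endian host but as 0x41000000 on x86, which
would be the kind of per-machine divergence mentioned in 1). The same bytes
passed to decode_ucs4_be give U+0041 everywhere.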
>I hope this helps explain the situation. If I'm wrong, please let me know
>and I'll try to confuse the situation some more. I realize this sounds
>really complicated at the moment. Writing docs that explain this is on my
>TODO list for May.
>
Great! At the moment, to the best of my knowledge, there are not many clear
explanations available...
By the way, have you had a look at my "consistency" fix for
collate_byname.cc? It is not strictly needed, but it makes those tests
consistent with the current collate_members_xx.cc.
Ciao, Paolo.