This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
Re: c/3804: Extended ASCII "wide" characters not behaving with UTF-8 locale
- To: Alex Eulenberg <alex at rent-a-mind dot com>
- Subject: Re: c/3804: Extended ASCII "wide" characters not behaving with UTF-8 locale
- From: Neil Booth <neil at daikokuya dot demon dot co dot uk>
- Date: Wed, 25 Jul 2001 22:07:24 +0100
- Cc: gcc-bugs at gcc dot gnu dot org
Alex Eulenberg wrote:-
> Second item: According to the Sun compiler documentation
> http://docs.sun.com/htmlcoll/coll.33.7/iso-8859-1/CUG/tguide.html#785, for
> all ANSI/ISO Compilers, "When the compilation system encounters a wide
> character constant or wide string literal, each multibyte character is
> converted into a wide character, as if by calling the mbtowc() function."
and the sentence should finish "with an implementation-defined current
locale". We don't want to restrict ourselves to a single locale for a
translation unit, since a file can include headers for which a
different locale might be appropriate.
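
As an illustration of the conversion being discussed, here is a minimal
sketch of how mbtowc() turns one multibyte character into a wide
character under whatever LC_CTYPE locale is current. The helper name
first_wide_char is hypothetical and is not part of GCC or of the Sun
documentation; the sketch only shows that the same bytes can convert
differently, or fail to convert, depending on the locale in effect.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert the first multibyte character of s to a
   wide character using the current LC_CTYPE locale.  Returns the number
   of bytes consumed, 0 if s is the empty string, or -1 if the leading
   bytes do not form a valid character in that locale. */
int first_wide_char(const char *s, wchar_t *out)
{
    assert(s != NULL && out != NULL);
    (void) mbtowc(NULL, NULL, 0);          /* reset any shift state */
    return mbtowc(out, s, strlen(s) + 1);  /* +1 so the NUL is visible */
}
```

After setlocale(LC_ALL, "C"), calling first_wide_char("A", &wc) consumes
one byte and stores L'A'. For a string starting with bytes above 0x7F,
such as "\xC3\xA9", the outcome depends on which locale setlocale()
managed to select, which is the implementation-defined part.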
We have not decided exactly what we will do, but it is looking likely
that we will assume each source file is UTF-8 unless otherwise
indicated. "Otherwise indicated" might be with a magic comment at the
top of the file, as Emacs does.
Neil.
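
The "magic comment" idea above could be sketched roughly as follows.
This is not GCC code and does not describe what GCC ended up doing; the
function name source_encoding is hypothetical, and the sketch assumes
an Emacs-style "-*- coding: NAME -*-" marker on the first line of the
file. As a simplification it only looks for the "coding:" key and does
not verify the surrounding "-*-" delimiters. When no marker is found it
falls back to UTF-8, matching the default proposed in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: look for "coding: NAME" in the first line of src
   and copy NAME into buf.  If the first line has no such marker, or the
   name is empty, buf receives the default "UTF-8". */
void source_encoding(const char *src, char *buf, size_t buflen)
{
    static const char key[] = "coding:";
    const size_t keylen = sizeof key - 1;
    assert(src != NULL && buf != NULL && buflen > 0);

    size_t line_len = strcspn(src, "\n");   /* scan the first line only */
    const char *end = src + line_len;
    const char *p = NULL;
    size_t n = 0;

    /* Find the key within the first line. */
    for (size_t i = 0; i + keylen <= line_len; i++) {
        if (strncmp(src + i, key, keylen) == 0) {
            p = src + i + keylen;
            break;
        }
    }

    if (p != NULL) {
        while (p < end && (*p == ' ' || *p == '\t'))
            p++;                            /* skip blanks after ':' */
        /* The name runs until a blank, ';' or '*', or the buffer fills. */
        while (p + n < end && n + 1 < buflen &&
               p[n] != ' ' && p[n] != '\t' && p[n] != ';' && p[n] != '*')
            n++;
    }

    if (n == 0) {                           /* no marker: use the default */
        strncpy(buf, "UTF-8", buflen - 1);
        buf[buflen - 1] = '\0';
        return;
    }
    memcpy(buf, p, n);
    buf[n] = '\0';
}
```

Given a file whose first line is "/* -*- coding: iso-8859-1 -*- */",
the buffer receives "iso-8859-1". A marker that appears on any later
line is ignored and the result is "UTF-8".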