This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: thoughts on martin's proposed patch for GCC and UTF-8
- To: rms at gnu dot org
- Subject: Re: thoughts on martin's proposed patch for GCC and UTF-8
- From: Zack Weinberg <zack at rabi dot columbia dot edu>
- Date: Wed, 23 Dec 1998 21:11:00 -0500
- cc: zack at rabi dot columbia dot edu, amylaar at cygnus dot co dot uk, martin at mira dot isdn dot cs dot tu-berlin dot de, gcc2 at gnu dot org, egcs at cygnus dot com
On Wed, 23 Dec 1998 18:16:42 -0700 (MST), Richard Stallman wrote:
>The idea of translating everything into UTF-8 is not useful for C.
>
>It is pointless and mistaken to translate symbols to UTF-8. The
>assembler won't accept them in UTF-8, and users who use other
>encodings wouldn't want them in UTF-8 anyway.
I think you may have missed a few things. gas has no problem with
symbols in UTF-8 (I am told). ASCII <-> UTF-8 is a no-op, and gcc
does not currently accept non-ASCII identifiers, so no existing code
will be broken by the change. Converting all symbol names to UTF-8 is
desirable for all languages for two reasons. First, Java requires
this and we want to be able to link modules written in Java with
modules written in any other language supported by gcc. Second, we
want to be able to link modules written in encoding X with other
modules in encoding Y. One-way translation of all identifiers to UTF-8
achieves this.
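A minimal sketch of those two points, using Python's built-in codecs as a
stand-in for whatever conversion the compiler would perform. The helper
`to_utf8` and the sample identifiers are hypothetical, chosen only for
illustration; nothing here is taken from the proposed patch.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode an identifier from its source encoding and re-encode it as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# Point 1: for pure-ASCII identifiers the conversion is a no-op,
# so existing ASCII-only code produces exactly the same symbol bytes.
ascii_name = b"main_loop"
ascii_unchanged = to_utf8(ascii_name, "ascii") == ascii_name

# Point 2: the same identifier written in two different source encodings
# has different raw bytes, but one-way translation to UTF-8 gives both
# modules the same symbol name, so they can be linked together.
identifier = "größe"
raw_latin1 = identifier.encode("latin-1")      # module written in encoding X
raw_macroman = identifier.encode("mac_roman")  # module written in encoding Y

raw_bytes_differ = raw_latin1 != raw_macroman
symbols_match = to_utf8(raw_latin1, "latin-1") == to_utf8(raw_macroman, "mac_roman")

print(ascii_unchanged, raw_bytes_differ, symbols_match)
```

The conversion only works if the source encoding of each module is known
at compile time; the sketch passes it in explicitly rather than guessing.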
>It is pointless and buggy to translate strings to UTF-8 and then
>translate them back. As Handa pointed out, it's impossible to
>translate them back.
No argument here.
zw