This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: thoughts on martin's proposed patch for GCC and UTF-8
- To: bothner at cygnus dot com
- Subject: Re: thoughts on martin's proposed patch for GCC and UTF-8
- From: Paul Eggert <eggert at twinsun dot com>
- Date: Mon, 21 Dec 1998 19:43:51 -0800 (PST)
- CC: rms at gnu dot org, amylaar at cygnus dot co dot uk, martin at mira dot isdn dot cs dot tu-berlin dot de, gcc2 at gnu dot org, egcs at cygnus dot com
- References: <199812220245.SAA05358@cygnus.com>
Date: Mon, 21 Dec 1998 18:45:09 -0800
From: Per Bothner <bothner@cygnus.com>
> Yes, we could have auto-detection for C but not Java,
> but that does seem rather clumsy.
It would be nice to use the same method for all languages, yes.
This is a good argument against autodetection.
> libc should be written in UTF-8, but an
> application may be written in a local character set.
libc's identifiers use only the "C" subset of ASCII, and therefore
libc will link to an application written in any locale, even if we use
the native multibyte encoding for identifiers.
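This can be checked directly: an identifier drawn from the "C" subset of ASCII encodes to the same byte sequence in every ASCII-compatible encoding, so object files produced under different locales still agree on the symbol name. A minimal sketch (the list of encodings is illustrative, not exhaustive):

```python
# An ASCII-only identifier, as libc uses for its exported symbols.
ident = "strlen"

# ASCII-compatible encodings an application might be written in.
# (Illustrative list; any ASCII superset behaves the same way.)
encodings = ["ascii", "latin-1", "utf-8", "euc-jp", "shift_jis"]

encoded = {enc: ident.encode(enc) for enc in encodings}

# Every encoding yields the identical byte sequence, which is why
# libc links against applications written in any of these locales.
assert len(set(encoded.values())) == 1
```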
> Given that [.o] symbols have to be in a common character encoding,
> it follows that you cannot possibly do autodetection, at least not
> for identifiers.
I don't see how this follows. The compiler could use autodetection to
discover the input character set, and then translate the identifiers'
characters to UTF-8 when outputting assembly language.
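The translation step described above amounts to a decode/re-encode pass: once the input character set is known (whether by autodetection or a compiler flag), each identifier's bytes are decoded from that set and re-emitted as UTF-8 when the symbol is written out. A hypothetical sketch, with the source character set taken as given rather than autodetected:

```python
def identifier_to_utf8(ident_bytes: bytes, source_charset: str) -> bytes:
    """Translate an identifier from the input character set to UTF-8,
    the common encoding assumed here for symbols in object files.
    (source_charset would come from autodetection or a user option.)"""
    return ident_bytes.decode(source_charset).encode("utf-8")

# A Latin-1 source file containing the identifier "naïve":
# 0xEF is "ï" in Latin-1; in UTF-8 it becomes the pair 0xC3 0xAF.
latin1_ident = b"na\xefve"
assert identifier_to_utf8(latin1_ident, "latin-1") == b"na\xc3\xafve"
```

The key property is that the object-file symbol no longer depends on the locale of the source file, which is exactly what a common encoding for `.o` symbols requires.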