This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: thoughts on martin's proposed patch for GCC and UTF-8

To: martin at mira dot isdn dot cs dot tu-berlin dot de
Subject: Re: thoughts on martin's proposed patch for GCC and UTF-8
From: Ian Lance Taylor <ian at cygnus dot com>
Date: Thu, 10 Dec 1998 10:57:10 -0500
CC: eggert at twinsun dot com, brolley at cygnus dot com, gcc2 at gnu dot org, egcs at cygnus dot com

   Date: Thu, 10 Dec 1998 08:12:20 +0100
   From: Martin von Loewis <martin@mira.isdn.cs.tu-berlin.de>

   > If the object-code standard is to use UTF-8 names, then I suppose the
   > assembler can convert to UTF-8.

   No. The gas people made it very clear that they consider character sets
   somebody else's problems (i.e. ours).

That is too strong.  For hand-coded assembler, I can see that there
may be a need for gas to do some character set conversions.  Also, if
it is ever possible for an identifier name to include a byte value
which gas will consider to be an operator, then it is clearly
necessary for gas to permit quoting that byte value, and perhaps to do
more general character set conversions.

In general, though, if gcc needs to understand character set issues,
which appears to be the case, and if it can emit identifiers in a
manner which will not confuse gas, then I think it is reasonable for
gcc to emit identifiers as uninterpreted byte sequences, and for gas
to simply pass those identifiers straight through into the object
file.
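
A minimal sketch of the "will not confuse gas" condition, under stated assumptions: the operator set below is hypothetical (the real set depends on the assembler and target), and `emit_identifier` and `confusing_bytes` are illustrative names, not part of gcc or gas. It shows that UTF-8 encodes every non-ASCII character using only bytes of 0x80 and above, so such an identifier cannot contain an ASCII operator byte, while an encoding such as Shift-JIS can place an ASCII byte (here the backslash, 0x5C) inside a multibyte character.

```python
# Hypothetical set of ASCII bytes an assembler might treat as operators or
# separators. This is an assumption for illustration, not gas's actual table.
ASSUMED_OPERATOR_BYTES = set(b"+-*/%&|^~!<>()[]{},:;#@\\ \t\n\"'")


def emit_identifier(name, encoding="utf-8"):
    """Encode an identifier and return it as an uninterpreted byte sequence."""
    return name.encode(encoding)


def confusing_bytes(ident):
    """Return the bytes of ident that fall in the assumed operator set."""
    return [b for b in ident if b in ASSUMED_OPERATOR_BYTES]


# A non-ASCII identifier in UTF-8: every byte of a multibyte character is
# >= 0x80, so none of them can match an ASCII operator.
utf8_ident = emit_identifier("größe_Δ")
print(utf8_ident.hex(" "), confusing_bytes(utf8_ident))

# The same check on Shift-JIS: the second byte of this katakana character
# is 0x5C, the ASCII backslash, which is the quoting problem described above.
sjis_ident = emit_identifier("ソ", encoding="shift_jis")
print(sjis_ident.hex(" "), confusing_bytes(sjis_ident))
```

If the compiler emits only UTF-8, the check never fires for non-ASCII characters; for other encodings some quoting mechanism in the assembler would still be needed.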

I can't claim to understand many of the issues here, though.

Several people have mentioned the linker as an issue.  To the best of
my knowledge, the linker will permit any byte value except 0 to appear
in an identifier.  I don't see why the linker has to change at all for
any character set issues.
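
The point about byte value 0 can be illustrated with a simplified string table of the kind object-file formats such as ELF use, where each name is stored as a NUL-terminated byte string. This is only a sketch: `build_string_table` and `lookup` are hypothetical helpers, not the linker's actual code, and the table layout is reduced to the bare minimum.

```python
def build_string_table(names):
    """Pack byte-string names into one NUL-terminated table.

    Returns the table and a mapping from each name to its offset.
    """
    table = bytearray(b"\0")  # offset 0 holds the empty name
    offsets = {}
    for name in names:
        if 0 in name:
            # 0 is the terminator, so it is the one byte a name cannot hold.
            raise ValueError("byte value 0 cannot appear in a symbol name")
        offsets[name] = len(table)
        table += name + b"\0"
    return bytes(table), offsets


def lookup(table, offset):
    """Read the name starting at offset, up to the next 0 byte."""
    end = table.index(0, offset)
    return table[offset:end]


names = [b"main", "größe".encode("utf-8"), b"\xff\xfe\x80"]
table, offsets = build_string_table(names)
for name in names:
    # Names are compared byte for byte; no character set is ever consulted.
    assert lookup(table, offsets[name]) == name
```

Nothing in the table handling interprets the bytes as characters, which is why no change is required for UTF-8 or any other encoding that avoids 0.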

Ian

Follow-Ups:
- Re: thoughts on martin's proposed patch for GCC and UTF-8
  - From: Martin von Loewis
- Re: thoughts on martin's proposed patch for GCC and UTF-8
  - From: Paul Eggert
- Re: thoughts on martin's proposed patch for GCC and UTF-8
  - From: Ken Raeburn

References:
- Re: thoughts on martin's proposed patch for GCC and UTF-8
  - From: Martin von Loewis
