This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: revised proposal for GCC and non-Ascii source files

To: zack at rabi dot columbia dot edu
Subject: Re: revised proposal for GCC and non-Ascii source files
From: "Martin v. Loewis" <martin at mira dot isdn dot cs dot tu-berlin dot de>
Date: Tue, 5 Jan 1999 00:05:45 +0100
CC: eggert at twinsun dot com, rms at gnu dot org, bothner at cygnus dot com, amylaar at cygnus dot co dot uk, gcc2 at gnu dot org, egcs at cygnus dot com
References: <199901042115.QAA13627@rabi.phys.columbia.edu>

> This raises the issue of how we tell native extended character X from native
> ASCII character %.  I'm beginning to suspect we need the more general locale
> information, not just the charset.

Paul suggested that, when using the multibyte functions, you can test
whether you are in the initial state. If that is true, and if we assume
that the initial state is ASCII (as mandated by the C and C++
standards), we have a test for whether a single byte we just saw really
is from the base character set.
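
A minimal sketch of that test, assuming the standard C95 restartable
functions mbrtowc() and mbsinit() are available and that the locale's
conversion state tracks shift sequences. The helper name
count_base_bytes is hypothetical; it counts the bytes that decode as
one-byte characters while the state is initial both before and after
the conversion. It is an illustration, not a proposed cpplib change.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the bytes in buf[0..len) that are
   single-byte characters read while in the initial shift state.  */
size_t count_base_bytes(const char *buf, size_t len)
{
    mbstate_t st;
    size_t pos = 0, count = 0;

    assert(buf != NULL);
    memset(&st, 0, sizeof st);          /* zeroed state == initial state */

    while (pos < len) {
        int was_initial = mbsinit(&st) != 0;
        wchar_t wc;
        size_t n = mbrtowc(&wc, buf + pos, len - pos, &st);

        if (n == (size_t)-1 || n == (size_t)-2)
            break;                      /* invalid or truncated sequence */
        if (n == 0)
            n = 1;                      /* embedded NUL occupies one byte */

        /* One byte consumed, initial state before and after.  */
        if (was_initial && n == 1 && mbsinit(&st))
            count++;
        pos += n;
    }
    return count;
}
```

In the default "C" locale every byte of a plain ASCII line such as
"int x;" passes the test; in a stateful encoding, bytes seen after a
shift sequence would not.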

Furthermore, the standards require that we are in the initial state
after each identifier. So we have good reason to reject input such as

<escape>printf<funny characters>("Hello world");<escape to ASCII>

The escape back to ASCII must occur right after <funny characters>,
and should probably count as part of the identifier.
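
The rule could be checked roughly as follows, again assuming
mbrtowc() and mbsinit(). The function name ends_in_initial_state is
hypothetical, and the sketch takes the identifier's bytes as already
delimited by the lexer, including any trailing escape back to ASCII.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical check: return 1 if the token tok[0..len) decodes
   cleanly and leaves the conversion state initial, 0 otherwise.  */
int ends_in_initial_state(const char *tok, size_t len)
{
    mbstate_t st;
    size_t pos = 0;

    assert(tok != NULL);
    memset(&st, 0, sizeof st);          /* tokens must also begin initial */

    while (pos < len) {
        size_t n = mbrtowc(NULL, tok + pos, len - pos, &st);

        if (n == (size_t)-1 || n == (size_t)-2)
            return 0;                   /* invalid or incomplete sequence */
        if (n == 0)
            n = 1;                      /* embedded NUL occupies one byte */
        pos += n;
    }
    return mbsinit(&st) != 0;
}
```

A lexer would reject an identifier for which this returns 0, which is
the case for the example above if the token is cut off before the
escape back to ASCII.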

Regards,
Martin

References:
- Re: revised proposal for GCC and non-Ascii source files
  - From: Zack Weinberg
