This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: revised proposal for GCC and non-Ascii source files
- To: zack at rabi dot columbia dot edu
- Subject: Re: revised proposal for GCC and non-Ascii source files
- From: "Martin v. Loewis" <martin at mira dot isdn dot cs dot tu-berlin dot de>
- Date: Tue, 5 Jan 1999 00:05:45 +0100
- CC: eggert at twinsun dot com, rms at gnu dot org, bothner at cygnus dot com, amylaar at cygnus dot co dot uk, gcc2 at gnu dot org, egcs at cygnus dot com
- References: <199901042115.QAA13627@rabi.phys.columbia.edu>
> This raises the issue of how we tell native extended character X from native
> ASCII character %. I'm beginning to suspect we need the more general locale
> information, not just the charset.
Paul suggested that, when using the multibyte functions, you can test
whether you are in the initial shift state. If that is true, and if we
assume that the initial state is ASCII (as mandated by the C and C++
standards), we have a test for whether a single byte we just saw
really is from the base character set.
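
A minimal sketch of that test, assuming the standard <wchar.h> restartable
functions mbrtowc() and mbsinit() are available. The helper name
count_initial_state_bytes is hypothetical and not part of GCC; it counts
the bytes that were read as single-byte characters while the conversion
state was initial both before and after the read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not GCC code: count the bytes in s[0..n) that are
   single-byte characters read while the conversion state is initial.
   Returns (size_t)-1 on an invalid or truncated multibyte sequence. */
static size_t count_initial_state_bytes(const char *s, size_t n)
{
    mbstate_t st;
    size_t i = 0, count = 0;

    assert(s != NULL);
    memset(&st, 0, sizeof st);      /* a zeroed mbstate_t is the initial state */

    while (i < n) {
        int was_initial = mbsinit(&st) != 0;
        wchar_t wc;
        size_t len = mbrtowc(&wc, s + i, n - i, &st);

        if (len == (size_t)-1 || len == (size_t)-2)
            return (size_t)-1;      /* invalid or incomplete sequence */
        if (len == 0)
            len = 1;                /* an embedded NUL consumes one byte */

        /* One byte consumed, initial state before and after: by the
           assumption above, this byte is from the base character set. */
        if (was_initial && len == 1 && mbsinit(&st))
            count++;
        i += len;
    }
    return count;
}
```

The result depends on the current LC_CTYPE locale. In the default "C"
locale there are no shift states, so every byte of a plain ASCII buffer
such as "a%b" is counted.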
Furthermore, the standards require that we are in the initial state
after each identifier. So we have good reason to reject
<escape>printf<funny characters>("Hello world");<escape to ASCII>
The escape back to ASCII must occur right after <funny characters>,
and should probably count as part of the identifier.
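
As a rough illustration of the rule, a lexer could decode the bytes it
has collected for an identifier and then ask whether the conversion ended
in the initial state. The function name below is hypothetical, and the
sketch assumes only the standard mbrtowc() and mbsinit(); how a trailing
shift sequence is reported may vary between implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical check, not GCC code: decode the n bytes of a candidate
   identifier spelling and report how decoding ended.
   Returns 1 if it ended in the initial state, 0 if a shift state is
   still pending, and -1 if the bytes are invalid or incomplete. */
static int identifier_ends_in_initial_state(const char *id, size_t n)
{
    mbstate_t st;
    size_t i = 0;

    assert(id != NULL);
    memset(&st, 0, sizeof st);      /* start from the initial state */

    while (i < n) {
        /* A null first argument discards the decoded character;
           only the state transitions matter here. */
        size_t len = mbrtowc(NULL, id + i, n - i, &st);

        if (len == (size_t)-1 || len == (size_t)-2)
            return -1;
        i += (len == 0) ? 1 : len;
    }
    return mbsinit(&st) != 0;
}
```

Under this sketch, the span passed in would have to include the escape
back to ASCII for the check to succeed, which matches counting that
escape as part of the identifier.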
Regards,
Martin