This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: revised proposal for GCC and non-Ascii source files



I only have a couple of comments:

- C9x CD2 unambiguously says \u and \U escapes are to be treated as Unicode. 
It also disallows these escapes for certain ranges of Unicode which
encompass all of 7-bit ASCII.  That being so, I propose to encode \u and \U
in UTF-8 always.  This can be done regardless of the availability of
translation libraries.  Assuming cpp and cc1 will take any character with
the high bit set in an identifier, we need only add parsing support to cpp
to make this work.

The only issue is unification of a \u escape for symbol X with the same
symbol natively represented in the input encoding.  I'm not sure what the
right way to deal with that is.
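
For illustration only, here is a minimal sketch of the encoding step, assuming
the modern UTF-8 definition (at most 4 bytes, code points up to U+10FFFF, no
surrogates).  The names ucn_to_utf8 and ucn_matches_native are hypothetical
and not part of cpp or cpplib.  The second helper shows the easy case of the
unification problem: when the input encoding is itself UTF-8, the escape and
the native character come out as the same bytes.  Any other input encoding
would need a conversion step first, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: encode the code point named by a \u or \U escape
   as UTF-8.  Writes at most 4 bytes to OUT and returns the byte count,
   or 0 if CP is a surrogate or lies beyond U+10FFFF.  Rejecting the
   ranges the standard forbids in escapes (such as 7-bit ASCII) is left
   to the caller.  */
size_t
ucn_to_utf8 (unsigned long cp, unsigned char *out)
{
  assert (out != NULL);

  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;			/* surrogates are not characters */
  if (cp < 0x10000)
    {
      out[0] = (unsigned char) (0xE0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (cp & 0x3F));
      return 3;
    }
  if (cp <= 0x10FFFF)
    {
      out[0] = (unsigned char) (0xF0 | (cp >> 18));
      out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
      out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[3] = (unsigned char) (0x80 | (cp & 0x3F));
      return 4;
    }
  return 0;
}

/* Hypothetical helper: nonzero if the escape for CP spells the same
   bytes as NATIVE, which is assumed to already be UTF-8.  */
int
ucn_matches_native (unsigned long cp, const unsigned char *native, size_t len)
{
  unsigned char buf[4];
  size_t n = ucn_to_utf8 (cp, buf);

  return n != 0 && n == len && memcmp (buf, native, n) == 0;
}
```

With these, \u00e9 in an identifier would be stored as the two bytes C3 A9,
which is exactly what a UTF-8 source file contains for the same letter.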

- #pragma charset is easily implementable in cpplib (not sure about cccp)
provided we accept constraints on #pragma/_Pragma().  I posted some lengthy
discussion of this last week, but to sum up: pragmas affecting the preprocessor
(which this is) cannot be expressed with _Pragma() at all, and neither
#pragma nor _Pragma() can appear in a position that is inconvenient to the
parser -- I think that will translate to "must look like a C
statement-or-declaration".

zw

