This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: thoughts on martin's proposed patch for GCC and UTF-8


> Here is another possibility.  For identifier chars that can be
> expressed as multibyte chars in the locale's encoding, use those
> chars; otherwise, use `.uxxxx' or `.Uxxxxxxxx' where xxxx (or
> xxxxxxxx) are the Unicode position.

[...]

> I don't know how this would affect C++ mangling, though.

This won't work for C++. Consider

class Foo{
        static int u1234;
};

This currently compiles into _3Foo.u1234. With your proposal, the
symbol _3Foo.u1234.u1234 would be ambiguous: it could be either
Foo\u1234::u1234 or Foo::u1234\u1234.
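
To make the clash concrete, here is a toy sketch. It is not g++'s real
mangler: encode_identifier and toy_mangle_static_member are hypothetical
names, the length prefix is deliberately left out, and an ASCII-only
locale is assumed, so every non-ASCII char gets the `.uxxxx' form. It
only shows that the `.' joining class and member cannot be told apart
from the `.' that starts an escape.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: encode one identifier as in the quoted proposal,
// assuming an ASCII-only locale (every non-ASCII char is escaped).
std::string encode_identifier(const std::u32string& id) {
    std::string out;
    for (char32_t c : id) {
        if (c < 0x80) {
            out += static_cast<char>(c);   // plain ASCII is kept as-is
        } else {
            char buf[16];
            if (c <= 0xFFFF)
                std::snprintf(buf, sizeof buf, ".u%04x", static_cast<unsigned>(c));
            else
                std::snprintf(buf, sizeof buf, ".U%08x", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Toy model of a static member symbol: class and member joined by '.'.
// Assumption: the length prefix of the real mangled name is omitted.
std::string toy_mangle_static_member(const std::u32string& cls,
                                     const std::u32string& member) {
    return "_" + encode_identifier(cls) + "." + encode_identifier(member);
}

// Two different declarations that end up with the same toy symbol.
void demo_collision() {
    std::string a = toy_mangle_static_member(U"Foo\u1234", U"u1234");
    std::string b = toy_mangle_static_member(U"Foo", U"u1234\u1234");
    assert(a == b);   // both are "_Foo.u1234.u1234"
}
```

In this model, Foo\u1234::u1234 and Foo::u1234\u1234 both come out as
_Foo.u1234.u1234, which is the same kind of collision as above.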

If people don't like always converting Unicode identifiers to UTF-8, I
drop that proposal with regrets. It would work on assemblers that
support 8-bit characters in identifiers, it would work for C and C++,
and it would work independently of compile-time or runtime settings
(identifiers are *not* affected by the user's locale whatsoever).

Anyway, I drop that proposal. There is a proposed mangling for \u
escapes in C++ in gxxint.texi. It works for all cases and for all
assemblers, giving plain text in identifiers. It doesn't work for C,
but after this discussion, I guess I don't care about that anymore.
Somebody just tell me how it should work for C.

Kind regrets,
Martin

