This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: thoughts on martin's proposed patch for GCC and UTF-8
> Here is another possibility. For identifier chars that can be
> expressed as multibyte chars in the locale's encoding, use those
> chars; otherwise, use `.uxxxx' or `.Uxxxxxxxx' where xxxx (or
> xxxxxxxx) are the Unicode position.
[...]
> I don't know how this would affect C++ mangling, though.
This won't work for C++. Consider:
  class Foo{
    static int u1234;
  };
This currently compiles into _3Foo.u1234. With your proposal, the
mangled name _3Foo.u1234.u1234 would be ambiguous: it could be either
Foo\u1234::u1234 or Foo::u1234\u1234.
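The collision can be reproduced with a small sketch. It assumes a deliberately simplified scheme: class and member are joined with '.', and the length prefix that g++ emits is left out. The helper names escape_code_point and mangle_static_member are hypothetical, made up for illustration; they are not GCC functions.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: spell a non-ASCII code point as `.uxxxx`
// (or `.Uxxxxxxxx` above U+FFFF), as in the quoted proposal.
std::string escape_code_point(std::uint32_t cp) {
    char buf[16];
    if (cp <= 0xFFFF)
        std::snprintf(buf, sizeof buf, ".u%04x", static_cast<unsigned>(cp));
    else
        std::snprintf(buf, sizeof buf, ".U%08x", static_cast<unsigned>(cp));
    return buf;
}

// Hypothetical, simplified mangling of a static member: the real g++
// scheme also emits a length prefix, which this sketch omits.
std::string mangle_static_member(const std::string& cls,
                                 const std::string& member) {
    return "_" + cls + "." + member;
}

// Foo\u1234::u1234 -- the escape belongs to the class name.
std::string mangled_a() {
    return mangle_static_member("Foo" + escape_code_point(0x1234), "u1234");
}

// Foo::u1234\u1234 -- the escape belongs to the member name.
std::string mangled_b() {
    return mangle_static_member("Foo", "u1234" + escape_code_point(0x1234));
}
```

Under these assumptions both functions return the same string, because '.' serves as both the class/member separator and the escape introducer, and an ordinary identifier may itself be spelled u1234.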
If people don't like always converting Unicode identifiers to UTF-8, I
drop that proposal with regrets. It would work on assemblers that
support 8-bit characters in identifiers, it would work for C and C++,
and it would work independently of compile-time or run-time settings
(identifiers are *not* affected by the user's locale whatsoever).
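For reference, "always UTF-8" means that the bytes emitted for an identifier character depend only on its code point. A minimal sketch, assuming the input is a valid Unicode scalar value (no range or surrogate checks); utf8_encode is a hypothetical name, not something in the patch:

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encode one code point as UTF-8 bytes.
// Assumes cp is a valid scalar value; performs no validation.
std::string utf8_encode(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte:  0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

With this, \u1234 in an identifier would always come out as the three bytes E1 88 B4, whatever the locale. None of those bytes is an ASCII character, so they cannot be confused with the rest of a mangled name.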
Anyway, I drop that proposal. There is a proposed mangling for \u
escapes in C++ in gxxint.texi. It works for all cases and for all
assemblers, giving plain text in identifiers. It doesn't work for C,
but after this discussion, I guess I don't care about that anymore.
Somebody just tell me how it should work for C.
Kind regrets,
Martin