This is the mail archive of the
mailing list for the Java project.
Re: Unicode mangling (was Re: [PATCH] Java: New C++ ABI compatibility changes.)
> I wish there was a way to handle this only by modifying the names
> themselves, without the leading U, but I can't think of a non-ambiguous
> escape sequence we could use on all targets. Well, actually, I suppose we
> could use '__', as you were suggesting in your earlier mail; since all
> identifiers containing '__' are reserved to the implementation, we wouldn't
> have to worry about violating the (multi-vendor) ABI.
I'd recommend against using just a leading '__' to indicate that
the name contains encoded Unicode characters. While the standard does
reserve all such names to the implementation, our implementation
shares this name space with whatever C library we use. Of course,
whatever prefix you choose, a conflict is still possible. But it
would be better to choose a prefix that is less likely to be used by
the C library, and to implement it so that it is easy to change for
a particular library.
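
To make the "easy to change" point concrete, here is a minimal sketch
in C of a prefix that a port could override at build time. The macro
UNICODE_MANGLE_PREFIX, its placeholder default "__U", and both helper
functions are hypothetical names invented for illustration; nothing
here is taken from gcc itself.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical default. A target whose C library already uses this
   prefix could override it, e.g. with -DUNICODE_MANGLE_PREFIX='"_Zu"'. */
#ifndef UNICODE_MANGLE_PREFIX
#define UNICODE_MANGLE_PREFIX "__U"
#endif

/* Returns nonzero if name starts with the configured prefix. */
static int has_unicode_prefix(const char *name)
{
    return strncmp(name, UNICODE_MANGLE_PREFIX,
                   strlen(UNICODE_MANGLE_PREFIX)) == 0;
}

/* Writes prefix + encoded into out (capacity outsz, including the
   terminating NUL). Returns the length written, or -1 if out is
   too small. */
static int add_unicode_prefix(const char *encoded, char *out, size_t outsz)
{
    int w = snprintf(out, outsz, "%s%s", UNICODE_MANGLE_PREFIX, encoded);
    return (w < 0 || (size_t)w >= outsz) ? -1 : w;
}
```

Keeping the prefix in a single macro means that a clash with a
particular C library can be resolved by changing one definition,
without touching the code that produces or recognizes mangled names.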
Also, while I'm talking, I'll venture into an area I'm less confident
about. It seems to me we can't use UTF-8 as the encoding from gcc to
the assembler unless the assembler allows any 8-bit character in
identifiers (which seems unlikely). What the compiler actually has
to do is encode whatever character set it allows in identifiers into
whatever character set the assembler allows.
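
As a rough illustration of that last point, the sketch below maps an
identifier, given as an array of Unicode code points, onto a name that
uses only ASCII letters, digits, and '_'. The escape form "_u<hex>_"
and the function names are assumptions made up for this example, not
the scheme gcc uses. The underscore itself is escaped so that the
result can be decoded unambiguously, and the range checks assume an
ASCII host character set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: nonzero if cp can be emitted unchanged.
   Only ASCII letters and digits pass through; '_' is reserved as
   the escape introducer. */
static int is_plain(unsigned long cp)
{
    return (cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z')
        || (cp >= '0' && cp <= '9');
}

/* Encodes n code points into out (capacity outsz, including the
   terminating NUL). Returns the length written, or -1 if out is
   too small. */
static int encode_identifier(const unsigned long *cps, size_t n,
                             char *out, size_t outsz)
{
    size_t len = 0;
    assert(cps != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (is_plain(cps[i])) {
            if (len + 1 >= outsz)
                return -1;
            out[len++] = (char)cps[i];
        } else {
            /* Everything else becomes _u<hex>_, e.g. U+00E9 -> _ue9_. */
            int w = snprintf(out + len, outsz - len, "_u%lx_", cps[i]);
            if (w < 0 || (size_t)w >= outsz - len)
                return -1;
            len += (size_t)w;
        }
    }
    if (len >= outsz)
        return -1;
    out[len] = '\0';
    return (int)len;
}
```

With this sketch, the code points for "café" (c, a, f, U+00E9) encode
to "caf_ue9_", which any assembler that accepts ordinary C identifiers
should accept as well.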