This is the mail archive of the mailing list for the Java project.

Re: Unicode mangling (was Re: [PATCH] Java: New C++ ABI compatibility changes.)

Jason Merrill writes:

> 1) It doesn't allow for C-like symbols, which have no length specifier.
>    This could be fixed by defining some encoding starting with, say, '_U'.
> 2) It doesn't accommodate 32-bit extended characters in C++/C99
>    (\UNNNNNNNN).  This could be fixed by escaping them with, say, '_L'.
> 3) _NNNN is a valid component of an identifier, complicating the
>    demangler intelligence.  This could be fixed by also escaping the '_'
>    character in affected names. It looks like you intend to do
>    so in unicode_mangling_length, but don't actually do so in
>    append_unicode_mangled_name.  We could also just use '__'.

So you basically suggest that __UNNNN be emitted for every Unicode
character that we encounter. __LNNNNNNNN would be emitted for 32-bit
extended characters (Java doesn't have to worry about those).

And Java would be dropping the `U' at the end of the symbol too.
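
To make sure we mean the same thing, here is a rough sketch of the escaping
as I read it. It is only an illustration in Python, not the code in
append_unicode_mangled_name: the function name mangle_identifier, the
lowercase hex digits, and the choice to pass ASCII through untouched are all
my assumptions. It also does not address identifiers that already contain a
literal "__U" or "__L", which a real scheme would still have to disambiguate.

```python
def mangle_identifier(name: str) -> str:
    """Escape non-ASCII characters in an identifier (hypothetical sketch).

    Assumed rules, taken from the discussion above:
      - ASCII characters are copied unchanged,
      - other characters up to U+FFFF become "__U" + 4 hex digits,
      - characters above U+FFFF become "__L" + 8 hex digits.
    """
    out = []
    for ch in name:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                 # plain ASCII, no escape needed
        elif cp <= 0xFFFF:
            out.append("__U%04x" % cp)     # 16-bit character
        else:
            out.append("__L%08x" % cp)     # 32-bit extended character
    return "".join(out)


print(mangle_identifier("caf\u00e9"))
```

Under these assumptions a name with no non-ASCII characters comes out
unchanged, so existing symbols would not be affected.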

> With these fixes, I think the current scheme is OK.  But for targets
> with 8-bit clean binutils, I think it makes a lot of sense to just
> use the UTF8 encoding in the symbol.

That's fine too, but requires coordinated changes in binutils.

