This is the mail archive of the
mailing list for the Java project.
Re: Unicode mangling (was Re: [PATCH] Java: New C++ ABI compatibility changes.)
>>>>> "Alexandre" == Alexandre Petit-Bianco <email@example.com> writes:
> Jason Merrill writes:
>> I meant _NNNN and _LNNNNNNNN, actually, with a literal _ encoded as __.
>> Only the last would actually require a change in the Java frontend.
> OK. I have a patch for this. We now mangle something like `<clinit>'
> to _ZN1f16_003cclinit_003eEv. A hypothetical `<cl_init>' would be
> mangled `_ZN1f17_003ccl__nit_003eEv'.
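
The escaping quoted above can be sketched in a few lines of Python. This is only an illustration, not the actual frontend patch: the function names are hypothetical, the choice of which characters get escaped (anything other than ASCII letters and digits) is an assumption, and the lowercase hex digits simply follow the `_003c` example.

```python
def escape_name(name: str) -> str:
    """Escape an identifier: '_' -> '__', other non-alphanumerics -> _NNNN / _LNNNNNNNN."""
    out = []
    for ch in name:
        if ch == "_":
            out.append("__")                 # literal underscore is doubled
        elif ch.isascii() and ch.isalnum():
            out.append(ch)                   # plain ASCII letters and digits pass through
        elif ord(ch) <= 0xFFFF:
            out.append(f"_{ord(ch):04x}")    # UCS2 value: _NNNN
        else:
            out.append(f"_L{ord(ch):08x}")   # UCS4 value: _LNNNNNNNN
    return "".join(out)


def mangle_void_method(class_name: str, method_name: str) -> str:
    """Build _ZN<len><class><len><method>Ev for a method taking no arguments.

    Assumes the class name itself needs no escaping.
    """
    escaped = escape_name(method_name)
    return f"_ZN{len(class_name)}{class_name}{len(escaped)}{escaped}Ev"


print(mangle_void_method("f", "<clinit>"))   # _ZN1f16_003cclinit_003eEv
```

Note that the length marker counts the escaped characters, not the original ones, which is why `<clinit>` (8 characters) gets a length of 16.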
We still need the U before the length marker, or we have no way of
determining whether or not to look for the escape sequences. We can't use
this encoding for non-Unicode names, as their handling is specified by the
I wish there was a way to handle this only by modifying the names
themselves, without the leading U, but I can't think of a non-ambiguous
escape sequence we could use on all targets. Well, actually, I suppose we
could use '__', as you were suggesting in your earlier mail; since all
identifiers containing '__' are reserved to the implementation, we wouldn't
have to worry about violating the (multi-vendor) ABI.
No leading/trailing Us
UCS2 values are encoded as '__NNNN'
UCS4 values are encoded as '__LNNNNNNNN'
'__' is encoded as '___'.
'_' followed by anything else is left alone.
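
As a rough sketch, the five rules above could be written as the following encoder. The function name is made up for illustration, the set of characters treated as needing an escape (anything other than ASCII letters, digits and '_') is an assumption, and lowercase hex is assumed to match the earlier `_003c` example. Runs of underscores are handled greedily, two at a time, which is one possible reading of the '__' rule.

```python
def encode_identifier(name: str) -> str:
    """Encode an identifier with '__'-based escapes and no leading/trailing U."""
    out = []
    i = 0
    while i < len(name):
        ch = name[i]
        if name.startswith("__", i):
            out.append("___")                 # '__' is encoded as '___'
            i += 2
            continue
        if ch == "_" or (ch.isascii() and ch.isalnum()):
            out.append(ch)                    # lone '_' and plain ASCII are left alone
        elif ord(ch) <= 0xFFFF:
            out.append(f"__{ord(ch):04x}")    # UCS2 value: __NNNN
        else:
            out.append(f"__L{ord(ch):08x}")   # UCS4 value: __LNNNNNNNN
        i += 1
    return "".join(out)


print(encode_identifier("<clinit>"))   # __003cclinit__003e
print(encode_identifier("cl_init"))    # cl_init
print(encode_identifier("a__b"))       # a___b
```

With this scheme an ordinary name such as `cl_init` comes out unchanged, which is the point of dropping the U marker.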
>>>>> "Per" == Per Bothner <firstname.lastname@example.org> writes:
> Alexandre Petit-Bianco <email@example.com> writes:
>> > What was [final U] used for?
>> I honestly don't know. Maybe Per remembers.
> To indicate that a method name uses Unicode escapes. A mangled
> class name is a number followed by the name. So we can add a
> 'U' before the number without causing ambiguity. But we can't do
> that for a method name, since the mangling of a method name just
> goes right into it. Hence the 'U' is tacked to the end.
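
To make the point about the class-name form concrete, here is a small sketch of reading a length-prefixed name with an optional 'U' in front of the number. The function name and return shape are hypothetical, and this is not the demangler's actual code; it only shows why a 'U' before the digits cannot be confused with the name itself.

```python
DIGITS = "0123456789"


def read_component(mangled: str, pos: int = 0):
    """Read [U]<length><name> starting at pos.

    Returns (uses_escapes, name, next_pos).
    """
    uses_escapes = mangled.startswith("U", pos)
    if uses_escapes:
        pos += 1                              # skip the marker; the digits follow
    start = pos
    while pos < len(mangled) and mangled[pos] in DIGITS:
        pos += 1
    if pos == start:
        raise ValueError("expected a length marker")
    length = int(mangled[start:pos])
    name = mangled[pos:pos + length]
    if len(name) != length:
        raise ValueError("name is shorter than its length marker")
    return uses_escapes, name, pos + length


print(read_component("U16_003cclinit_003eEv"))   # (True, '_003cclinit_003e', 19)
print(read_component("1f"))                      # (False, 'f', 2)
```

Because a component always begins with a digit, a leading 'U' is unambiguous here; a name with no length prefix offers no such slot, hence the trailing 'U' in the old scheme.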
OK, thanks. But in the new ABI, method names are treated the same as other
names, so that's not necessary.