This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: Unicode mangling (was Re: [PATCH] Java: New C++ ABI compatibility changes.)
- To: Jason Merrill <jason at redhat dot com>
- Subject: Re: Unicode mangling (was Re: [PATCH] Java: New C++ ABI compatibility changes.)
- From: Alexandre Petit-Bianco <apbianco at cygnus dot com>
- Date: Mon, 15 Jan 2001 11:10:09 -0800 (PST)
- Cc: gcc-patches at gcc dot gnu dot org, java-discuss at sources dot redhat dot com
- References: <200101150758.XAA16134@deliverance.cygnus.com><u966jhhvk2.fsf@casey.cambridge.redhat.com>
- Reply-To: apbianco at cygnus dot com
Jason Merrill writes:
> 1) It doesn't allow for C-like symbols, which have no length specifier.
> This could be fixed by defining some encoding starting with, say, '_U'.
> 2) It doesn't accommodate 32-bit extended characters in C++/C99
> (\UNNNNNNNN). This could be fixed by escaping them with, say, '_L'.
> 3) _NNNN is a valid component of an identifier, complicating the
> demangler intelligence. This could be fixed by also escaping the '_'
> character in affected names. Hmm...it looks like you intend to do
> so in unicode_mangling_length, but don't actually do so in
> append_unicode_mangled_name. We could also just use '__'.
So you basically suggest that __UNNNN be emitted for every Unicode
character that we encounter, and that __LNNNNNNNN be emitted for 32-bit
extended characters (Java doesn't have to worry about those.)
Java would also drop the `U' at the end of the symbol.
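To make sure we mean the same thing, here is a rough sketch of the
escaping as I read it. It is not the actual mangler code: the helper
name mangle_identifier is made up for illustration, and the choice of
lowercase, fixed-width hex digits is an assumption. It also does not
address point 3 (a name that already contains a literal "__U" remains
ambiguous).

```python
def mangle_identifier(name: str) -> str:
    """Hypothetical helper: escape non-ASCII characters in an identifier.

    Assumptions (not settled in this thread): hex digits are lowercase
    and fixed-width, and literal underscores are left untouched.
    """
    out = []
    for ch in name:
        cp = ord(ch)
        if cp < 0x80:
            # Plain ASCII passes through unchanged.
            out.append(ch)
        elif cp <= 0xFFFF:
            # 16-bit character: __U followed by four hex digits.
            out.append("__U%04x" % cp)
        else:
            # 32-bit extended character: __L followed by eight hex digits.
            out.append("__L%08x" % cp)
    return "".join(out)
```

Under these assumptions an identifier with one accented letter gets a
single __UNNNN escape in place of that letter, and a pure-ASCII
identifier comes out unchanged. Java would only ever take the first
two branches.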
> With these fixes, I think the current scheme is OK. But for targets
> with 8-bit clean binutils, I think it makes a lot of sense to just
> use the UTF8 encoding in the symbol.
That's fine too, but requires coordinated changes in binutils.
./A