This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: thoughts on martin's proposed patch for GCC and UTF-8
> Sorry, I'm still lost. If the identifier is the UTF-8 character MICRO
> SIGN (code 00B5), do you generate the same UTF-8 character on output,
> or do you mangle it as if the user had typed `\u00b5'?
Suppose I have a

  class µ{
    µ(); //This should read MICRO SIGN
  };

Then the compiler tests, at installation time, whether the assembler on
the system is 8-bit clean. If it is, the constructor is mangled as

  __2\302\265v

If the assembler does not support 8-bit symbols, it is mangled as

  __U5_00b5

This is what jc1 currently does.
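
To make the two forms concrete, here is a rough sketch of how the identifier part could be produced (the `2\302\265` and `U5_00b5` pieces only, not the `__` prefix or the argument suffix). This is not the actual jc1 or cc1plus code: `decode_utf8` and `encode_identifier` are hypothetical names, and the `U<length>_<hex>` rule is inferred from the single example above. The sketch does not handle a literal underscore in the identifier or characters outside the BMP, both of which a real scheme would have to disambiguate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Decode UTF-8 into code points. Sketch only: it checks for a truncated
// trailing sequence but does not otherwise validate the input.
std::vector<std::uint32_t> decode_utf8(const std::string& s) {
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::uint32_t cp;
        std::size_t extra;  // number of continuation bytes expected
        if (c < 0x80)      { cp = c;        extra = 0; }
        else if (c < 0xE0) { cp = c & 0x1F; extra = 1; }
        else if (c < 0xF0) { cp = c & 0x0F; extra = 2; }
        else               { cp = c & 0x07; extra = 3; }
        ++i;
        for (; extra > 0 && i < s.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        assert(extra == 0 && "truncated UTF-8 sequence");
        out.push_back(cp);
    }
    return out;
}

// Hypothetical helper: encode one identifier for the symbol name.
// 8-bit-clean assembler: <byte length><raw UTF-8 bytes>.
// Otherwise (assumed rule): 'U', <escaped length>, then the name with each
// non-ASCII character written as '_' plus four lowercase hex digits.
std::string encode_identifier(const std::string& utf8,
                              bool assembler_8bit_clean) {
    if (assembler_8bit_clean)
        return std::to_string(utf8.size()) + utf8;

    std::string escaped;
    for (std::uint32_t cp : decode_utf8(utf8)) {
        if (cp < 0x80) {
            escaped += static_cast<char>(cp);
        } else {
            char buf[16];
            std::snprintf(buf, sizeof buf, "_%04x", static_cast<unsigned>(cp));
            escaped += buf;
        }
    }
    return "U" + std::to_string(escaped.size()) + escaped;
}
```

Calling `encode_identifier` on the two UTF-8 bytes of MICRO SIGN (octal `\302\265`) with the flag set reproduces the `2\302\265` piece; with the flag cleared it reproduces `U5_00b5`, where the 5 counts the escaped characters `_00b5`.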
> echo ab | tr 'ab' '\123\456'
Thanks, this looks good.
> OK, so then there's no problem: C++ _does_ distinguish between
> non-ASCII digits and letters.
Right. It just doesn't distinguish between non-ASCII digits and
non-ASCII non-alphanumerics :-) That's why no predicate function
was needed.
Regards,
Martin