Ambiguous G++ name mangling grammar

I. Cvar
Mon Oct 25 12:17:00 GMT 1999

I'm trying to use gcc version egcs-2.91.60 19981201 (egcs-1.1.1 release) on
The mangled names produced by g++ appear to be suffering from an ambiguous
grammar, and therefore, cannot be readily parsed during demangling (see
I see two related problems, both explained below.
I need to know whether there are workarounds or fixes for these bugs.
In particular, is a fix for cplus-dem.c available?
Also, can a g++ switch be enabled to avoid these mangling bugs?

Bug 1:
Function arguments inside the mangled name of a G++ template function
specialization are collectively prefixed with an 'H' (ok) and suffixed with
a '_' (bad) followed by the return type.  That underscore character is the
heart of the problem.
During demangling (e.g., in cplus-dem.c), that '_' suffix is wrongly
interpreted as part of a multi-digit number when the very last template
argument happens to be described by a mangling code that actually has 2
trailing 1-digit numbers.
Notice that these examples demonstrate bugs with 'N', but I'm not sure
whether they can also be shown with B, T, I, n, or others, when 2 or more
digits happen to precede the '_' suffix belonging to the function's return
type.

Example 1a:
template <class T, class U>
T maximum (T value1, U value2, U value3, U value4)
{ return value1; }
int main ()
{ maximum (1, 2, 3, 4); }

It's mangled as "maximum__H2ZiZi_X01X11N21_X01", and so the "N21_" part is
treated as 21 repeats of ?junk? instead of 2 repeats of type 1.
Example 1b:
struct fcc {} XXX;
template <class T, class U>
fcc maximum (T value1, T value2, T value3, T value4, T value5,
             U walue1, U walue2, U walue3, U walue4, U walue5)
{ return XXX; }
int main ()
{ maximum (111, 222, 333, 444, 555, 'a', 'b', 'c', 'd', 'e'); }

It's mangled as "maximum__H2ZiZc_X01N40X11N45_3fcc", and so the "N45_3" part
now looks perfectly valid (but it's not!).  In fact, it's treated as 45
repeats of type 3, which wrongly slurps up the "_3" part.  That means the
'_' is now missing from in front of the template function's return type,
leaving only "fcc" to be treated as even more bogus arguments of type
float, char, and char!  Yes, that amounts to (6 + 45 + 3 =) 54 args
instead of the correct 10, not to mention the lack (overlooked by
cplus-dem.c) of a return type.
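
The count of 54 can be reproduced with a small C++ sketch.  This is not the
code in cplus-dem.c; it is a simplified model that assumes the number rule
described in this report (a lone digit stands alone, while a run of digits
followed by '_' is read as one multi-digit number).  It further assumes
that "X" plus two numbers is one argument, that "N" plus a count and a type
index is that many arguments, and that any lowercase letter is one
builtin-type argument.  The names read_count and count_args_greedy are
hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical model of the greedy number rule (not the real cplus-dem.c code).
// Returns -1 if no digit is present at pos; otherwise advances pos past the number.
static int read_count(const std::string& s, std::size_t& pos) {
    assert(pos <= s.size());
    auto digit = [&](std::size_t i) {
        return i < s.size() && std::isdigit(static_cast<unsigned char>(s[i])) != 0;
    };
    if (!digit(pos)) return -1;
    int first = s[pos++] - '0';   // a lone digit is the default reading
    int wide = first;
    std::size_t p = pos;
    while (digit(p)) wide = wide * 10 + (s[p++] - '0');
    // More digits followed by '_' are taken as one multi-digit number.
    if (p > pos && p < s.size() && s[p] == '_') {
        pos = p + 1;
        return wide;
    }
    return first;
}

// Counts the arguments a greedy reader would see in the argument portion
// of a mangled name.  Stops at a '_' separator; returns -1 on unreadable input.
int count_args_greedy(const std::string& args) {
    std::size_t pos = 0;
    int total = 0;
    while (pos < args.size() && args[pos] != '_') {
        char c = args[pos++];
        if (c == 'X') {            // assumed: 'X' plus two numbers is one argument
            if (read_count(args, pos) < 0) return -1;
            if (read_count(args, pos) < 0) return -1;
            total += 1;
        } else if (c == 'N') {     // 'N' plus repeat count plus type index
            int repeats = read_count(args, pos);
            if (repeats < 0) return -1;
            if (read_count(args, pos) < 0) return -1;
            total += repeats;
        } else if (std::islower(static_cast<unsigned char>(c))) {
            total += 1;            // assumed: one builtin type per lowercase letter
        } else {
            return -1;
        }
    }
    return total;
}
```

Under these assumptions, count_args_greedy("X01N40X11N45_3fcc") gives 54,
and the argument string from Example 1a, "X01X11N21_X01", gives -1 because
nothing readable as a type index follows the bogus count of 21.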


Bug 2:
Encoding of any 2 adjacent numbers (as in "Nxy", below) is wrong whenever
x is a 1-digit number and y is a number of 2 or more digits.  That's because
the presence of an '_' following the last digit of y indicates that we're
playing with a multi-digit number, and so the xy (together) is interpreted
as one huge number instead of two smaller ones.  Got it?
Notice that this has nothing to do with the '_' preceding the return type,
as in bug 1.  Bug 2 can occur anywhere, not just in the very last argument.
Example 2:
template <class T, class U, class V>
  V   maximum (T value1, T value2, T value3, T value4, T value5,
               U walue1, U walue2, U walue3, U walue4, U walue5,
               V xalue1, V xalue2, V xalue3, V xalue4, V xalue5,
               int I, char C)
{ return xalue1; }

int main ()
{
  maximum (1.1, 2.2, 3.3, 4.4, 5.5,
           'a', 'b', 'c', 'd', 'e',
           1.1, 2.2, 3.3, 4.4, 5.5,
           111, 'a');
}

It's mangled as "maximum__H3ZdZcZd_X01N40X11N45X21N410_ic_X21", and here the
trouble in the "N410_ic" part has nothing to do with the '_' that precedes
the template function's return type.  It's just an ordinary sequence of two
numbers that got smeared into one for a different reason.  It's wrongly
interpreted as "410 repeats of type number ?junk?", instead of "4 repeats of
type 10 followed by an integer".
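
The two readings of "410_" can be compared side by side with a short C++
sketch.  Again, this is a simplified model and not the cplus-dem.c source.
It assumes the number rule described above, and for the "intended" reading
it assumes that the repeat count is always exactly one digit, which is only
an illustration and not a proposed fix.  The names Repeat, read_number,
read_greedy, and read_intended are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

struct Repeat {
    int count;         // how many times the earlier type is repeated
    int index;         // which earlier type is repeated; -1 if unreadable
    std::size_t next;  // position of the first unread character
};

static bool is_digit_at(const std::string& s, std::size_t i) {
    return i < s.size() && std::isdigit(static_cast<unsigned char>(s[i])) != 0;
}

// Reads one number: a lone digit, or a run of digits closed by '_'
// (the '_' is consumed).  Returns -1 if there is no digit at pos.
static int read_number(const std::string& s, std::size_t& pos) {
    assert(pos <= s.size());
    if (!is_digit_at(s, pos)) return -1;
    int first = s[pos++] - '0';
    int wide = first;
    std::size_t p = pos;
    while (is_digit_at(s, p)) wide = wide * 10 + (s[p++] - '0');
    if (p > pos && p < s.size() && s[p] == '_') {
        pos = p + 1;
        return wide;
    }
    return first;
}

// Greedy reading: both the count and the index go through read_number,
// so "410_" is swallowed whole as the count.
Repeat read_greedy(const std::string& s) {
    std::size_t pos = 0;
    int count = read_number(s, pos);
    int index = read_number(s, pos);
    return Repeat{count, index, pos};
}

// Intended reading (assumption): the count is exactly one digit, and only
// the index may be a multi-digit number closed by '_'.
Repeat read_intended(const std::string& s) {
    assert(is_digit_at(s, 0));
    std::size_t pos = 1;
    int count = s[0] - '0';
    int index = read_number(s, pos);
    return Repeat{count, index, pos};
}
```

Given the text after the 'N' in Example 2, "410_ic", read_greedy yields a
count of 410 and no readable index, while read_intended yields a count of 4
and an index of 10, leaving "ic" as the next two arguments.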

Ivan B. Cvar, Senior Software Engineer, OC Systems Inc.
9990 Lee Highway, Suite 270, Fairfax, Virginia 22030-1720, U.S.A.
voice: +1.703.359.8160x169    fax: +1.703.279.2799
