This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



template mangling (was: exception handling poll)



I wrote:
> > And we'll pretty much want to get more efficient mangling of
> > template functions.  Just for fun, try looking at the mangled symbols
> > generated for the methods of map<string,string> .
> 
> Aiee.  Tell me about it.  I recently fixed a reported problem in the
> assembler with ECOFF debugging that bombed because the stab for the
> class exceeded 4k, primarily due to a 336 character class name.

I had a proposal to fix the problem a year ago, but I've had no time to
work on it and it seems I won't in the immediate future.

In case anyone else on this list is interested, I'll post it again.
(it refers to 2.7.2, so map<string,string,less<string> > is a bit
different now).

Any volunteers to hack something like this up?

-----------------------------------------------------------------------
As some of you may have noticed, I've been griping about the huge
symbols that result from use of g++.  Things can be improved a lot
by revising the name mangling scheme to encode repeated types better.

The basic idea for the modification to the mangling scheme is that we add
a class list, in which entry n is the n'th class seen so far, in left to
right order.  This augments the argument list, which is already kept.  A
class is "seen" when we reach the end of it, so for vector<Foo> we first
see Foo, then vector<Foo>.  A reference to a class list entry is coded in
the mangled name using some unused key letter, such as B, followed by the
entry number.  We apply the repeated-argument encoding (T) first, so B is
used only for what T leaves behind.

There's a potential problem if the mangler and demangler don't agree
on the order in which the classes are seen, but I don't think that
this problem can arise here, since the class names should appear in
the same order in both demangled and mangled forms.
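
As a rough illustration of why the two sides stay in step, here is a toy
demangler-side sketch in Python.  It is not GCC or cplus_demangle code: the
nested-tuple input format and the name expand() are made up for this
example.  A class is a (name, args) pair, a builtin is a string, and an
integer n stands for the back-reference Bn.  The decoder appends each class
to its list when it finishes reading it, which is the same left-to-right
rule the mangler would follow.

```python
def expand(node, classes):
    """Expand one encoded type to readable text, growing `classes` as we go."""
    if isinstance(node, str):      # builtin type such as "char": never listed
        return node
    if isinstance(node, int):      # back-reference Bn: look up entry n
        return classes[node]
    name, args = node              # a class, possibly with template arguments
    text = name
    if args:
        text += "<" + ", ".join(expand(a, classes) for a in args) + ">"
    classes.append(text)           # "seen" only once we reach its end
    return text


# Hypothetical argument list: (vector<Foo>, list<Foo>, pair<vector<Foo>, Foo>)
encoded_args = [("vector", [("Foo", [])]), ("list", [0]), ("pair", [1, 0])]
classes = []
decoded_args = [expand(a, classes) for a in encoded_args]
print(decoded_args)
print(classes)
```

Because a back-reference can only name a class that has already been
completed, the decoder never needs to look ahead.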

The coding is particularly effective on complex template types but
helps in other cases as well.

Here are some simple examples:

EgAndInstance::EgAndInstance(const EgAndInstance&)

was

___13EgAndInstanceRC13EgAndInstance

becomes

___13EgAndInstanceRCB0

B0 = class list entry 0.

ostream::operator<<(ostream &(*)(ostream &))

(this appears with manipulators) was

___ls__7ostreamPFR7ostream_R7ostream

becomes

___ls__7ostreamPFRB0_B0

Now here's a tricky one.  This symbol appears when we use
a map<string,string,less<string> >.

rb_tree<basic_string<char, string_char_traits<char> >,
pair<basic_string<char, string_char_traits<char> > const,
     basic_string<char, string_char_traits<char> > >,
select1st<pair<basic_string<char, string_char_traits<char> > const,
basic_string<char, string_char_traits<char> > >, basic_string<char,
string_char_traits<char> > >, less<basic_string<char,
string_char_traits<char> > > >::__copy_hack(void *, void *)

Writing string for basic_string<char, string_char_traits<char> > to make
the symbol easier to read, this is

rb_tree<string,
	pair<string const, string >,
	select1st< pair<string const, string >, string >,
	less<string > >
::__copy_hack(void *, void *)

We will build the following class list.

B0 = string_char_traits<char>
B1 = basic_string<char,B0> = string
B2 = pair<const B1,B1>
B3 = select1st<B2,B1>
B4 = less<B1>

(only B1 and B2 are re-used)
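
The class list above can be derived mechanically by walking the type from
left to right and recording each class when its last template argument has
been processed.  The Python sketch below does that on a hand-built type
tree.  The Type record and the visit() function are hypothetical helpers
for this illustration, not part of g++, and the output uses the readable
notation of the list above rather than real mangled text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Type:
    name: str
    args: tuple = ()
    is_class: bool = True    # builtins such as char never enter the class list
    const: bool = False


def visit(t, index, listing):
    """Return the short form of t (a B<n> reference for classes)."""
    prefix = "const " if t.const else ""
    if not t.is_class:
        return prefix + t.name
    key = replace(t, const=False)          # const is not part of class identity
    if key not in index:
        described = t.name
        if t.args:
            # Arguments are visited left to right, so inner classes land first.
            refs = [visit(a, index, listing) for a in t.args]
            described += "<" + ",".join(refs) + ">"
        index[key] = len(listing)          # "seen" only at the end of the class
        listing.append(described)
    return prefix + f"B{index[key]}"


char = Type("char", is_class=False)
traits = Type("string_char_traits", (char,))
string = Type("basic_string", (char, traits))
pair = Type("pair", (replace(string, const=True), string))
tree = Type("rb_tree", (string, pair,
                        Type("select1st", (pair, string)),
                        Type("less", (string,))))

index, listing = {}, []
visit(tree, index, listing)
for n, entry in enumerate(listing):
    print(f"B{n} = {entry}")
```

Under the "seen when we reach the end" rule this walk also records rb_tree
itself as a sixth entry, B5.  The list above stops at B4, presumably
because nothing after that point refers back to rb_tree.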

The old mangling is

___copy_hack__t7rb_tree4Zt12basic_string2ZcZt18string_char_traits1ZcZt4pair2ZCt12basic_string2ZcZt18string_char_traits1ZcZt12basic_string2ZcZt18string_char_traits1ZcZt9select1st2Zt4pair2ZCt12basic_string2ZcZt18string_char_traits1ZcZt12basic_string2ZcZt18string_char_traits1ZcZt12basic_string2ZcZt18string_char_traits1ZcZt4less1Zt12basic_string2ZcZt18string_char_traits1ZcPvT1

The new mangling becomes (if I did this right)

___copy_hack__t7rb_tree4Zt12basic_string2ZcZt18string_char_traits1ZcZt4pair2ZCB1ZB1Zt9select1st2ZB2ZB1Zt4less1ZB1PvT1

Something like this is going to be essential to have STL work on platforms
that limit symbol name lengths (HP is one; Sun's as has a 2048-character
limit in stabs).

------------
If something like this is developed, there should probably be a compiler
switch to select either the old or the new scheme.  The same cplus_demangle
could handle either scheme.





