This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PATCH RFC: Patches to speed up the demangler
On Sun, Dec 28, 2003 at 04:01:51PM -0500, Daniel Berlin wrote:
>
> On Dec 28, 2003, at 3:56 PM, Daniel Berlin wrote:
>
> >
> >On Dec 28, 2003, at 2:47 PM, Daniel Jacobowitz wrote:
> >
> >>On Sun, Dec 28, 2003 at 10:56:43AM -0800, Mark Mitchell wrote:
> >>>On Sat, 2003-12-20 at 12:16, Ian Lance Taylor wrote:
> >>>>I spent some time investigating how to speed up the new demangler,
> >>>>because it turns out to affect gdb startup time on C++ executables.
> >>>
> >>>I don't have anything bad to say about your patch, but I do think
> >>>that
> >>>the behavior you mention here is a bug in GDB. I noticed that ages
> >>>ago,
> >>>but have never done anything about it. GDB should lazily demangle
> >>>names; that would be much more scalable. In fact, I would keep an
> >>>LRU
> >>>cache of a few thousand such names; in a very big program, the memory
> >>>required to demangle all of the names may well be prohibitive.
> >>
> >>The issue seems to be a little more complex than that. Given "break
> >>foo", we need to find all the methods whose base name is foo - the
> >>most
> >>apparent way to handle the existing symbol table does not lend itself
> >>to a query of that sort without either demangling, or substantial
> >>hacks
> >>for each supported mangling, to guarantee that _Z3foo1a and _Z3foo1b
> >>hash to nearby locations.
> >
> >Uh, this is one of the things search trees excel at. libiberty has
> >an implementation of ternary search tree. Try it.
> >
> >You can simply insert all names, and map them to a linked list of
> >symbols with that name, and then use that list to see which ones are
> >approriate for the given context.
>
> I forgot to add the important part for mangled names:
> In addition, you can simply do a pattern search. Ternary search trees
> lend themselves to "a*b" type searches as well. So for "foo::bar" you
> just search for "*foo*bar" in the mangled name TST (or is that
> *bar*foo").
>
> I never got around to implementing the pattern searches for the TST,
> but i guess i could if it's useful.
>
> If you want it even faster, you could simply build a search tree based
> on an understanding of our mangling format so that we index all symbols
> containing given names on that name.
>
> IE when we see 3foo, we insert that symbol (regardless of the rest of
> the name) in the list attached to the tree for "foo".
>
> Then for foo::bar, you can simply do a search on foo, a search on bar,
> take the intersection, and go from there.
>
> This should trim the number of things you have to demangle.
OK, now I think I see where you're going with this. It's *foo*bar for
v3 and bar*foo* for v2, by the way. Your new suggestion is more
practical and matches what Ian's suggesting. I'd like to thank all
three of you for the suggestions; I'm working on a more immediate use
of mangled/demangled names in GDB (trying to obsolete
DW_AT_MIPS_linkage_name, which the new DW_AT_namespace support in GCC
will finally let us do) but after that's done I'll be back to this
problem.
Unless someone else wants to work on it of course :)
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer