This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PATCH RFC: Patches to speed up the demangler


On Sun, Dec 28, 2003 at 04:01:51PM -0500, Daniel Berlin wrote:
> 
> On Dec 28, 2003, at 3:56 PM, Daniel Berlin wrote:
> 
> >
> >On Dec 28, 2003, at 2:47 PM, Daniel Jacobowitz wrote:
> >
> >>On Sun, Dec 28, 2003 at 10:56:43AM -0800, Mark Mitchell wrote:
> >>>On Sat, 2003-12-20 at 12:16, Ian Lance Taylor wrote:
> >>>>I spent some time investigating how to speed up the new demangler,
> >>>>because it turns out to affect gdb startup time on C++ executables.
> >>>
> >>>I don't have anything bad to say about your patch, but I do think 
> >>>that
> >>>the behavior you mention here is a bug in GDB.  I noticed that ages 
> >>>ago,
> >>>but have never done anything about it.  GDB should lazily demangle
> >>>names; that would be much more scalable.  In fact, I would keep an 
> >>>LRU
> >>>cache of a few thousand such names; in a very big program, the memory
> >>>required to demangle all of the names may well be prohibitive.
> >>
> >>The issue seems to be a little more complex than that.  Given "break
> >>foo", we need to find all the methods whose base name is foo - the 
> >>most
> >>apparent way to handle the existing symbol table does not lend itself
> >>to a query of that sort without either demangling, or substantial 
> >>hacks
> >>for each supported mangling, to guarantee that _Z3foo1a and _Z3foo1b
> >>hash to nearby locations.
> >
> >Uh, this is one of the things  search trees excel at.  libiberty has 
> >an implementation of ternary search tree. Try it.
> >
> >You can simply insert all names, and map them to a linked list of 
> >symbols with that name, and then use that list to see which ones are 
> >approriate for the given context.
> 
> I forgot to add the important part for mangled names:
> In addition, you can simply do a pattern search.  Ternary search trees 
> lend themselves to "a*b" type searches as well. So for "foo::bar" you 
> just search for "*foo*bar" in the mangled name TST (or is that 
> *bar*foo").
> 
> I never got around to implementing the pattern searches for the TST, 
> but i guess i could if it's useful.
> 
> If you want it even faster, you could simply build a search tree based 
> on an understanding of our mangling format so that we index all symbols 
> containing given names on that name.
> 
> IE when we see 3foo, we insert that symbol (regardless of the rest of 
> the name) in the list attached to the tree for "foo".
> 
> Then for foo::bar, you can simply do a search on foo, a search on bar, 
> take the intersection, and go from there.
> 
> This should trim the number of things you have to demangle.

OK, now I think I see where you're going with this.  It's *foo*bar for
v3 and bar*foo* for v2, by the way.  Your new suggestion is more
practical and matches what Ian's suggesting.  I'd like to thank all
three of you for the suggestions; I'm working on a more immediate use
of mangled/demangled names in GDB (trying to obsolete
DW_AT_MIPS_linkage_name, which the new DW_AT_namespace support in GCC
will finally let us do) but after that's done I'll be back to this
problem.

Unless someone else wants to work on it of course :)

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]