This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Removing duplicate DWARF2 info


Mike Stump <mrs@windriver.com> writes:

> > Date: Mon, 10 Jul 2000 23:54:33 -0400 (EDT)
> > From: Daniel Berlin <dberlin@redhat.com>
> 
> > I know this has been hashed over before, but nobody ever seems to do it
> > where it needs to be done (I do it in GDB when we load the stuff), and
> > I'm guessing it's because there are three people on the planet with the
> > knowledge of how LD works that can make this happen.
> 
> If you did it the right way, in bfd, and if bfd handled all things symbol,
> then enabling ld to make use of it also should be easy for you...  But
> alas, I fear it wasn't quite done like this.


I can make BFD read and write DWARF2.
The problem is that removing info once written requires a *lot* of recalculation and rewriting.

Links already take forever because they have to deal with so much duplicate info. If Jason can make LD not become an order of magnitude slower with his scheme, I'd be very impressed. Either way, it's going to make things slower than they already are.
The ideal solution is to not write the info in the first place.



> 
> > Why can't we simply not emit the duplicate info, taking LD out of the
> > picture completely?  All it seems this would entail is keeping track of
> > what we've emitted info for, over the course of more than one file.
> > Couldn't we have a simple persistent hash table in a file, and do a
> > lookup in that, then do the check to see if we emitted it during this
> > compilation?  That doesn't really seem that tricky to do, or am I
> > missing something?
> 
> I often argue for a generic alongside database.
I've seen that discussion (I've been lurking for years) quite a few times.
I've always thought the benefits outweigh the disadvantages, but I seem to be in the minority.
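For concreteness, the persistent-table scheme quoted above can be mocked up in a few lines. This is only a hypothetical sketch in Python (GCC itself would do it in C, keyed on something like mangled type names); every name here is invented:

```python
import atexit
import pickle
from pathlib import Path

# Hypothetical persistent "have we already emitted debug info for this
# type?" table, shared across the compilations in one build.
# All names here are invented for illustration.
TABLE_PATH = Path("debug-info.table")

# Load the table written by earlier compilations in this build, if any.
emitted = pickle.loads(TABLE_PATH.read_bytes()) if TABLE_PATH.exists() else set()
emitted_this_run = set()

def should_emit_dwarf_for(type_key: str) -> bool:
    """Return True only the first time a type is seen across the build."""
    if type_key in emitted or type_key in emitted_this_run:
        return False          # some earlier file already carries the info
    emitted_this_run.add(type_key)
    return True

def save_table():
    # Persist what this compilation emitted so later files can skip it.
    TABLE_PATH.write_bytes(pickle.dumps(emitted | emitted_this_run))

atexit.register(save_table)
```

With this, the first translation unit that asks about "struct foo" gets True and emits the info; every later unit in the build gets False and skips it. The hard parts the sketch ignores (parallel makes racing on the file, stale tables after edits, .o files reused across executables) are exactly the objections raised in this thread.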

>   The repo database almost
> fits that role.  With such a beast, quite a few things become trivial.  I
> don't happen to think you're missing too much, assuming you already know
> about things like usage of .o files in more than one executable/library.

Yeah, but my answer is: who cares if it's not absolutely perfect?
It would never produce incorrect info.
Let's say we miss 3% of the duplicate info.
We've still removed 97% of it.
I know of people who have >1 gig of debug info, almost all of it duplicate info.
So we could only remove 993 meg of duplicate info, instead of 1023.99 meg.
Somehow, I think people would still be ecstatic.
Also, if you play the "do it in the linker" game, you still have to read and process the whole gig, only to then remove the 97% of it that's duplicated.
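The arithmetic above checks out; taking ">1 gig" as 1024 meg with 1023.99 meg of it duplicate (the figures used in the message), a quick check:

```python
# Sanity-check the numbers above: nearly all of a ~1024 meg pile of
# debug info (1023.99 meg) is assumed to be duplicate.
total_dup_mb = 1023.99

# Catching only 97% of the duplicates at compile time still removes:
removed_mb = 0.97 * total_dup_mb
print(round(removed_mb))  # 993 -- the figure quoted above
```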

> 
> Profile base feedback for branch prediction I would argue wants it, improving
> compilation speeds for C++ wants it (think in part precompiled headers),
> debugging dups want it, templates want it...

You also missed whole program and inter-file optimizations that would be trivial given a generic database.


> 
> > From: Zack Weinberg <zack@wolery.cumb.org>
> > Date: Mon, 10 Jul 2000 21:06:20 -0700
> > To: Daniel Berlin <dberlin@redhat.com>
> 
> > Do you want to teach every makefile in existence about this persistent
> > hash table?
> 
> Since the repo database could in fact be used to do exactly what is
> requested, and since I have in fact not seen a lot of repo stuff in
> Makefiles, I'd think this is a slight overstatement of the reality of the
> situation; though, without 5 years of experience on a new system, I do
> agree, peering into the future is at times hard.


