


Re: First cut on outputting gimple for LTO using DWARF3. Discussion invited!!!!


Daniel Berlin wrote:
On 8/31/06, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
Mark Mitchell wrote:
> Kenneth Zadeck wrote:
>
>> Even if we decide that we are going to process all of the functions in
>> one file at one time, we still have to have access to the functions that
>> are going to be inlined into the function being compiled. Getting at
>> those functions that are going to be inlined is where the double-the-I/O
>> argument comes from.
>
> I understand -- but it's natural to expect that those functions will
> be clumped together. In a gigantic program, I expect there are going
> to be clumps of tightly connected object files, with relatively few
> connections between the clumps. So, you're likely to get good cache
> behavior for any per-object-file specific data that you need to access.
>
I just do not know. I assume that you are right, that there is some
clumping, but I am just not sure.

I just want to point out that this same argument ("cache locality will make it okay") was used to justify the massive amount of open/seek/close behavior in Subversion's FSFS filesystem as "a-ok".

Here, we won't be making syscalls -- but we will be taking page faults if we go out of cache. I don't know what the consequences of page faults are for files backed over NFS, but if your object files are coming over NFS, your linker isn't going to go very fast anyhow. I would expect most users are careful to use local disk for object files.
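
To be concrete about what I mean by "page faults, not syscalls", here is a toy sketch -- not anything in GCC or the linker, and the file name and function offset below are made up -- of mapping an object file and touching a function's bytes directly:

/* Toy illustration: with mmap there is no read()/seek() per access;
   a cold page simply faults in, and a warm page costs almost nothing.  */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int
main (void)
{
  const char *path = "foo.o";           /* hypothetical object file */
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return 1;

  struct stat st;
  if (fstat (fd, &st) < 0)
    return 1;

  /* Map the whole file; pages are brought in lazily on first touch.  */
  unsigned char *base = mmap (NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  if (base == MAP_FAILED)
    return 1;

  /* Touching a byte that is not resident triggers a page fault rather
     than a syscall; if the page is already cached, it is just a load.  */
  size_t func_offset = 0x1000;          /* hypothetical function offset */
  if ((off_t) func_offset < st.st_size)
    printf ("first byte of function: 0x%02x\n",
            (unsigned) base[func_offset]);

  munmap (base, st.st_size);
  close (fd);
  return 0;
}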


Since we're descending into increasingly general arguments, let me say it more generally: we're optimizing before we've fully profiled. Kenny had a very interesting datapoint: abbreviation tables tended to be about the size of a function. That's great information. All I'm suggesting is that this datapoint doesn't necessarily mean that enabling random access to functions (which we all agree is necessary) carries a 2x I/O cost. It's only a 2x I/O cost if the abbreviation table has been paged out every time you need to go look at a function.
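
To make the arithmetic concrete, here is a toy calculation rather than measured data -- the miss rates below are invented. Reading a function's DIEs costs F bytes of I/O, the abbreviation table is about the same size (A roughly equal to F, per Kenny's datapoint), and the total per function is F plus A times however often the table actually has to be re-fetched:

#include <stdio.h>

int
main (void)
{
  double f = 1.0;                        /* function body I/O, normalized */
  double a = 1.0;                        /* abbrev table, ~same size as f */
  double miss_rates[] = { 1.0, 0.5, 0.1, 0.01 };

  for (unsigned i = 0; i < sizeof miss_rates / sizeof *miss_rates; i++)
    {
      /* Cost is 2x only when the abbrev table misses nearly every time;
         if it stays resident across a clump of functions, the overhead
         shrinks toward zero.  */
      double cost = f + miss_rates[i] * a;
      printf ("abbrev-table miss rate %.2f -> %.2fx the I/O of the "
              "function body alone\n", miss_rates[i], cost / f);
    }
  return 0;
}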

I think we've gotten extremely academic here. As far as I can tell, Kenny has decided not to use DWARF, and nobody's trying to argue that he should, so we should probably just move on. My purpose in raising a few counterpoints is just to make sure that we're not overlooking anything obvious in favor of DWARF; since Kenny's already got that code written, it would be nice if we had a good reason not to start over.

--
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

