This is the mail archive of the
mailing list for the GCC project.
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
- From: Kenneth Zadeck <zadeck at naturalbridge dot com>
- To: Mark Mitchell <mark at codesourcery dot com>
- Cc: GCC <gcc at gcc dot gnu dot org>, "Berlin, Daniel" <dberlin at dberlin dot org>, "Hubicha, Jan" <jh at suse dot cz>, "Novillo, Diego" <dnovillo at redhat dot com>, Ian Lance Taylor <ian at airs dot com>, "Edelsohn, David" <dje at watson dot ibm dot com>
- Date: Wed, 30 Aug 2006 19:06:31 -0400
- Subject: Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
- References: <44F2F642.firstname.lastname@example.org> <44F606CD.email@example.com>
Mark Mitchell wrote:
> Kenneth Zadeck wrote:
>> will be more cumbersome if we have to keep reloading each object
>> file's abbrev table just to tear apart a single function in that .o
>> file. While the abbrev sections average slightly less than 2% of the
>> size of the GIMPLE encoding for an entire file, each abbrev table
>> averages about the same size as a single function.
> Interesting datapoint.
> (Implied, but not stated, in your mail is the fact that the
> abbreviation table cannot be indexed directly. If it could be, then
> you wouldn't have to read the entire abbreviation table for each
> function; you would just read the referenced abbreviations. Because
> the abbreviation table records are of variable length, it is indeed
> true that you cannot make random accesses to the table. So, this
> paragraph is just fleshing out your argument.)
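To make the random-access point concrete, here is a minimal sketch of walking a .debug_abbrev-style table. It assumes the standard DWARF layout (ULEB128 abbrev code, ULEB128 tag, one-byte children flag, then ULEB128 attribute/form pairs ended by a 0,0 pair, with a zero code ending the table). The function names and the SAMPLE_TABLE bytes are made up for illustration; this is not GCC's reader.

```python
def read_uleb128(buf, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def scan_abbrev_table(buf):
    """Walk the table from its first byte; return {abbrev_code: byte_offset}."""
    offsets = {}
    pos = 0
    while True:
        start = pos
        code, pos = read_uleb128(buf, pos)
        if code == 0:              # a zero code terminates the table
            return offsets
        offsets[code] = start
        _tag, pos = read_uleb128(buf, pos)
        pos += 1                   # one-byte has-children flag
        while True:                # attribute specs end with a (0, 0) pair
            name, pos = read_uleb128(buf, pos)
            form, pos = read_uleb128(buf, pos)
            if name == 0 and form == 0:
                break


# Hypothetical three-entry table, hand-assembled for this example.
SAMPLE_TABLE = bytes([
    0x01, 0x11, 0x01, 0x03, 0x08, 0x00, 0x00,              # compile_unit: name
    0x02, 0x2E, 0x00, 0x03, 0x08, 0x11, 0x01, 0x00, 0x00,  # subprogram: name, low_pc
    0x03, 0x34, 0x00, 0x03, 0x08, 0x00, 0x00,              # variable: name
    0x00,                                                  # end of table
])
```

Because each entry has a different length, the offset of abbrev 3 is only known after entries 1 and 2 have been decoded; scan_abbrev_table(SAMPLE_TABLE) has to touch every byte before it can answer.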
> I think the conclusion that you reach (that the size of the tables is
> a problem) depends on how you expect the compiler to process functions
> at link-time. My expectation was that you would form a global
> control-flow graph for the entire program (by reading CFG data encoded
> in each .o file), eliminate unreachable functions, and then
> inline/optimize functions one-at-a-time.
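The "eliminate unreachable functions" step can be sketched as a plain worklist traversal over the merged call graph. This is only an illustration under simplifying assumptions: the graph is a dict of direct calls, the roots are given by the caller, and address-taken functions and indirect calls are ignored. The name reachable_functions is hypothetical.

```python
from collections import deque


def reachable_functions(call_graph, roots):
    """Return the set of functions reachable from roots via direct calls.

    call_graph maps a function name to an iterable of callee names.
    """
    seen = set(roots)
    work = deque(roots)
    while work:
        fn = work.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                work.append(callee)
    return seen


# Toy whole-program graph: "unused" and "helper2" are never called from main.
graph = {
    "main": ["parse", "emit"],
    "parse": ["helper1"],
    "emit": ["helper1"],
    "unused": ["helper2"],
}
live = reachable_functions(graph, ["main"])
```

Only the functions in live would then need their bodies read at link time.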
> If you sort the function-reading so that you prefer to read functions
> from the same object file in order, then I would expect that you would
> considerably reduce the impact of reading the abbreviation tables.
> I'm making the assumption that if f calls N functions, then they
> probably come from < N object files. I have no data to back up that
> assumption.
> (There is nothing that says that you can only have one abbreviation
> table for all functions. You can equally well have one abbreviation
> table per function. In that mode, you trade space (more abbreviation
> tables, and the same abbreviation appearing in multiple tables)
> against the fact that you now only need to read the abbreviation
> tables you need. I'm not claiming this is a good idea.)
> I don't find this particular argument (that the abbreviation tables
> will double file I/O) very convincing. I don't think it's likely that
> the problem we're going to have with LTO is running out of *virtual*
> memory, especially as 64-bit hardware becomes nearly universal. The
> problem is going to be running out of physical memory, and thereby
> paging like crazy, running out of D-cache. So, I'd assume you'd just
> read the tables as-needed, and never bother discarding them. As long as
> there is reasonable locality of reference to abbreviation tables
> (i.e., you can arrange to hit object files in groups), then the cost
> here doesn't seem like it would be very big.
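The locality argument can be tried out with a small simulation. The sketch below counts how often an abbrev table has to be (re)loaded when functions are visited in a given order and only a limited number of tables are kept in memory. The least-recently-used policy, the cache_slots parameter, and the function name are assumptions made for the example, not anything either of us has proposed for the implementation.

```python
from collections import OrderedDict


def count_table_loads(function_order, cache_slots):
    """Count abbrev-table loads for a visiting order.

    function_order is a list of (object_file, function_name) pairs;
    at most cache_slots tables stay resident, evicted least recently used first.
    """
    cache = OrderedDict()
    loads = 0
    for obj, _fn in function_order:
        if obj in cache:
            cache.move_to_end(obj)      # table already resident
            continue
        loads += 1                      # table must be read from the .o file
        cache[obj] = True
        if len(cache) > cache_slots:
            cache.popitem(last=False)   # drop the least recently used table
    return loads


interleaved = [("a.o", "f1"), ("b.o", "g1"), ("a.o", "f2"),
               ("b.o", "g2"), ("c.o", "h1"), ("a.o", "f3")]
grouped = sorted(interleaved)           # visit each object file's functions together
```

With a single resident table, the interleaved order reloads a table on every step while the grouped order loads each table once; with enough slots to keep every table, the order stops mattering, which is the "never discard" case.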
Even if we decide that we are going to process all of the functions in
one file at one time, we still have to have access to the functions that
are going to be inlined into the function being compiled. Getting at
those functions is where the "double the I/O" argument comes from.
I have never depended on the kindness of strangers or the virtues of
virtual memory. I fear the size of the virtual memory when we go to
compile really large programs.
>> 2) I PROMISED TO USE THE DWARF3 STACK MACHINE AND I DID NOT.
> I never imagined you doing this; as per above, I always expected that
> you would use DWARF tags for the expression nodes. I agree that the
> stack-machine is ill-suited.
>> 3) THERE IS NO COMPRESSION IN DWARF3.
>> In 1 file per mode, zlib -9 compression is almost 6:1. In 1 function
>> per mode, zlib -9 compression averages about 3:1.
> In my opinion, if you considered DWARF + zlib to be satisfactory, then
> I think that would be fine. For LTO, we're allowed to do whatever we
> want. I feel the same about your confession that you invented a new
> record form; if DWARF + extensions is a suitable format, that's fine.
> In other words, in principle, using a somewhat non-standard variant of
> DWARF for LTO doesn't seem evil to me -- if that met our needs.
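The gap between per-file and per-function compression is easy to reproduce in miniature. The sketch below compresses a set of synthetic "function bodies" once as a single stream and once function by function, using zlib at level 9. The fake bodies are invented text and say nothing about the real GIMPLE encoding or about the 6:1 and 3:1 figures; the point is only that separately compressed pieces cannot share history with each other.

```python
import zlib


def compression_ratios(chunks, level=9):
    """Return (ratio when compressed as one stream, ratio when compressed per chunk)."""
    whole = b"".join(chunks)
    raw = len(whole)
    packed_whole = len(zlib.compress(whole, level))
    packed_parts = sum(len(zlib.compress(c, level)) for c in chunks)
    return raw / packed_whole, raw / packed_parts


def fake_function_bodies(count=40):
    """Build synthetic, repetitive byte strings standing in for encoded functions."""
    bodies = []
    for i in range(count):
        lines = [
            f"modify_expr var_decl_{i} plus_expr var_decl_{j} integer_cst\n"
            for j in range(20)
        ]
        bodies.append("".join(lines).encode("ascii"))
    return bodies


per_file, per_function = compression_ratios(fake_function_bodies())
```

On this input per_file comes out larger than per_function, because every separately compressed function pays its own stream overhead and starts with an empty dictionary.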
One of the comments that was made by a person on the dwarf committee is
that the abbrev tables really can be used for compression. If you have
information that is really common to a bunch of records, you can build
an abbrev entry with the common info in it.
I have not seen a place where any use can be made of this for encoding
gimple except for a couple of places where I have encoded a true or
false. I therefore do not see that the abbrev tables add anything
except the code needed to read and write them.
>> 2) LOCAL DECLARATIONS
>> Mark was going to do all of the types and all of the declarations.
>> His plan was to use the existing DWARF3 and enhance it where it was
>> necessary eventually replacing the GCC type trees with direct
>> references to the DWARF3 symbol table.
>> The types and global variables are likely OK, or at least Mark
>> should be able to add any missing info.
I had a discussion on chat today with drow and he indicated that you
were busily adding all of the missing stuff here. I told him that I
thought this was fine as long as there is not a temporal drift in
information encoded for the types and decls between the time I write my
stuff and when the types and decls are written.
> Yes, I agree that if you're not using DWARF for the function bodies,
> you probably want your own encoding for the local variables.
>> We will also need to add other structures to the object files. We
>> will need to have a version of the cgraph, in a separate section, that
>> is in a form so that all of the cgraphs from all of the object files
>> can be read and processed without looking at the actual function bodies.
>> function only calls other pure functions and so on... If we simply
>> label the call graph with the locally pure and locally constant
>> attributes, the closure phase can be done for all of the functions in
>> the LTO compilation without having to reprocess their bodies.
>> Virtually all interprocedural optimizations, including aliasing, can
>> and must be structured this way.
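The closure phase over a labeled call graph can be sketched as a small fixpoint. The version below assumes each call graph node already carries a "locally pure" flag computed when the object file was written, and treats a function as pure only if it is locally pure and every callee is pure. It starts optimistically so that mutually recursive pure functions stay pure. The names close_pure, locally_pure, and calls are hypothetical, and a callee with no entry at all (for example an external function) is treated as impure.

```python
def close_pure(locally_pure, calls):
    """Return the set of functions that are pure once callees are considered.

    locally_pure maps function name -> bool (label stored with the call graph).
    calls maps function name -> iterable of callee names.
    """
    pure = {fn for fn, ok in locally_pure.items() if ok}   # optimistic start
    changed = True
    while changed:
        changed = False
        for fn in list(pure):
            if any(callee not in pure for callee in calls.get(fn, ())):
                pure.discard(fn)        # calls something that is not pure
                changed = True
    return pure


labels = {"a": True, "b": True, "c": False, "d": True, "e": True}
edges = {"a": ["b"], "b": ["a"], "d": ["c"], "e": ["d"]}
globally_pure = close_pure(labels, edges)
```

Nothing here looks at a function body: the labels and edges read from the call graph sections are enough, which is the property the paragraph above asks for.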
> You could also label the function declarations. There's a decision to
> make here as to whether the nodes of the call graph are the same as
> the DWARF nodes for the functions themselves, or are instead separate
> entities (which, of course, point to those DWARF nodes). It would be
> nice, a priori, to have this information in the DWARF nodes because it
> would allow the debugger to show this information to users and to view
> it via DWARF readers. However, I can also imagine that it needs to be
> in the separate call graph.
>> I have not done this because I do not rule the earth. That was not
>> what I was assigned to do, and I agreed that DWARF3 sounded like a
>> reasonable way to go. Now that I understand the details of DWARF3, I
>> have changed my mind about the correct direction. Now is the time to
>> make that change before there is a lot of infrastructure built that
>> assumes the DWARF3 encoding.
> I think it's great that you're asking for feedback. My only feedback
> is that you may not need to make this decision *now*. We could
> conceivably wire this up, work on the other things (CFG, etc.) and
> return to the encoding issue. I'm vaguely in favor of that plan, just
> in that I'm eager to actually see us make something work. On the
> other hand, building up DWARF reading for this code only to chuck it
> later does seem wasteful. But, the DWARF reader is already there;
> it's mostly filling in some blanks. But, filling in blanks is always
> harder than one expects. So, I think this should really be your call:
> rework the format now, or later, as you think best.