This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Code Bloat g++


Joe Buck <jbuck@possibly.synopsys.com> writes:

> There are a number of sources of redundancy.  The basic issue is that many
> .o files include the same classes and thus have redundant debug
> information.  Yes, the linker can eliminate duplicates, but all those .o
> files have to be generated, processed by the linker, and transferred
> across the network.

It seems like if we are going to do pre-compiled headers, it might
make sense to try to do it in a way that solves the debug problem as
well.  I.e. actually compile each header file foo.h to a foo.o file.  That
foo.o file contains the debug information for foo.h.  No other .o needs
to contain debug information for foo.h;  just generate a symbol
such that foo.o gets pulled in.
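
A minimal sketch of the "pull in foo.o" trick, collapsed into one file
so it can be compiled on its own.  In a real build the two halves would
live in different object files.  The names debug_anchor_foo_h and
foo_h_reference are invented for this illustration, and an ordinary
data symbol is assumed here rather than anything the compiler emits
today.

```cpp
// --- Half 1: what would live in foo.o, compiled once from foo.h. ---
// Hypothetical anchor symbol; foo.o would also carry the debug
// information for everything declared in foo.h.
char debug_anchor_foo_h = 0;

// --- Half 2: what every other .o that includes foo.h would carry. ---
// Only a declaration plus a reference; no debug information for foo.h.
extern char debug_anchor_foo_h;

// Taking the address creates a reference to the anchor.  In a real
// multi-file build this would be an undefined symbol in the client
// object, which the linker resolves by linking in foo.o.
const char* const foo_h_reference = &debug_anchor_foo_h;

// Helper for this sketch only: reports that the reference resolved
// to the anchor definition.
bool anchor_is_linked() {
    return foo_h_reference == &debug_anchor_foo_h;
}
```

In this single-file form the reference resolves trivially; the point is
only that the client side needs one symbol reference per header rather
than a full copy of the debug information.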

> One way to think of how to handle this is to use similar techniques that
> are used to avoid redundant virtual function tables: generate the detailed
> debug information for a class only in the .o file that implements the
> first non-inlined, virtual method.  g++ already does some tricks like
> that.  But that doesn't help with STL debug symbols since those classes
> are non-virtual.

The solution: resurrect the old #pragma interface / #pragma implementation
convention.  The problem:  You don't want to force people to edit tons
of header files, especially in files they get from third parties.
The solution:  A utility/compiler mode that does it for you.

You add a compiler mode that means "compile this .h file in
#pragma-implementation-mode, but all other header files in
#pragma-interface-mode".  This can be part of the process of
"pre-compiling header files", but need not be.  You set up a convention
or database so that normal compiles can figure out which header files
should be compiled in "implicit-#pragma-interface"-mode.

So compiling a large project is done in two phases:  First you compile
the header files in #pragma-implementation-mode.  That populates some
database (which can be as simple as a conventional sub-directory).
The second phase compiles the .c files, with a flag that tells
the compiler to check the database(s) from the first phase.  A header
file is compiled in "implicit-#pragma-interface"-mode iff it is
in the database.  In that case, no debug information or inline
expansions are generated for that header file.  Instead, the
compiler generates an external reference (to an absolute symbol)
that will cause the linker to link in the .o file compiled in the
first phase.
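
The phase-two decision can be sketched as a small model.  This is not
compiler code; HeaderDatabase, HeaderPlan, anchor_symbol_for and
plan_translation_unit are hypothetical names made up for the sketch,
and the "database" is assumed to be just a set of header names.

```cpp
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Hypothetical phase-one "database": headers that already have a .o file.
using HeaderDatabase = std::set<std::string>;

// What phase two decides to do for one included header.
struct HeaderPlan {
    std::string header;
    bool emit_debug_info;       // true: emit debug info and inlines as usual
    std::string anchor_symbol;  // non-empty: external reference to emit instead
};

// Invented naming scheme: map "foo.h" to "hdr_anchor_foo_h".
std::string anchor_symbol_for(const std::string& header) {
    std::string symbol = "hdr_anchor_";
    for (char c : header) {
        symbol += std::isalnum(static_cast<unsigned char>(c)) ? c : '_';
    }
    return symbol;
}

// A header is treated in "implicit-#pragma-interface"-mode iff it is in
// the database; every other header is handled as it is today.
std::vector<HeaderPlan> plan_translation_unit(
        const std::vector<std::string>& included_headers,
        const HeaderDatabase& database) {
    std::vector<HeaderPlan> plans;
    for (const std::string& header : included_headers) {
        if (database.count(header) != 0) {
            plans.push_back({header, false, anchor_symbol_for(header)});
        } else {
            plans.push_back({header, true, ""});
        }
    }
    return plans;
}
```

With a database containing only foo.h, a file that includes foo.h and
bar.h would get an external reference for foo.h and ordinary debug
information for bar.h.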

Voila!  No duplicate debug symbols or inlines.

I don't really know the current preferred model for template
compilation and debugging.  My guess is that debug information
is generated for a template *instantiation*.  That makes it
harder to get the full payoff from my proposal.  The "correct"
solution is that the debug information should describe the
*un-instantiated* templates, and the debugger should expand
the templates.  This may be a lot of work, involving extending
the debug format, and adding a template expander to gdb.  (The
latter doesn't have to be 100% correct, though that would be
nice - gdb already has a number of "95% correct" features.)

Note that this model is quite compatible with using incremental
compilation.  E.g., you can write Makefiles so only the changed files
are re-compiled in each phase.
-- 
	--Per Bothner
per@bothner.com   http://www.bothner.com/~per/
