Code Bloat g++

Per Bothner
Sun Feb 20 11:35:00 GMT 2000

"Martin v. Loewis" writes:

> I don't know what kind of comments you were expecting. You said
> # It seems like if we are going to do pre-compiled headers
> and I thought "so perhaps not this year".

Didn't Zack recently announce he had been hired (by Cygnus) to work on
pre-compiled headers?  If so, I think it makes sense to at least
think about the debug-symbol issue at the same time.

> On your #pragma stuff, I think there is a number of problems with
> it. First, the idea of compilation repositories does not really work
> in my experience. I quite like the 'filter' mode of the compiler: one
> input file, one output file. Any additional outputs will produce a lot
> of pain, starting with getting the Makefiles right.

Well, any time you do pre-compiled headers, you need to do some magic
to actually pre-compile the headers.  So whatever program/script is
used to generate pre-compiled headers can *at the same time* compile
the debug information.  I'm not saying we need to use the same
*mechanism* to solve the two problems of speeding up compilation and
reducing excess debug info size;  however, the two should be done
at the same time, using a single command.

> Then, I think the problem with extensive debugging information is
> partially a misperception on the user side, and the rest has little to
> do with header files. *If* there is really a lot of overhead in debug
> information, it comes from the .cc files, not from the .h files.

That is not what I am hearing from Joe and others.  A large part of
the problem is that the *same* (or similar) debug information from
one header file is repeated in many .o files.

> Inline functions may indeed contribute to the large debugging
> information. However, this is not due to duplicate out-of-line
> instantiations across translation units,

That does cause wasteful duplication of both code and debugging
information.  I don't have a feel for how significant the size
of out-of-line instantiations is, but it is worth avoiding
duplicates.  If the user is to be able to call inline functions
from the debugger, then the compiler needs to generate at least
one out-of-line copy, at least with current gdb technology.  This
becomes expensive if we get many out-of-line copies;  my idea
makes it easy to generate just a single copy, at least for
non-template functions.  For template inline functions, we can
combine my idea with whatever model we are using to avoid
duplicate copies for template functions.
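
To illustrate the out-of-line copy point, here is a minimal sketch.  The
names `square`, `direct_call` and `out_of_line_square` are hypothetical
and do not come from any real code base; the sketch only assumes a
header-style inline function that many .cc files would include.

```cpp
// Hypothetical header-style inline function; in a real project this
// would live in a .h file included by many .cc files.
inline int square(int x) { return x * x; }

// A direct call may be expanded in place by the compiler, in which case
// no separately callable copy of square() is needed for this use.
int direct_call(int v) { return square(v); }

// Taking the address of the function typically forces the compiler to
// emit an out-of-line copy.  A debugger that wants to call square()
// needs such a copy as well.
using IntFn = int (*)(int);
IntFn out_of_line_square() { return &square; }
```

If every translation unit that includes the header emits its own
out-of-line copy, each .o file carries the code and its debug
information again, which is the duplication worth avoiding.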

> but because of duplicate inline copies. You cannot share debug
> information for the inline copies, since they have different
> register and stack usage every time they are inlined.

Yes, my idea won't help there.

> Pretty much the same holds for template functions. Debugging
> information is generated for instantiations, not for the template
> themselves.

I was more speaking to the issue of debugging information for the data
structure definitions, specifically template class definitions.  These
should be emitted as debug information at the template level, not
for each instantiated class.  Perhaps you're right in believing that the issue
of debug information for classes is minor compared to that of debug
info for inline expansions.  Maybe the people who are having problems
with debug info size should try to estimate what kind of debug info
is causing most problems.
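
As a rough illustration of the template class case, consider the sketch
below.  The `Pair` template and the alias names are hypothetical, and
`std::is_same` assumes a C++11 or later compiler.  Each instantiation is
a distinct type, so debug information emitted per instantiated class
describes essentially the same layout once for every type argument.

```cpp
#include <type_traits>

// Hypothetical class template, defined once in a header.
template <typename T>
struct Pair {
    T first;
    T second;
};

// Two instantiations of the same template.
using IntPair = Pair<int>;
using DoublePair = Pair<double>;

// The instantiations are unrelated types as far as the compiler is
// concerned, so each one gets its own class description.
constexpr bool distinct_types = !std::is_same<IntPair, DoublePair>::value;

// A small function that uses one of the instantiations.
inline double sum(const DoublePair &p) { return p.first + p.second; }
```

Describing `Pair` once at the template level, instead of once per
instantiation, is the saving being suggested here.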

> Also, you didn't indicate how you would implement your approach -

If the people who are hurting from large debug size care enough
about the problem, they should either do it or pay someone to do it.

> which, I think, is more difficult than writing it down.

Implementation may take more time than coming up with a good design
(which I don't claim my idea is), but the latter is more difficult in
that it takes people with more skill and experience.  (However, with
Gcc we're in the situation that the compiler is sufficiently
complicated that even implementing an existing design requires relatively
experienced people, which is one reason we don't have as much delegation
as we should.)
	--Per Bothner
