This is the mail archive of the
mailing list for the GCC project.
Re: reduce compilation times?
Brian Dessent wrote:
Tom St Denis wrote:
Yeah, except putting all your functions in one file goes against the
very nature of proper software development strategies. First off, you
should be running a profiler anyway if performance is important. If
you're not, then you're not very well educated in the field of work.
What you really should do, is profile your code, then create "static
inline" or macro copies of heavily used (and not overly large) pieces of
code. And even then, inlining code doesn't always help.
You don't have to go to the trouble of inlining things manually, the
compiler can do a much better job of estimating whether that's
advantageous or not. Just mark functions that are not for export as
static and the compiler will now have a large range of optimizations
that it can automatically perform, including but not limited to inlining
them. This is a case where having support/helper functions in the same
.c file as the exportable functions that use them makes a great deal of
sense. The key word in the original statement was exportable:
That aside, the profiler will tell you where time is spent. Yes, giving
the compiler the option to inline or not is "ideal" but putting 100K of
lines in a single file is not.
Often the savings, especially on desktop/server class processors, from
the minutiae of optimizations possible at that level do not outweigh the
cost to the development process.
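As a minimal sketch of the arrangement discussed above (a helper marked static sitting in the same .c file as the one exportable function that uses it), something like the following would do. The names clamp and sum_clamped are hypothetical and not taken from any real library; whether the compiler actually inlines the helper depends on the optimization level and its own heuristics.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not for export: 'static' gives it internal linkage,
 * so the compiler knows every caller lives in this file and is free to
 * inline it (or not) as it sees fit. */
static int clamp(int value, int lo, int hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

/* The single exportable function of this unit (hypothetical name).
 * Sums the input values after clamping each one to [lo, hi]. */
long sum_clamped(const int *values, size_t count, int lo, int hi)
{
    long total = 0;

    assert(values != NULL || count == 0);
    assert(lo <= hi);

    for (size_t i = 0; i < count; i++)
        total += clamp(values[i], lo, hi);

    return total;
}
```

Only sum_clamped is visible to other units, so the file still has one exportable function while leaving the inlining decision to the compiler.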
This is why you should re-factor your code so as to contain only one [or as
few as possible] exportable functions per unit.
In general the compiler can do the best job when it can see everything
at once, which is why currently so much work is being poured into
developing the LTO branch, which will allow the compiler to do certain
optimizations as if the entire program was a single compilation unit
even though it was compiled separately.
For example, in my math library the modexp function calls external
mul, sqr, and mod functions (well, Montgomery reduction, but you get the
point). So even though they're not inlined (well, they're big so they
wouldn't be anyway) and you have the overhead of a call, the performance
is still 99% dominated by what happens inside the calls, not by the call
itself. In my case, my multipliers are fully unrolled/inlined since
that's where the performance is to be had. So it was worth the
readability cost (well they're machine generated anyways) for it.
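To illustrate the shape of that call structure, here is a rough sketch of a modexp built on separate mul, sqr, and mod routines. It is not the library's code: the names mul_mod, sqr_mod, and mod_reduce are hypothetical, it works on 32-bit words rather than big integers, and it uses a plain % reduction as a stand-in for Montgomery reduction. In a real build the three helpers would live in their own units.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the external mod, mul, and sqr routines.
 * They are deliberately not static, mimicking functions from other units. */
uint32_t mod_reduce(uint64_t x, uint32_t m)
{
    return (uint32_t)(x % m);
}

uint32_t mul_mod(uint32_t a, uint32_t b, uint32_t m)
{
    return mod_reduce((uint64_t)a * b, m);
}

uint32_t sqr_mod(uint32_t a, uint32_t m)
{
    return mod_reduce((uint64_t)a * a, m);
}

/* Right-to-left binary exponentiation: one sqr per exponent bit and one
 * mul per set bit. All the real work happens inside the calls. */
uint32_t modexp(uint32_t base, uint32_t exp, uint32_t m)
{
    uint32_t result;

    assert(m != 0);

    result = mod_reduce(1, m);      /* yields 0 when m == 1 */
    base = mod_reduce(base, m);

    while (exp != 0) {
        if (exp & 1u)
            result = mul_mod(result, base, m);
        base = sqr_mod(base, m);
        exp >>= 1;
    }
    return result;
}
```

With big-integer operands each mul_mod or sqr_mod call does far more work than the call itself costs, which is the point being made above.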
I question the sanity of an LTO step (if indeed that means it
re-organizes the object code at link time). It'll make debugging harder
when supposedly non-inlined code gets inlined, or other nasties (e.g.
picking up a constant from another module then removing dead code, etc).
I think most people would prefer their object files to be representative
of the compiler input.