This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]

Re: Inlining heuristics for C++

To: Fergus Henderson <fjh at cs dot mu dot oz dot au>
Subject: Re: Inlining heuristics for C++
From: Daniel Berlin <dan at cgsoftware dot com>
Date: Mon, 09 Jul 2001 23:18:38 -0400
Cc: Daniel Berlin <dan at cgsoftware dot com>, gcc at gcc dot gnu dot org
References: <87r8vpo8rw.fsf@cgsoftware.com><20010710122244.A14584@hg.cs.mu.oz.au>

Fergus Henderson <fjh@cs.mu.oz.au> writes:

> On 09-Jul-2001, Daniel Berlin <dan@cgsoftware.com> wrote:
>> I've got a very simple heuristic that says "don't inline things <some
>> configurable number currently defaulting to 10> times bigger than we
>> were when we started inlining, into us" I.E. don't inline a 100
>> statement function into a 10 statement one.
>> 
>> This is effectively expressing the rule: "Small functions should be
>> inlined into larger ones.  Larger functions should not be inlined into
>> small ones".
> 
> There is an exception to this.  Functions which are only called once,
> and which are defined in this translation unit and not exported, should
> almost always be inlined (after which the original copy of the function
> is dead code and can be eliminated), even if they are much larger than
> the function into which they are being inlined.

They will be, if some other, larger function, calls that smaller
function.

> 
> For this to work, you need to do parse and analyze the whole translation
> unit before emitting code, to get information about call counts, and
> to enable elimination of dead functions.
Correct.
But this isn't as costly as you think in terms of memory, and trees we
can store to disk if we really had to (precompiled headers, at least
the way redhat did them, does this, in fact)
My memory usage using defer_fn instead of expand_body, in parse.y, and
a few other places, only increased an average of 25%.
It's not the AST that is expensive, and if it was, that's not hard to
fix. 
> 
> (Is compiling whole translation units at a time, rather than compiling
> each function in turn, what you meant by "region-based compilation"?
> 
Almost.
That involves a ton of memory and compile time.
To quote the abstract for tha paper "Region based compilation: An
introduction and motivation"
"
   As the amount of instruction-level parallelism required to fully
   utilize VLIW and superscalar processors increases,
   compilers must perform increasingly more aggressive analysis,
   optimization, parallelization and scheduling on the input
   programs. Traditionally compilers have been built assuming
   functions as the unit of compilation, making compiler
   performance heavily dependent upon the contents of the
   function. The function-based partitioning tends to hide valuable
   optimization opportunities form the compiler making it
   undesirable. Function inlining may be applied to assemble
   inter-procedurally coupled functions into the same compilation unit
   at the cost of very large function bodies. This paper
   introduces a new technique called region-based compilation where
   the compiler is allowed to repartition the program into
   more desirable compilation units. Region-based compilation units
   allow the compiler to control problem size while exposing
   inter-procedural optimization and code motion opportunities.
"
Which is pretty much exactly the problem we've run into.
Except, unlike straight, flat, interprocedural analysis applied to an
entire program, regions give you comparable performance, and with
demand driven region formation, reduced memory usage and reduced
compilation times. And it would be easier to modify gcc to do region
based stuff than interprocedural stuff (because we have a large dependency on single function at a time)

The best part is that papers have been showing solid experimental evidence (Hank, Hwu,
Rau) that regions are often smaller than functions, so we'd probably
actually get *faster* at optimizing.

A good paper on the topic that is pretty easy to read, but has a long
title, is: Region Formation Analysis with Demand-driven Inlining for
Region-based Optimization
http://www.eecis.udel.edu/~way/papers/pact2000.ps.gz


-- 
"I invented the cordless extension cord.
"-Steven Wright

References:
- Inlining heuristics for C++
  - From: Daniel Berlin
- Re: Inlining heuristics for C++
  - From: Fergus Henderson

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]