This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Sigh. Inlining heuristics.


On Tue, Jul 10, 2001 at 01:17:35PM -0700, Mark Mitchell wrote:
> >> I suggest we drop it to 100 or 1000 instructions by default, and see
> >> how that does.
> >
> > How about defaulting to 10^{1+n} for -On?
> 
> Sure.  It seems like we all agree the current number is too big; picking
> a better one is a matter of experimenting a little bit.  Your scheme
> seems as good a guess as any, to me.  My *guess* it will still actually
> pessimize code at -O3, though, just by blowing up the size of the
> program, but I don't really have good data.  We need to do some 
> measurements.

Is it known whether or not we are inside a loop at the moment this
decision is made by the compiler?  If so, then I'd like to see a different
number of instructions as limit for loops and non-loops.

Example:

void g(void)
{
  h();		// Don't inline because h > 20 instructions.
}

void f(void)
{
  g();		// Don't inline because g > 20 instructions.
  for(int i = 0; i < 2; ++i)
    g();	// Inline because g < 200 instruction.  Also recursively inline h() because h < 200 instructions.
}


++complexity;

Hmm, indeed this shows that the approach of HP is needed...
The size of 'g' depends on whether or not h() is inlined in g.

Hence, we'd need to do this in f():

1) note we are in a loop (set limit to 200 instructions).
2) is h < 200?  Yes.
   3) Consider G = g + h.
      is G < 200?  Yes : inline everything.
                    No : is g < 200?  Yes: inline g, but not h.
		                       No: call G inside f.

So I should have said:

void f(void)
{
  g();		// Don't inline because g > 20 instructions.
  for(int i = 0; i < 2; ++i)
    g();	// Inline because g+h < 200 instruction.  Also recursively inline h() because h < 200 instructions.
}

OR

void f(void)
{
  g();		// Don't inline because g > 20 instructions.
  for(int i = 0; i < 2; ++i)
    g();	// Inline g because g < 200, but not h because g+h > 200.
}

[ Note that this is better than calling G: there is a chance that a conditional
  reduces the number of calls to h ].


++complexity;

However, the *number* of loops is important too.
When h < 200, g < 200 and h+g = G > 200 but h is called from within a loop inside g
the picture would change:

void g(void)
{
  for(int i = 0; i < 100; ++i)
    h();		// Inline because h < 200 instructions.
}

void f(void)
{
  g();		// Don't inline because g > 20 instructions.
  for(int i = 0; i < 2; ++i)
    g();	// Call G because G > 200.  Don't split up G because h is called from
                // within an additional loop inside g.
}


[ again: ==> is a call, and --> is an inlined 'call' ]
Ignoring recursive function calls, consider a chain of calls:

 a ==> b ==> c ==> d ==> e ==> f

Each function has a size of its own that we also call a, b, c, d, e, f.
Each call is locally inside a loop or not.  Lets assume three loops here:

 a ==> b =L=> c =L=> d ==> e =L=> f

Then inline decisions have to be made right to left based on the
actual inlined code size, where the size that is allowed is dependend
on the number of loops (and optimization level if you want).

For example, the additional loop  e =L=> f  makes it likely we
want to inline f into e rather than e into d and NOT f into e:

  ...> d ==> e -L-> f   is better than
  ...> d --> e =L=> f

[ while

  ...> d --> e ==> f   is better than
  ...> d ==> e --> f

  assuming that d --> e --> f, is not considered because e+f > 200.
]

So, if each function would be 150, we'd end up with:

 a --> b =L=> c -L-> d ==> e -L-> f
                                 150
                          300-------  <-- size of E = e+f
                    150
             300-------
      150
300------

which is of "order" 2L (right most call is two
loops deep).

OR

 a ==> b -L-> c =L=> d ==> e -L-> f
                                 150
                          300-------
                    150
             150
      300-------
150

which is "order" 2L too, but has two calls two loops
deep instead of 1.  Hence, the first is better.


++complexity;

Now include recursive calls :).

-- 
Carlo Wood <carlo@alinoe.com>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]