This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: C++ inlining heuristics changed (3.0.1)


On Fri, Aug 24, 2001 at 12:49:10PM +0200, Gerald Pfeifer wrote:
> As promised, updated with data for your second patch.
> 
> First the build benchmarks, then run-time benchmarks:
[...]
> Summary: Still worse than 2.95.3 in most cases, but significantly better
> than 3.0.1 (which in turn is quite a bit worse when compared to 3.0). The
> second patch is an improvement over the first one.

OK, here is v3 of the patch, with only slight adjustments:
The linear function that limits recursive inlining now has only half the
slope (-1/32). And when accounting, we correctly subtract 1 because we
save the call tree statement, so single-statement inlines come at no cost
in the recursive inlining accounting. (As far as I could see, we save just
one tree statement, but I might have overlooked something.)

This should not change much, except that it should yield better behaviour
for some cases that were throttled a bit too much when inlining recursively.
(I'd guess this happens for the expression template stuff.)
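
For readers who want to try the numbers without building GCC, here is a
stand-alone sketch of the size checks as the patch performs them. It is only
a model: the function names and the parameter passing are made up for
illustration, the other tests in inlinable_function_p (varargs, templates,
DECL_UNINLINABLE and so on) are left out, and the example values of 600 for
the limit and 10 for INSNS_PER_STMT are assumptions, not something this
mail states.

```c
#include <assert.h>

/* Hypothetical model of the size checks in the patched
   inlinable_function_p.  Returns 1 if the size rules alone would
   still allow inlining a callee of fn_stmts statements after
   inlined_stmts statements have already been inlined.  */
static int
size_allows_inline (int fn_stmts, int inlined_stmts,
                    int max_inline_insns, int insns_per_stmt)
{
  int max_inline_single, max_inline_recursive, total;
  const int min_inline = 13;   /* same constant as in the patch */

  assert (insns_per_stmt > 0);
  max_inline_single = max_inline_insns / (2 * insns_per_stmt);
  max_inline_recursive = max_inline_insns / insns_per_stmt;
  total = fn_stmts + inlined_stmts;

  /* A single function may use at most half of the limit.  */
  if (fn_stmts > max_inline_single)
    return 0;
  /* Hard stop far beyond the recursive limit.  */
  if (total > max_inline_recursive * 128)
    return 0;
  /* Beyond the recursive limit the acceptable size shrinks with a
     slope of -1/32; callees of at most min_inline statements are
     exempt from this throttle.  */
  if (total > max_inline_recursive && fn_stmts > min_inline)
    {
      int max_curr = max_inline_single
                     - (total - max_inline_recursive) / 32;
      if (fn_stmts > max_curr)
        return 0;
    }
  return 1;
}

/* Hypothetical model of the accounting: one statement (the call)
   is saved per inline, so a one-statement callee costs nothing.  */
static int
account_inline (int inlined_stmts, int fn_stmts)
{
  return inlined_stmts + fn_stmts - 1;
}
```

With the assumed values (600, 10), a 30-statement callee is still accepted
when 31 statements have been inlined before, but rejected once 62 have been,
while a 13-statement callee keeps getting through until the hard stop.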

I would also be interested in how this patch performs when changing
-finline-limit. In my tests, performance starts to drop significantly
somewhere between 400 and 350. It might be that for some of your tests,
1000 or 1500 yields better results than the default.
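
To see what a given -finline-limit value means under the patched rules, the
thresholds can be derived with a few lines of C. This is only a sketch: the
struct and function names are hypothetical, and the value 10 used for
INSNS_PER_STMT in the examples is an assumption. It computes the thresholds
only and says nothing about the resulting performance.

```c
#include <assert.h>

/* Hypothetical helper type: statement-count thresholds implied by a
   given -finline-limit under the patched rules.  */
struct inline_thresholds
{
  int max_single;     /* largest callee that may be inlined at all */
  int max_recursive;  /* running total at which throttling begins */
  int floor_total;    /* running total at which the sliding cap has
                         dropped to min_inline (13) */
  int hard_cap;       /* running total beyond which nothing is inlined */
};

static struct inline_thresholds
derive_thresholds (int inline_limit, int insns_per_stmt)
{
  const int min_inline = 13;   /* same constant as in the patch */
  struct inline_thresholds t;

  assert (insns_per_stmt > 0);
  t.max_single = inline_limit / (2 * insns_per_stmt);
  t.max_recursive = inline_limit / insns_per_stmt;
  /* The cap falls by one statement per 32 statements of excess.  If
     max_single is already at or below min_inline, the throttle never
     rejects anything that the single-function check admitted.  */
  t.floor_total = t.max_single > min_inline
                  ? t.max_recursive + 32 * (t.max_single - min_inline)
                  : t.max_recursive;
  t.hard_cap = t.max_recursive * 128;
  return t;
}
```

Under that assumption, a limit of 400 gives a single-function cap of 20
statements and 350 gives 17, whereas 1500 gives 75.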

The documentation is now updated in the .texi file, as it should be.

> I'm still waiting for approval to commit the first patch (which doesn't
> require the paperwork) on both branches.

Has that happened in the meantime?

Patch v3 attached and available via
http://www.garloff.de/kurt/freesoft/gcc/

Regards,
-- 
Kurt Garloff                   <kurt@garloff.de>         [Eindhoven, NL]
Physics: Plasma simulations  <K.Garloff@Phys.TUE.NL>  [TU Eindhoven, NL]
Linux: SCSI, Security          <garloff@suse.de>    [SuSE Nuernberg, DE]
 (See mail header or public key servers for PGP2 and GPG public keys.)
--- gcc/cp/ChangeLog.orig	Mon Aug 20 20:48:06 2001
+++ gcc/cp/ChangeLog	Wed Aug 29 23:26:21 2001
@@ -1,3 +1,10 @@
+2001-08-29  Kurt Garloff  <garloff@suse.de>
+
+	* optimize.c (inlinable_function_p): Change heuristics of inlining:
+	Rather than allow one single function to exhaust the limit,
+	allow only half way. Afterwards don't cut abruptly, but get
+	more and more restrictive until a minimum size.
+
 2001-08-19  Release Manager
 
 	* GCC 3.0.1 Released.
--- gcc/cp/optimize.c.orig	Thu Jun  7 00:50:22 2001
+++ gcc/cp/optimize.c	Wed Aug 29 16:25:41 2001
@@ -621,7 +621,23 @@
      inline_data *id;
 {
   int inlinable;
-
+  /* garloff@suse.de, 2001-08-22: The C++ inline throttling has
+   * bad side effects, sometimes, in our top-down inlining: we
+   * may end up inlining the trunk and not the leaves of the call tree,
+   * because we inlined too much before. Poor performance is the result.
+   * The real solution is bottom-up inlining; here we just use a better
+   * heuristic: don't cut off inlining completely, but drop it off
+   * slowly. The further we are beyond the max limit the smaller the
+   * function needs to be to still get inlined.
+   */
+  int max_inline_single    = MAX_INLINE_INSNS/(2*INSNS_PER_STMT);
+  int max_inline_recursive = MAX_INLINE_INSNS/INSNS_PER_STMT;
+  /* Functions that small can always be inlined:
+   * We have to trade arg saving and the call and ret insns against 
+   * the function length itself, here. 13 has been found by experiments.
+   */
+  int min_inline = 13;
+  
   /* If we've already decided this function shouldn't be inlined,
      there's no need to check again.  */
   if (DECL_UNINLINABLE (fn))
@@ -641,7 +657,7 @@
   else if (varargs_function_p (fn))
     ;
   /* We can't inline functions that are too big.  */
-  else if (DECL_NUM_STMTS (fn) * INSNS_PER_STMT > MAX_INLINE_INSNS)
+  else if (DECL_NUM_STMTS (fn) > max_inline_single)
     ;
   /* All is well.  We can inline this function.  Traditionally, GCC
      has refused to inline functions using alloca, or functions whose
@@ -655,11 +671,21 @@
 
   /* Even if this function is not itself too big to inline, it might
      be that we've done so much inlining already that we don't want to
-     risk inlining any more.  */
-  if ((DECL_NUM_STMTS (fn) + id->inlined_stmts) * INSNS_PER_STMT 
-      > MAX_INLINE_INSNS)
+     risk inlining any more. 
+     Only if it's smaller than min_inline, we still go ahead, unless
+     we're really far beyond the recursive inlining limit. */
+  if (DECL_NUM_STMTS (fn) + id->inlined_stmts > max_inline_recursive * 128)
     inlinable = 0;
-
+  else if ( DECL_NUM_STMTS (fn) + id->inlined_stmts > max_inline_recursive
+	    && DECL_NUM_STMTS (fn) > min_inline ) {
+    /* Use a linear function with a slope of -0.03125
+     * we could also use an int approx. of sqrt or similar things */
+    signed int max_curr = max_inline_single
+	- ( DECL_NUM_STMTS (fn) + id->inlined_stmts
+	    - max_inline_recursive ) / 32;
+    if ((signed int)(DECL_NUM_STMTS (fn)) > max_curr)
+      inlinable = 0;
+  }
   /* We can inline a template instantiation only if it's fully
      instantiated.  */
   if (inlinable
@@ -899,7 +925,8 @@
 
   /* Our function now has more statements than it did before.  */
   DECL_NUM_STMTS (VARRAY_TREE (id->fns, 0)) += DECL_NUM_STMTS (fn);
-  id->inlined_stmts += DECL_NUM_STMTS (fn);
+  /* Subtract one to account for saved space for arg setup and call/ret */
+  id->inlined_stmts += DECL_NUM_STMTS (fn) - 1;
 
   /* Recurse into the body of the just inlined function.  */
   expand_calls_inline (inlined_body, id);
--- gcc/cp/NEWS.orig	Wed Aug 22 20:37:43 2001
+++ gcc/cp/NEWS	Wed Aug 22 20:37:14 2001
@@ -1,3 +1,13 @@
+*** Changes in GCC 3.0.2:
+
+* The meaning of -finline-limit changed for C++: We allow single
+  functions up to half the size. Once we have exhausted the size by
+  recursive inlining, we slowly decrease the acceptable size for
+  inlining until we're at 8 times the number. Then only
+  very small functions are still inlined.
+  This results in much improved C++ runtime performance (with about the
+  same compile time).
+
 *** Changes in GCC 3.0.1:
 
 * -fhonor-std and -fno-honor-std have been removed. -fno-honor-std was
--- gcc/doc/invoke.texi.orig	Mon Aug 20 20:48:10 2001
+++ gcc/doc/invoke.texi	Wed Aug 29 14:22:30 2001
@@ -3288,6 +3288,15 @@
 means slower programs).  This option is particularly useful for programs that
 use inlining heavily such as those based on recursive templates with C++.
 
+Note that for C++, the rules are slightly more complicated. The
+maximum size of a function that may be inlined is half of @var{n}. 
+For recursive inlining, we start to decrease the acceptable size for
+inlining once we have inlined @var{n} pseudo instructions until we reach
+many times @var{n}. Then only very small functions may still be inlined.
+Experimenting with this parameter may be useful to improve
+runtime performance (e.g. @var{n}=1500) or to decrease the size of the
+executable (e.g. @var{n}=260).
+
 @emph{Note:} pseudo instruction represents, in this particular context, an
 abstract measurement of function's size.  In no way, it represents a count
 of assembly instructions and as such its exact meaning might change from one
