This is the mail archive of the
`gcc-patches@gcc.gnu.org`
mailing list for the GCC project.

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |

*To*: Nathan Sidwell <nathan at codesourcery dot com>*Subject*: Re: [PATCH] New c++ inliner*From*: Nathan Sidwell <nathan at acm dot org>*Date*: Mon, 16 Jul 2001 15:41:04 +0100*CC*: gcc-patches at gcc dot gnu dot org, Gerald Pfeifer <pfeifer at dbai dot tuwien dot ac dot at>, mark at codesourcery dot com, Daniel Berlin <dan at cgsoftware dot com>, Jason Merrill <jason_merrill at redhat dot com>*References*: <3B4F1363.4BB4F5C5@codesourcery.com>*Reply-To*: nathan at compsci dot bristol dot ac dot uk

Hi, here are the numbers I gathered from Gerald's program. I'm attaching a tar gzipped set of postscript graphs. These are original-N.ps and new-N.ps for N = 0..5 original == gcc 3.1 new == gcc-3.1 + inliner patch N == 0 is executable size N == 1..4 execution time All code was compiled with -O2 -finline-functions -static. The environment was a 600MHz P3 with 192MB memory running 6.2 Redhat Gnu/linux. I did not recompile the libraries when varying the inlining parameters. For the original graphs I varied the -finline-limit parameter from 64 to 8192. I also established a no-inlining case with -finline-limit=0 For the new graphs I varied both --param max-inline-ast over the set [0, 3, 6, 12, 25, 50, 100, 200, 400] and --param arg-inline-ast over the set [0, 1, 2, 4, 8, 16, 32, 64]. The no inlining case is with both those at zero. All graphs are normalized against the no-inlining case. All graphs are logarithmic along X for the 2D graphs and X&Y for the 3D graphs. I mapped zero onto -1. The scaling between -finline-limit and --param max-inline-ast is 10 to 1 (i.e -finline-limit=10 is equivalent to --param max-inline-ast=1). All the 2D graphs have a straight line from (-1,1) to (6,y). That's because I did not measure any other settings in that range. The curves look stable enough to see that nothing interesting occurs there. The execution times are the times for various input data, and different code paths are exercised for each test. It is a DLV is a computational logic/logic programming system and more information is available at http://www.dbai.tuwien.ac.at/proj/dlv/. (The source is not freely available, though.) I'm sure Gerald will answer questions about what each problem each one is solving. You'll notice that original-0.ps (the original size graph), has a dip at about 2^8. Beyond that the size increases again. The execution speeds show a minimum reached at about 2^10, in some cases getting worse with higher numbers, and not in others. The new graphs show a size dip at about (2^5, 2^4), however there is a reasonably flat bottomed valley forming an arc around the origin. The execution times all show a flat plateau with a circular peak centered around the origin. comparing absolute size & speed, here are the numbers O2, no-inline O2 O2 inline-functions gcc 2.95 3890 3544 3530 (size in K) 9.77 4.99 5.20 stratcomp1-all 100 19.4 19.4 primeimpl2 gcc 3.1 4500 5048 6786, (size in K) 72.2 55.3 54.7 stratcomp1-all 214 23.4 27.7 primeimpl2 3.1+patch 4495 3509 (size in K, best) 75.9 55.0 stratcomp1-all, best 218 18.3 primeimpl2, best you'll notice that 3.1 is significantly slower that 2.95, I don't know if this is caused by the difference in the stl's or not (or anything like IO sync degradation) The execution speeds with the patch are not different to that without it, but the size is better. Mark Mitchell wrote: > > 3) fully optimize each of those callees. This includes the > > step of inlining into the body of the callee. There is a new > > flag for FUNCTION_DECLS indicating whether it has already under > > gone this optimization step. > > I am concerned about this. In particular, the reason we did top-down > inlining was that if you have A calls B calls C calls D and all > are inline, then you get four copies of D, three of C, two of B, i.e., > a quadratic-ish explosion in copying. By not inlining into B (unless it > is output directly) we eliminate this problem. I understand your > response to this issue, but I'm still concerned. Ok, I've thought of a way round this, Rather than complete step3, we can generate the inlinable call-site list, optimize those callees (to the same level as I'm about to describe), then estimate what the expanded size would be by summing the sizes of the inlinable functions. When we do the actual inlining, we copy the callee + its inlining plan, then continue much as the current inliner does. This will give us bottom-up inlining, without the memory expansion you're worried about. Our estimate of the expanded size will come adrift if inlining introduces eliminable unreachable code. You might well be right about the DECL_NUM_STMT's estimate, though if we have unreachable code elimination, constant propagation and a constant parameter, our estimate might start getting inaccurate. Jason Merrill wrote: > Thoughts: Yup, all things which should be on the todo list. Anyway, can I put this one in? or would you prefer a version with the above described expanding first? nathan -- Dr Nathan Sidwell :: Computer Science Department :: Bristol University Never hand someone a gun unless you are sure where they will point it nathan@acm.org http://www.cs.bris.ac.uk/~nathan/ nathan@cs.bris.ac.uk

**References**:**[PATCH] New c++ inliner***From:*Nathan Sidwell

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |