This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] [4.0] Fix performance regressions due to inlining
- From: Mark Mitchell <mark at codesourcery dot com>
- To: Richard Guenther <rguenth at tat dot physik dot uni-tuebingen dot de>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Fri, 04 Mar 2005 08:39:00 -0800
- Subject: Re: [PATCH] [4.0] Fix performance regressions due to inlining
- Organization: CodeSourcery, LLC
- References: <Pine.LNX.4.44.0503041201530.2297-500000@alwazn.tat.physik.uni-tuebingen.de>
Richard Guenther wrote:
> patch one: ignore stores to DECL_IGNORED variables in estimate_num_insns
> patch two: adjust inlining limits to 10% below the 3.4 values
> patch three: make cost of CALL_EXPR depend on size of arguments
> patch four: do not allow shrinking of unit size due to inlining
>
> #1 is patches one and two
> #2 is #1 + patch three
> #3 is #1 + patch four
>
> How do I need to proceed to get #2 considered for 4.0? Can someone
> throw SPEC at that combination? Does anyone have stuff like PR8361
> that includes _run_-time testing?

I don't understand patch one. (Not "#1", "patch one".) Why don't
stores to DECL_IGNORED variables matter? I understand why this helps
certain codes, but it seems like a hack, rather than a conceptually
sound change. I understand that you're trying to avoid counting nodes
that won't actually result in code generation, but I don't think this
approach is sound. Do you really want to be checking that the location
being written is a RESULT_DECL? (I'm not 100% convinced that would be
right, either, but it would make more sense to me, in that at least we
know it's the return value, and so we may be able to avoid the copy.)
Or, do you want to ignore all TARGET_EXPR initializations, on the
grounds that typically the initialization of the TARGET_EXPR is really
initializing the object to which the TARGET_EXPR is assigned?
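
The three alternatives raised above (skip stores to DECL_IGNORED variables, skip only stores to a RESULT_DECL, or skip TARGET_EXPR initializations) can be compared on a toy model. The sketch below is not GCC code: the Store class, its field names, the destination labels, and the cost of one unit per store are all assumptions made for illustration. It only shows that the three rules exempt different sets of statements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    """A toy assignment statement (hypothetical, not a GCC tree node)."""
    dest_kind: str              # "var", "result_decl", or "target_expr_slot"
    dest_ignored: bool = False  # stands in for DECL_IGNORED on the destination


def estimate_cost(stmts, skip):
    """Charge one unit per store unless the `skip` predicate exempts it."""
    return sum(0 if skip(s) else 1 for s in stmts)


# Rule from patch one: any store to a DECL_IGNORED variable is free.
def skip_ignored(s):
    return s.dest_ignored


# First alternative: only a store to the RESULT_DECL is free.
def skip_result_decl(s):
    return s.dest_kind == "result_decl"


# Second alternative: only TARGET_EXPR initializations are free.
def skip_target_expr_init(s):
    return s.dest_kind == "target_expr_slot"


# A made-up function body with one store of each flavour.
body = [
    Store("var"),                                  # ordinary user variable
    Store("var", dest_ignored=True),               # ignored temporary
    Store("target_expr_slot", dest_ignored=True),  # TARGET_EXPR slot
    Store("result_decl"),                          # return value
]

if __name__ == "__main__":
    for rule in (skip_ignored, skip_result_decl, skip_target_expr_init):
        print(rule.__name__, estimate_cost(body, rule))
```

On this made-up body the DECL_IGNORED rule exempts two stores while each alternative exempts one, so the choice of rule changes the size estimate and therefore which functions fall under the inlining limits.
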
Patch two is just tuning; if you can demonstrate good results (and
you're doing the right kind of testing, in my opinion) that seems OK.
Patch three seems very sensible.
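
As an illustration of what patch three describes (a CALL_EXPR cost that grows with the size of the arguments), here is a minimal sketch. The function name, the base cost of 10, the 4-byte word size, and the one-unit-per-word charge are hypothetical values chosen for the example, not the numbers used in the patch.

```python
def call_cost(arg_sizes_bytes, base=10, word_bytes=4):
    """Toy call cost: a fixed base plus one unit per word of argument data.

    All constants are assumptions for illustration only.
    """
    # Round each argument up to a whole number of words before summing.
    words = sum((size + word_bytes - 1) // word_bytes for size in arg_sizes_bytes)
    return base + words


if __name__ == "__main__":
    print(call_cost([]))       # no arguments: base cost only
    print(call_cost([4, 8]))   # two small scalar arguments
    print(call_cost([64]))     # one 64-byte aggregate passed by value
```

Under such a model a call that passes a large aggregate by value looks more expensive than a call with a few scalar arguments, so removing it by inlining is credited with a larger saving.
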
Patch four doesn't seem very well motivated, as you've noted yourself.
I'm willing to consider these sorts of patches (appropriately
benchmarked of course) for 4.0, as they all look quite safe from a
correctness point of view. But, I think that we should be trying to
develop sound heuristics, rather than just "this works". At the very
least, if you want patch one to be considered, you'll need to write an
extensive comment motivating the heuristics you're using.
--
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304