[PATCH] ipa-inline: Improve growth accumulation for recursive calls

Xionghu Luo luoxhu@linux.ibm.com
Wed Aug 12 10:19:14 GMT 2020


From: Xiong Hu Luo <luoxhu@linux.ibm.com>

For SPEC2017 exchange2, there is a large recursive functiondigits_2(function
size 1300) generates specialized node from digits_2.1 to digits_2.8 with added
build option:

--param ipa-cp-eval-threshold=1 --param ipa-cp-unit-growth=80

ipa-inline pass will consider inline these nodes called only once, but these
large functions inlined too deeply will cause serious register spill and
performance down as followed.

inlineA: brute (inline digits_2.1, 2.2, 2.3, 2.4) -> digits_2.5 (inline 2.6, 2.7, 2.8)
inlineB: digits_2.1 (inline digits_2.2, 2.3) -> call digits_2.4 (inline digits_2.5, 2.6) -> call digits_2.7 (inline 2.8)
inlineC: brute (inline digits_2) -> call 2.1 -> 2.2 (inline 2.3) -> 2.4 -> 2.5 -> 2.6 (inline 2.7 ) -> 2.8
inlineD: brute -> call digits_2 -> call 2.1 -> call 2.2 -> 2.3 -> 2.4 -> 2.5 -> 2.6 -> 2.7 -> 2.8

Performance diff:
inlineB is ~25% faster than inlineA;
inlineC is ~20% faster than inlineB;
inlineD is ~30% faster than inlineC.

The master GCC code now generates inline sequence like inlineB, this patch
makes the ipa-inline pass behavior like inlineD by:
 1) The growth acumulation for recursive calls by adding the growth data
to the edge when edge's caller is inlined into another function to avoid
inline too deeply;
 2) And if caller and callee are both specialized from same node, the edge
should also be considered as recursive edge.

SPEC2017 test shows GEOMEAN improve +2.75% in total(+0.56% without exchange2).
Any comments?  Thanks.

523.xalancbmk_r +1.32%
541.leela_r     +1.51%
548.exchange2_r +31.87%
507.cactuBSSN_r +0.80%
526.blender_r   +1.25%
538.imagick_r   +1.82%

gcc/ChangeLog:

2020-08-12  Xionghu Luo  <luoxhu@linux.ibm.com>

	* cgraph.h (cgraph_edge::recursive_p): Return true if caller and
	callee and specialized from same node.
	* ipa-inline-analysis.c (do_estimate_growth_1): Add caller's
	inlined_to growth to edge whose caller is inlined.
---
 gcc/cgraph.h              | 2 ++
 gcc/ipa-inline-analysis.c | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 0211f08964f..11903ac1960 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -3314,6 +3314,8 @@ cgraph_edge::recursive_p (void)
   cgraph_node *c = callee->ultimate_alias_target ();
   if (caller->inlined_to)
     return caller->inlined_to->decl == c->decl;
+  else if (caller->clone_of && c->clone_of)
+    return caller->clone_of->decl == c->clone_of->decl;
   else
     return caller->decl == c->decl;
 }
diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index 148efbc09ef..ba0cf836364 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -434,6 +434,9 @@ do_estimate_growth_1 (struct cgraph_node *node, void *data)
 	  continue;
 	}
       d->growth += estimate_edge_growth (e);
+      if (e->caller->inlined_to)
+	d->growth += ipa_fn_summaries->get (e->caller->inlined_to)->growth;
+
       if (d->growth > d->cap)
 	return true;
     }
-- 
2.25.1



More information about the Gcc-patches mailing list