This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
- From: "Yuan, Pengfei" <ypf at pku dot edu dot cn>
- To: gcc-patches at gcc dot gnu dot org
- Cc: richard dot guenther at gmail dot com, hubicka at ucw dot cz
- Date: Sat, 10 Sep 2016 14:04:01 +0800 (GMT+08:00)
- Subject: [PATCH, 5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled
Hi,
I previously sent a patch on profile-based option tuning:
https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01377.html
Following Richard Biener's advice, I investigated where the code size
reduction comes from. After analyzing the dumped IL, I found that it is
related to function inlining. Some cold functions are inlined regardless of
profile feedback, which increases code size.
The problem lies in the early inliner. In want_early_inline_function_p, when
the estimated edge growth is > 0, want_inline depends on maybe_hot_p. Since
profile feedback is not available at this point, maybe_hot_p usually returns
true unless optimize_size is set. As a result, functions that profile feedback
would show to be cold are inlined anyway, which increases code size.
At first, I tried a solution that preloads some profile info before
pass_early_inline, but it fails with numerous coverage-mismatch errors in
pass_ipa_tree_profile. The proposed patch therefore prevents early inlining
with positive code size growth when FDO is enabled.
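The decision described above can be sketched as a small standalone model.
This is only an illustration, not the GCC code: the names fdo_flags,
fdo_enabled and want_early_inline_model are hypothetical, the struct fields
merely mirror the flags tested in the patch, and only the growth-related
branches visible in the patch are modelled. Any other checks in
want_early_inline_function_p are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GCC flags tested in the patch.  */
struct fdo_flags
{
  bool profile_arc;           /* models profile_arc_flag */
  bool test_coverage;         /* models flag_test_coverage */
  bool branch_probabilities;  /* models flag_branch_probabilities */
  bool auto_profile;          /* models flag_auto_profile */
};

/* Same condition as the one added by the patch.  */
static bool
fdo_enabled (const struct fdo_flags *f)
{
  return (f->profile_arc && !f->test_coverage)
         || (f->branch_probabilities && !f->auto_profile);
}

/* Simplified model of the growth-related branches only.
   PATCHED selects between the old and the proposed behaviour.  */
static bool
want_early_inline_model (int growth, bool maybe_hot,
                         const struct fdo_flags *f, bool patched)
{
  if (growth <= 0)
    return true;              /* code does not grow: branch is a no-op */
  if (patched && fdo_enabled (f))
    return false;             /* proposed: refuse growth under FDO */
  if (!maybe_hot)
    return false;             /* existing: cold edge that would grow */
  return true;
}
```

In this model, an edge with growth > 0 and maybe_hot == true is accepted by
the old logic but rejected by the patched logic whenever fdo_enabled holds,
which is the case the patch is aimed at.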
Experiment results are as follows:
Setup
  Hardware                         Core i7-4770, 32GB RAM
  OS                               Debian sid amd64
  Compiler                         GCC 5.4.1 20160907
  Firefox source                   mozilla-central, cset 91c2b9d5c135
  Training workload                css3test.com, html5test.com, Octane benchmark
Vanilla GCC
  Code size (.text of libxul.so)   48708873
  Octane benchmark (score)         35828    36618    35847
  Kraken benchmark (time)          939.4ms  964.0ms  951.8ms
Patched GCC
  Code size (.text of libxul.so)   44686265
  Octane benchmark (score)         36103    35740    35611
  Kraken benchmark (time)          928.9ms  949.1ms  938.7ms
Code size is reduced by over 8%, with no obvious difference in performance.
The experiment was conducted with GCC 5 only: Firefox instrumented by GCC 6
segfaults on startup, and GCC 7 hits an ICE when building Firefox.
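As a quick check of the figures quoted above, the arithmetic can be redone
with a few lines of C. This is just a sketch: the helper names
percent_reduction and mean are hypothetical, and the constants are copied
from the results listed in this mail.

```c
#include <stddef.h>

/* Percentage by which AFTER is smaller than BEFORE.  */
static double
percent_reduction (double before, double after)
{
  return (before - after) / before * 100.0;
}

/* Arithmetic mean of N samples.  */
static double
mean (const double *v, size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += v[i];
  return sum / (double) n;
}

/* Figures copied from the results in this mail.  */
static const double text_vanilla = 48708873.0;
static const double text_patched = 44686265.0;
static const double octane_vanilla[] = { 35828, 36618, 35847 };
static const double octane_patched[] = { 36103, 35740, 35611 };
static const double kraken_vanilla[] = { 939.4, 964.0, 951.8 };
static const double kraken_patched[] = { 928.9, 949.1, 938.7 };
```

With these inputs, percent_reduction (text_vanilla, text_patched) comes to
about 8.3, and the mean Octane score and mean Kraken time each differ by less
than 2% between the two compilers.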
Regards,
Yuan, Pengfei
gcc/ChangeLog:
* ipa-inline.c (want_early_inline_function_p): Be more conservative
if FDO is enabled.
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 7097cf3..8266f97 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -628,6 +628,20 @@ want_early_inline_function_p (struct cgraph_edge *e)
       if (growth <= 0)
         ;
+      /* Profile feedback is not available at this point.
+         Be more conservative if FDO is enabled. */
+      else if ((profile_arc_flag && !flag_test_coverage)
+               || (flag_branch_probabilities && !flag_auto_profile))
+        {
+          if (dump_file)
+            fprintf (dump_file, " will not early inline: %s/%i->%s/%i, "
+                     "FDO is enabled and code would grow by %i\n",
+                     xstrdup_for_dump (e->caller->name ()),
+                     e->caller->order,
+                     xstrdup_for_dump (callee->name ()), callee->order,
+                     growth);
+          want_inline = false;
+        }
       else if (!e->maybe_hot_p ()
                && growth > 0)
         {