This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH AutoFDO]Restoring indirect call value profile transformation
- From: Jeff Law <law at redhat dot com>
- To: "bin.cheng" <bin dot cheng at linux dot alibaba dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 13 Dec 2018 11:48:28 -0700
- Subject: Re: [PATCH AutoFDO]Restoring indirect call value profile transformation
- References: <email@example.com>
On 12/12/18 8:50 PM, bin.cheng wrote:
> Due to ICEs and functional bugs, the indirect call value profile transformation
> is disabled on GCC-7/8/trunk. This patch restores the transformation. The
> main issue is that AutoFDO should store the profile_id of the callee's cgraph_node
> in the histogram value's first counter, rather than a pointer to the callee's name
> string as it does now.
> With the patch, some "Indirect call -> direct call" tests pass with autofdo, while
> others are unstable. I think the instability is caused by poor perf data collected
> during the regtest run, and I can confirm these tests pass when good perf data is
> collected in manual experiments.
> Bootstrapped and tested along with the previous patches. Is it OK?
> FYI, an update about AutoFDO status:
> All AutoFDO ICEs in regtest are fixed, while several tests still fail. They fall into
> the three categories below:
> Unstable indirect call value profile transformation:
> FAIL: g++.dg/tree-prof/indir-call-prof.C scan-ipa-dump afdo "Indirect call -> direct call.* AA transformation on insn"
> FAIL: g++.dg/tree-prof/morefunc.C scan-ipa-dump-times afdo "Indirect call -> direct call" 2
> FAIL: g++.dg/tree-prof/pr35545.C scan-ipa-dump profile_estimate "Indirect call -> direct call"
> Loop peeling case, which fails because we don't treat the autofdo profile count as reliable:
> FAIL: gcc.dg/tree-prof/peel-1.c scan-tree-dump cunroll "Peeled loop ., 1 times"
> Cold/hot partition cases:
> FAIL: gcc.dg/tree-prof/cold_partition_label.c scan-assembler foo[._]+cold
> FAIL: gcc.dg/tree-prof/cold_partition_label.c scan-assembler size[ \ta-zA-Z0-0]+foo[._]+cold
> FAIL: gcc.dg/tree-prof/section-attr-1.c scan-assembler .section[\t ]*.text.unlikely[\\n\\r]+[\t ]*.size[\t ]*foo.cold
> FAIL: gcc.dg/tree-prof/section-attr-2.c scan-assembler .section[\t ]*.text.unlikely[\\n\\r]+[\t ]*.size[\t ]*foo.cold
> FAIL: gcc.dg/tree-prof/section-attr-3.c scan-assembler .section[\t ]*.text.unlikely[\\n\\r]+[\t ]*.size[\t ]*foo.cold
> These are more difficult to enable because we can't simply treat an autofdo::zero
> count as cold; there are just too many of them.
> Besides regtest, I ran autofdo on kernel/mysql-server builds; the build and performance
> match expectations now, but I haven't run autofdo with any SPEC benchmarks yet.
> 2018-12-13 Bin Cheng <firstname.lastname@example.org>
> * auto-profile.c (afdo_indirect_call): Skip generating histogram
> value if we can't find cgraph_node for the indirect callee. Save
> profile_id of the cgraph_node in histogram value's first counter.
> * value-prof.c (gimple_value_profile_transformations): Don't skip
> for flag_auto_profile.
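
The afdo_indirect_call change described in the ChangeLog above can be illustrated with a small standalone model. This is only a sketch, not GCC code: ToyNode, ToyHistogram, make_indirect_call_histogram and find_node_by_profile_id are hypothetical names invented for illustration, and the two-slot counter layout (target id, then count) is an assumption of this toy. It shows the two behaviours the patch describes: skip the histogram when no node exists for the hottest callee, and otherwise store that node's profile_id (not a name pointer) in the first counter so a later consumer can look the callee up by id.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a call-graph node; not GCC's cgraph_node.
struct ToyNode {
  std::string name;
  unsigned profile_id;
};

// Hypothetical histogram: counters[0] = target id, counters[1] = sample count.
struct ToyHistogram {
  std::int64_t counters[2];
};

// Build a histogram for the hottest sampled indirect-call target.
// Returns std::nullopt (i.e. skips the histogram) when there are no samples
// or when no node is known for the hottest target.
std::optional<ToyHistogram> make_indirect_call_histogram(
    const std::map<std::string, std::int64_t>& sampled_targets,
    const std::map<std::string, ToyNode>& known_nodes) {
  const std::string* best_name = nullptr;
  std::int64_t best_count = 0;
  for (const auto& [name, count] : sampled_targets) {
    if (count > best_count) {
      best_name = &name;
      best_count = count;
    }
  }
  if (best_name == nullptr) return std::nullopt;

  auto it = known_nodes.find(*best_name);
  if (it == known_nodes.end()) return std::nullopt;  // no node: skip

  // Store the node's id, not a pointer to its name string.
  return ToyHistogram{
      {static_cast<std::int64_t>(it->second.profile_id), best_count}};
}

// Consumer side: resolve the callee from the id held in counters[0].
const ToyNode* find_node_by_profile_id(
    const std::map<std::string, ToyNode>& known_nodes, std::int64_t id) {
  for (const auto& [name, node] : known_nodes) {
    if (static_cast<std::int64_t>(node.profile_id) == id) return &node;
  }
  return nullptr;
}
```

In this model, a consumer that expects an id in the first counter would misread a string pointer stored there as an id, which is the kind of mismatch the patch description attributes the earlier failures to.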