[PATCH AutoFDO]Restoring indirect call value profile transformation
Bin.Cheng
amker.cheng@gmail.com
Wed Dec 19 02:01:00 GMT 2018
On Tue, Dec 18, 2018 at 7:15 PM Bin.Cheng <amker.cheng@gmail.com> wrote:
>
> On Sun, Dec 16, 2018 at 9:11 AM Andi Kleen <ak@linux.intel.com> wrote:
> >
> > "bin.cheng" <bin.cheng@linux.alibaba.com> writes:
> >
> > > Hi,
> > >
> > > Due to ICEs and malfunction bugs, the indirect call value profile transformation
> > > is disabled on GCC-7/8/trunk. This patch restores the transformation. The
> > > main issue is that AutoFDO should store the cgraph_node's profile_id of the callee
> > > function in the first histogram value's counter, rather than a pointer to the
> > > callee's name string as it does now.
> > > With the patch, some "Indirect call -> direct call" tests pass with autofdo, while
> > > others are unstable. I think the instability is caused by poor perf data collected
> > > during the regtest run, and I can confirm these tests pass if good perf data is
> > > collected in manual experiments.
> >
> > Would be good to make the tests stable, otherwise we'll just have
> > regressions in the future again.
> >
> > The problem is that the tests don't run long enough and don't get enough samples?
> Yes, take g++.dg/tree-prof/morefunc.C as an example:
> - int i;
> - for (i = 0; i < 1000; i++)
> + int i, j;
> + for (i = 0; i < 1000000; i++)
> + for (j = 0; j < 50; j++)
> g += tc->foo();
> if (g<100) g++;
> }
> @@ -27,8 +28,9 @@ void test1 (A *tc)
> static __attribute__((always_inline))
> void test2 (B *tc)
> {
> - int i;
> + int i, j;
> for (i = 0; i < 1000000; i++)
> + for (j = 0; j < 50; j++)
>
> I have to increase the loop count like this to get a stable pass on my
> machine. The original count (1000) is too small to be sampled.
>
> >
> > Could add some loop?
> > Or possibly increase the sampling frequency in perf (-F or -c)?
> Maybe, I will give it a try.
It turned out that all the "Indirect call" test failures can be resolved by
adding -c 100 to the perf command line:
diff --git a/gcc/config/i386/gcc-auto-profile b/gcc/config/i386/gcc-auto-profile
...
-exec perf record -e $E -b "$@"
+exec perf record -e $E -c 100 -b "$@"
Is 100 too small here? Or is it fine for all scenarios?
Thanks,
bin
> > Or run them multiple times and use gcov_merge to merge the files?
> Without changing loop count or sampling frequency, this is not likely
> to be helpful, since perf doesn't hit the small loop in most cases.
>
> Thanks,
> bin
> >
> >
> > > FYI, an update about AutoFDO status:
> > > All AutoFDO ICEs in regtest are fixed, while the tests that still fail fall into
> > > the three categories below:
> >
> > Great!
> >
> > Of course it still ICEs with LTO?
> >
> > Right now there is no test case for this I think. Probably one should be added.
> >
> > -Andi