[PATCH AutoFDO] Restoring indirect call value profile transformation

Bin.Cheng amker.cheng@gmail.com
Wed Dec 19 02:01:00 GMT 2018


On Tue, Dec 18, 2018 at 7:15 PM Bin.Cheng <amker.cheng@gmail.com> wrote:
>
> On Sun, Dec 16, 2018 at 9:11 AM Andi Kleen <ak@linux.intel.com> wrote:
> >
> > "bin.cheng" <bin.cheng@linux.alibaba.com> writes:
> >
> > > Hi,
> > >
> > > Due to ICEs and malfunction bugs, the indirect call value profile transformation
> > > is disabled on GCC-7/8/trunk.  This patch restores the transformation.  The
> > > main issue is that AutoFDO should store the profile_id of the callee's cgraph_node
> > > in the first histogram value's counter, rather than a pointer to the callee's
> > > name string as it does now.
> > > With the patch, some "Indirect call -> direct call" tests pass with autofdo, while
> > > others are unstable.  I think the instability is caused by poor perf data collected
> > > during the regtest run, and I can confirm these tests pass when good perf data is
> > > collected in manual experiments.
> >
> > Would be good to make the tests stable, otherwise we'll just have
> > regressions in the future again.
> >
> > The problem is that the tests don't run long enough and don't get enough samples?
> Yes, take g++.dg/tree-prof/morefunc.C as an example:
> -  int i;
> -  for (i = 0; i < 1000; i++)
> +  int i, j;
> +  for (i = 0; i < 1000000; i++)
> +    for (j = 0; j < 50; j++)
>       g += tc->foo();
>     if (g<100) g++;
>  }
> @@ -27,8 +28,9 @@ void test1 (A *tc)
>  static __attribute__((always_inline))
>  void test2 (B *tc)
>  {
> -  int i;
> +  int i, j;
>    for (i = 0; i < 1000000; i++)
> +    for (j = 0; j < 50; j++)
>
> I have to increase the loop count like this to get a stable pass on my
> machine.  The original count (1000) is too small to be sampled.
>
> >
> > Could add some loop?
> > Or possibly increase the sampling frequency in perf (-F or -c)?
> Maybe, I will have a try.
It turned out that all "Indirect call" tests can be fixed by adding -c 100
to the perf command line:
diff --git a/gcc/config/i386/gcc-auto-profile b/gcc/config/i386/gcc-auto-profile
...
-exec perf record -e $E -b "$@"
+exec perf record -e $E -c 100 -b "$@"

Is 100 too small here?  Or is it fine for all scenarios?

Thanks,
bin

> > Or run them multiple times and use gcov_merge to merge the files?
> Without changing the loop count or sampling frequency, this is not likely
> to help, since perf doesn't hit the small loop in most cases.
>
> Thanks,
> bin
> >
> >
> > > FYI, an update about AutoFDO status:
> > > All AutoFDO ICEs in regtest are fixed, while several tests still fail; they fall
> > > into the three categories below:
> >
> > Great!
> >
> > Of course it still ICEs with LTO?
> >
> > Right now there is no test case for this I think. Probably one should be added.
> >
> > -Andi


