This is the mail archive of the
mailing list for the GCC project.
Re: FDO and source changes
- From: Yi Yang <ahyangyi at google dot com>
- To: Xinliang David Li <davidxl at google dot com>
- Cc: Jeff Law <law at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Jan Hubicka <hubicka at ucw dot cz>
- Date: Wed, 23 Jul 2014 10:52:51 -0700
- Subject: Re: FDO and source changes
- Authentication-results: sourceware.org; auth=none
- References: <CAAkRFZ+_sgKGZj07qbkvcw3Emm3Dzmhr0LtLdvnx4nTw2grG=g at mail dot gmail dot com> <53CF2896 dot 1060300 at redhat dot com> <CAAkRFZKa6BY6PFwTaH=EYcUnSf-3yvEj76_8G_tCFUOamTrQLQ at mail dot gmail dot com>
It's worth noting that merely changing the hash function from crc32 to
something that's 64 bits long is enough to make sure collisions do
not happen in practice. Maybe it's worth the trouble?
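As a rough sanity check on that claim, here is a back-of-the-envelope birthday estimate in Python. It assumes an ideal, uniformly distributed hash (crc32 is not one, so treat the result as an approximation) and uses the 750k symbol count quoted further down in this thread. The helper name expected_colliding_pairs is made up for illustration.

```python
def expected_colliding_pairs(n: int, bits: int) -> float:
    """Birthday-bound estimate for an ideal hash.

    There are n*(n-1)/2 pairs of symbols, and each pair collides
    with probability 2**-bits.
    """
    return n * (n - 1) / 2 / 2**bits

n = 750_000  # number of unique symbol names quoted in the thread

est32 = expected_colliding_pairs(n, 32)  # crc32-sized id
est64 = expected_colliding_pairs(n, 64)  # hypothetical 64-bit id

print(f"32-bit id: ~{est32:.1f} expected colliding pairs")
print(f"64-bit id: ~{est64:.2e} expected colliding pairs")
```

Under these assumptions the 32-bit estimate comes out at roughly 65 colliding pairs, in line with the 71 collisions reported below, while the 64-bit estimate is many orders of magnitude below one.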
On Wed, Jul 23, 2014 at 10:42 AM, Xinliang David Li <davidxl at google dot com> wrote:
> On Tue, Jul 22, 2014 at 8:14 PM, Jeff Law <law at redhat dot com> wrote:
> > On 07/16/14 14:32, Xinliang David Li wrote:
> >> Instrumentation based FDO is designed to work when the source files
> >> used to generate the instrumented binary exactly match the
> >> sources in the profile-use compile. It is known historically that using
> >> a stale profile (due to source changes, not gcda format changes) can lead
> >> to lots of mismatch warnings and, even worse, compiler ICEs. This is
> >> due to two reasons:
> >> 1) the profile lookup for each function is based on funcdef_no, which
> >> can change when the function definition order changes or new functions
> >> are inserted in the middle of a source file
> >> 2) the indirect call target id may change due to source changes:
> >> before GCC 4.9, the id uses the cgraph uid, which is as bad as funcdef_no.
> >> Attributing the wrong IC target to an indirect call site is the main
> >> cause of compiler ICEs (we have a signature match check, but a bad target
> >> can leak through and cause problems later). Starting with GCC 4.9,
> >> indirect target profiling uses profile_id, which is stable for public
> >> functions.
> >> This patch introduces a new parameter for FDO to determine whether to
> >> use the internal id or an assembler-name-based external id for profile
> >> lookup. When the external id is used, GCC FDO becomes very
> >> tolerant of simple source changes.
> >> Note that autoFDO solves this problem but it is currently limited to
> >> Intel platforms with LBR support.
> >> (Tested with SPEC, SPEC06 and large internal benchmarks. No performance
> >> impact).
> >> Ok for trunk?
> > Is there a good reason why we would want to ever use the internal id? Is it
> > because the internal id shows up in the profile data for preexisting files?
> I don't think existing profile data matter.
> For a perfectly fresh profile, using the external id carries a chance of
> collision. I tested with a C++ symbol file with about 750k unique
> symbol names; using a crc32-based id yields 71 collisions, a rate
> of ~0.009%.
> > Given that we need both, why is this a param vs a regular -f option?
> > Shouldn't the default be to use the external id?
> I am open to both. I have not seen evidence that id collision causes
> trouble even though in theory it can.
> > BTW, thanks for working on this. I've certainly got customers that want to
> > see the FDO data be more tolerant of changes.
> > Jeff
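
To make the first failure mode from the original patch description concrete, the following Python sketch contrasts a position-based id (a stand-in for funcdef_no) with a name-based id (crc32 of the assembler name) when one function is inserted between the training build and the profile-use build. All helper names and the sample mangled names are hypothetical; this is a toy model of the lookup, not GCC's actual implementation.

```python
import zlib

def internal_ids(functions):
    # Hypothetical stand-in for funcdef_no: position in definition order.
    return {name: i for i, name in enumerate(functions)}

def external_ids(functions):
    # Hypothetical stand-in for an assembler-name-based id: CRC32 of the name.
    return {name: zlib.crc32(name.encode()) for name in functions}

def stale_matches(id_fn, train_src, use_src):
    """Map each function in use_src to the training-time function that
    shares its id, or None when no profile entry has that id."""
    train = {fid: name for name, fid in id_fn(train_src).items()}
    return {name: train.get(fid) for name, fid in id_fn(use_src).items()}

# Sources at profile-generation time, and after inserting one function.
train_src = ["_Z3foov", "_Z3barv", "_Z3bazv"]
use_src = ["_Z3foov", "_Z3quxv", "_Z3barv", "_Z3bazv"]

by_position = stale_matches(internal_ids, train_src, use_src)
by_name = stale_matches(external_ids, train_src, use_src)

for name in use_src:
    print(f"{name}: by position -> {by_position[name]}, by name -> {by_name[name]}")
```

With the position-based id, every function after the insertion point picks up a neighbour's profile; with the name-based id, unchanged functions still find their own data and the new function simply has none.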