FDO and source changes

Xinliang David Li davidxl@google.com
Wed Jul 23 18:00:00 GMT 2014


yes -- I am thinking about this and other better hash functions.
Changing it to 64bit will cause profile format change and increase in
profile data size.

David

On Wed, Jul 23, 2014 at 10:52 AM, Yi Yang <ahyangyi@google.com> wrote:
> It's worth noting that merely changing the hash function from crc32 to
> something that's 64 bit long is enough to make sure collisions does
> not happen. Maybe it's worth the trouble?
>
> On Wed, Jul 23, 2014 at 10:42 AM, Xinliang David Li <davidxl@google.com> wrote:
>>
>> On Tue, Jul 22, 2014 at 8:14 PM, Jeff Law <law@redhat.com> wrote:
>> > On 07/16/14 14:32, Xinliang David Li wrote:
>> >>
>> >> Instrumentation based FDO is designed to work when the source files
>> >> that are used to generate the instr binary match exactly with the
>> >> sources in profile-use compile. It is known historically that using
>> >> stale profile (due to source changes, not gcda format change) can lead
>> >> to lots of mismatch warnings and even worse -- compiler ICEs.  This is
>> >> due to two reasons:
>> >> 1) the profile lookup for each function is based on funcdef_no which
>> >> can change when function definition order is changed or new functions
>> >> are inserted in the middle of a source
>> >> 2) the indirect call target id may change due to source changes:
>> >> before GCC4.9, the id uses cgraph uid which is as bad as funcdef_no.
>> >> Attributing wrong IC target to the indirect call site is the main
>> >> cause of compiler ICE (we have signature match check, but bad target
>> >> can leak through result in problem later). Starting from gcc49, the
>> >> indirect target profiling uses profile_id which is stable for public
>> >> functions.
>> >>
>> >> This patch introduces a new parameter for FDO to determine whether to
>> >> use internal id or assembler name based external id for profile
>> >> lookup. When the external id is used, GCC FDO will become very
>> >> tolerant to simple source changes.
>> >>
>> >> Note that autoFDO solves this problem but it is currently limited to
>> >> Intel platforms with LBR support.
>> >>
>> >> (Tested with SPEC, SPEC06 and large internal benchmarks. No performance
>> >> impact).
>> >>
>> >> Ok for trunk?
>> >
>> > Is there a good reason why we would want to ever use the internal id? Is it
>> > because the internal id shows up in the profile data for preexisting files?
>>
>> I don't think existing profile data matter.
>>
>> For perfect fresh profile, using external id has the chance of
>> collision. I have tested with a C++ symbol file with about 750k unique
>> symbol names, using crc32 based id yields 71 collisions --- the rate
>> is ~0.009%.
>>
>> >
>> > Given that we need both, why is this a param vs a regular -f option?
>> > Shouldn't the default be to use the external id?
>> >
>>
>> I am open to both. I have not seen evidence that id collision causes
>> trouble even though in theory it can.
>>
>> thanks,
>>
>> David
>>
>>
>> > BTW, thanks for working on this.  I've certainly got customers that want to
>> > see the FDO data be more tolerant of changes.
>> >
>> > Heff
>> >



More information about the Gcc-patches mailing list