This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH 3/9] Use murmurhash3 for pointer map hashing
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Andi Kleen <ak at linux dot intel dot com>
- Cc: Andi Kleen <andi at firstfloor dot org>, gcc-patches at gcc dot gnu dot org, hubicka at ucw dot cz
- Date: Tue, 23 Apr 2013 16:22:13 +0200
- Subject: Re: [PATCH 3/9] Use murmurhash3 for pointer map hashing
- References: <1366407117-18462-1-git-send-email-andi at firstfloor dot org> <1366407117-18462-4-git-send-email-andi at firstfloor dot org> <CAFiYyc3btjf-4X2ozSw+9pAEtuzOyoaPejzoJQ-Myvf27e2S=A at mail dot gmail dot com> <20130422154241 dot GW22166 at tassilo dot jf dot intel dot com> <CAFiYyc3kCpVDBJUx2963e+PV22H+Seq5X4zX+RYApEJfWcgCpQ at mail dot gmail dot com> <20130423140858 dot GA22166 at tassilo dot jf dot intel dot com>
On Tue, Apr 23, 2013 at 4:08 PM, Andi Kleen <email@example.com> wrote:
> On Tue, Apr 23, 2013 at 12:27:43PM +0200, Richard Biener wrote:
>> On Mon, Apr 22, 2013 at 5:42 PM, Andi Kleen <firstname.lastname@example.org> wrote:
>> > On Mon, Apr 22, 2013 at 01:46:58PM +0200, Richard Biener wrote:
>> >> On Fri, Apr 19, 2013 at 11:31 PM, Andi Kleen <email@example.com> wrote:
>> >> > From: Andi Kleen <firstname.lastname@example.org>
>> >> >
>> >> > For a large LTO test case The previous pointer hash change brought
>> >> > the collision rate for the WPA gimple type hash table from 90% to
>> >> > 70. This patch uses the well known murmur3 to improve it further
>> >> > to 64%.
>> >> But if they are pointers then pointer_hash should be good enough... ?
>> > The original pointer hash (ptr >> 3) % hashsize and throwing away most bits is
>> > very poor.
>> > The evahash based on I sent earlier is better, but murmur3 is even better than
>> > that, at least for this case.
>> I'd rather not have different pointer hashes for things where there isn't a
>> fundamental difference between the pointer values.
> One of the reasons I did it explicitely is that the murmur3 reference code is C++
> (well really only the mixed code/declarations I think) and libiberty seems to be
> C only. I suppose can port it to C and put it into libiberty though, and make
> hashtab.c always use it.
> My understanding is that murmur is generally stronger than evahash.
It would be nice to back this with some numbers on the collision rate
for GCC hashtables, for example during bootstrap (or just for a set of
.ii files from libbackend.a sources for example). pointer-set.c also
contains its own hash function but doesn't contain infrastructure for
statistics. Also the new hash-table.h C++-style hashtable has its
own pointer-hash which is still
template <typename Type>
pointer_hash <Type>::hash (const value_type *candidate)
/* This is a really poor hash function, but it is what the current code uses,
so I am reusing it to avoid an additional axis in testing. */
return (hashval_t) ((intptr_t)candidate >> 3);
> email@example.com -- Speaking for myself only