This is the mail archive of the
java-patches@sources.redhat.com
mailing list for the Java project.
Re: Patch: New hash function
Jeff Sturm writes:
> Tom Tromey wrote:
> > + return (jint) ((unsigned long long) obj % 0x7fffffff);
>
> Do we have any statistics showing an improvement with this new
> hashcode function?
Well, the old one wasn't 64-bit clean, so it had to change somehow.
Reduction modulo a prime guarantees a good spread even when the heap
straddles 32-bit boundaries, which doesn't happen much today but will
in the future; plan ahead!
If we're going to hash a long address into a shorter int, we might as
well do the reduction in a nonlinear way as long as it's not unduly
slow.
> I'm not too crazy about the 64-bit modulus operator. I tested it on a couple of
> machines I use:
>
> CPU Old New
> ------------ --- ---
> PIII, 600MHz 39 159
> EV56, 533MHz 5 30
>
> where "Old" and "New" are the previous and current hashcode
> functions, respectively, and times are an average measured in
> cycles, using the rdtsc or rpcc instructions. Both tests were
> compiled with gcc -O2.
I had hoped that gcc would optimize most of this away, but it seems
not. I certainly didn't expect a call to the double divide routine,
but a perusal of i386.md reveals that this will always happen even
though it isn't necessary.
The modulus (2^31 - 1) was chosen because it is prime, and thus can be
expected always to have a good distribution no matter what addresses
we're given. However, if the i386 code for this is slow it's not any
kind of panacea.
We could conditionalize this so that there's effectively no difference
from the current function on 32-bit hosts:
jint
_Jv_Other_HashCode (jobject obj)
{
#if SIZEOF_VOID_P == 4
return (jint)obj;
#else
unsigned long quo = ((unsigned long)obj) >> 31;
unsigned long rem = ((unsigned long)obj) & 0x7fffffffU;
quo += rem; // quo is congruent to obj, mod (2^31 - 1).
return (jint)quo;
#endif
}
Strictly speaking, to ensure that the result is correct mod (2^31 - 1)
we'd need to do something like this:
while (quo >> 32)
quo -= (0x7fffffffU * 2), count++;
but it seems hardly worthwhile.
Andrew.