4.0.0 JNI/reflection invocation 2/4/8 times as slow as in 3.3.3
Fri Jan 28 09:48:00 GMT 2005
Ranjit Mathew writes:
> Tom Tromey wrote:
> > Jost> Imagine a program which allocates and immediately removes a key. If you
> > Jost> let hash() distribute the keys all over the table, your table gets
> > Jost> filled with tombstones. And since a tombstone prevents a key from being
> > Jost> entered immediately, our hash table has degenerated into a linear list
> > Jost> whose length could be up to 75% of the table. What would help is to
> > Jost> change the implementation to allow more than one key at a location, for
> > Jost> example by storing the additional keys in a separate list or by building
> > Jost> clusters.
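Jost's scenario can be reproduced with a minimal sketch. The class and names below are illustrative, not the actual Classpath code: a naive open-addressing table where remove() leaves a tombstone and put() treats tombstones as occupied, so churning distinct keys fills the table with dead slots even though it is logically empty.

```java
import java.util.Objects;

// Hypothetical minimal open-addressing set (illustrative names only).
// remove() leaves a TOMBSTONE marker, and the naive add() treats a
// tombstone as occupied, so churn degrades lookups into linear scans.
public class TombstoneDemo {
    static final Object TOMBSTONE = new Object();
    Object[] table = new Object[16];

    int home(Object key) { return Math.abs(Objects.hashCode(key)) % table.length; }

    void add(Object key) {
        int h = home(key);
        while (table[h] != null) {          // tombstones count as occupied
            if (key.equals(table[h])) return;
            h = (h + 1) % table.length;
        }
        table[h] = key;
    }

    void remove(Object key) {
        int h = home(key);
        while (table[h] != null) {
            if (key.equals(table[h])) { table[h] = TOMBSTONE; return; }
            h = (h + 1) % table.length;
        }
    }

    long tombstones() {
        long n = 0;
        for (Object o : table) if (o == TOMBSTONE) n++;
        return n;
    }

    public static void main(String[] args) {
        TombstoneDemo t = new TombstoneDemo();
        // Allocate and immediately remove 12 distinct keys, as in the
        // scenario above; 12 of 16 slots (75%) end up as tombstones.
        for (int i = 0; i < 12; i++) { t.add(i); t.remove(i); }
        System.out.println("tombstones: " + t.tombstones());
    }
}
```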
> > That's interesting. Could you write a patch that includes an
> > explanation of what is going on internally? And ideally do some
> > profiling so we can see the result?
> Instead of changing over to chaining, the current open
> addressing implementation can perhaps be redeemed by
> using a compacting remove(), eliminating the need for
> tombstones altogether.
> This is also described in Knuth's TAOCP Vol. 3 (Sorting
> and Searching) and he shows that it doesn't impact
> performance that much. See section 6.4, Algorithm R (IIRC).
That's a good idea if we're using linear probing to handle collisions.
But I can't see that we handle collisions at all. Look at this:
  public Object put(Object key, Object value)
  {
    // Rehash if the load factor is too high.
    if (size > threshold)
      ...
    int h = hash(key);
    if (table[h] == key)
      {
        Object r = table[h + 1];
        table[h + 1] = value;
        return r;
      }
    // At this point, we add a new mapping.
    table[h] = key;
    table[h + 1] = value;
  }
Aren't we overwriting an existing key here? Surely we need to
increment h (by 2, since keys and values alternate) until the key
or an empty entry is found.
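A sketch of what the collision handling might look like, assuming the table really does store keys and values in alternating slots (the class and helper names below are invented for illustration, and this is not the actual Classpath fix): probe in steps of two until the key itself or a free key slot turns up.

```java
// Illustrative identity-based map with alternating key/value slots.
public class ProbingPut {
    Object[] table = new Object[32];   // 16 key/value pairs, no resizing here

    int hash(Object key) {
        // Produce an even index; identityHashCode mimics identity semantics
        // (an assumption, matching the == comparison in the snippet above).
        return (Math.abs(System.identityHashCode(key)) % (table.length / 2)) * 2;
    }

    public Object put(Object key, Object value) {
        int h = hash(key);
        // Probe in steps of 2, skipping each entry's value slot, until
        // we find this key or a free key slot.
        while (table[h] != null && table[h] != key)
            h = (h + 2) % table.length;
        if (table[h] == key) {
            Object r = table[h + 1];
            table[h + 1] = value;
            return r;                  // replaced an existing mapping
        }
        table[h] = key;                // free slot: add a new mapping
        table[h + 1] = value;
        return null;
    }

    public Object get(Object key) {
        int h = hash(key);
        while (table[h] != null && table[h] != key)
            h = (h + 2) % table.length;
        return table[h] == key ? table[h + 1] : null;
    }

    public static void main(String[] args) {
        ProbingPut m = new ProbingPut();
        String[] keys = new String[10];
        for (int i = 0; i < 10; i++) {
            keys[i] = new String("k" + i);   // distinct identities
            m.put(keys[i], i);
        }
        boolean ok = true;
        for (int i = 0; i < 10; i++)
            ok &= Integer.valueOf(i).equals(m.get(keys[i]));
        System.out.println(ok);
    }
}
```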
> Also note that at least in the JavaDocs for this class
> in JDK 1.4.2, Sun says that this is an open addressing
> hash table using a linear probe with alternating key/value
> pairs. It also says that HashMap uses chaining instead.
> BTW, why is our default capacity 21 and not a prime
> like (say) 19 or 23? Primes are much better for hash
> table sizes (again, refer to TAOCP Vol. 3).
Well, that depends. If the hash function really were random, the
choice of table size wouldn't matter much. However, our hash
functions aren't very random, so we should do this.
> On each expansion, instead of going for just 2n+1, we
> should go for the nearest prime to 2n. This is not
> as bogus as it sounds as we can easily pre-compute and
> store the 26 or so primes that would be needed for a
> default starting of, say, 23. Note that the array
> length is capped at 2^31.
> If the user supplies an explicit expectedMaxSize, we
> can still choose the nearest larger prime from our
> pre-computed list.
Yes. The code to do this is in BigInteger, which has all the speedups
for small integers.
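The growth policy under discussion can be sketched by leaning on BigInteger.isProbablePrime() for the primality test (presumably the "code in BigInteger" meant above; the helper names here are invented, and this is not an actual Classpath change):

```java
import java.math.BigInteger;

// Sketch: grow capacity to the nearest prime >= 2n instead of 2n+1,
// starting from a default of 23 as suggested in the thread.
public class PrimeCapacity {
    static int nextPrime(int n) {
        if (n <= 2) return 2;
        if (n % 2 == 0) n++;               // primes > 2 are odd
        while (!BigInteger.valueOf(n).isProbablePrime(100))
            n += 2;
        return n;
    }

    public static void main(String[] args) {
        int cap = 23;                       // proposed default capacity
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            sb.append(cap).append(' ');
            cap = nextPrime(2 * cap);       // expand to a prime near 2n
        }
        System.out.println(sb.toString().trim());
    }
}
```

In practice the resulting sequence could be pre-computed and stored, as the quoted message suggests, since only a few dozen capacities fit below the 2^31 - 1 array-length limit.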