This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: 4.0.0 JNI/reflection invocation 2/4/8 times as slow as in 3.3.3


Tom Tromey wrote:
> Jost> Imagine a program which allocates and immediately removes a key.  If you
> Jost> let the hash() distribute the keys all over the table, your table gets
> Jost> filled with tombstones.  And since a tombstone prevents a key to be
> Jost> entered immediately, our hash table has degenerated into a linear list
> Jost> which could have a length of up to 75% of it.  What would help is to
> Jost> change the implementation to allow more than one key at a location, for
> Jost> example by storing the additional keys in a separate list or by building
> Jost> clusters.
> 
> That's interesting.  Could you write a patch that includes an
> explanation of what is going on internally?  And ideally do some
> profiling so we can see the result?

Instead of changing over to chaining, the current open
addressing implementation can perhaps be redeemed by
using a compacting remove(), eliminating the need for
tombstones.

This is also described in Knuth's TAOCP Vol. 3 (Sorting
and Searching) and he shows that it doesn't impact
performance that much. See section 6.4, Algorithm R (IIRC).

Also note that at least in the JavaDocs for this class
in JDK 1.4.2, Sun says that this is an open addressing
hash table using a linear probe with alternating key/value
pairs. It also says that HashMap uses chaining instead.

BTW, why is our default capacity 21 and not a prime
like (say) 19 or 23? Primes are much better for hash
table sizes (again, refer to TAOCP Vol. 3).

On each expansion, instead of going for just 2n+1, we
should go for the nearest prime to 2n. This is not
as bogus as it sounds as we can easily pre-compute and
store the 26 or so primes that would be needed for a
default starting of, say, 23. Note that the array
length is capped at 2^31.

If the user supplies an explicit expectedMaxSize, we
can still choose the nearest larger prime from our
table.

Thoughts?

Ranjit.

-- 
Ranjit Mathew      Email: rmathew AT gmail DOT com

Bangalore, INDIA.    Web: http://ranjitmathew.hostingzero.com/


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]