4.0.0 JNI/reflection invocation 2/4/8 times as slow as in 3.3.3

Per Bothner per@bothner.com
Sat Jan 29 18:59:00 GMT 2005


Ranjit Mathew wrote:
> Instead of changing over to chaining, the current open
> addressing implementation can perhaps be redeemed by
> using a compacting remove(), eliminating the need for
> tombstones.
> 
> This is also described in Knuth's TAOCP Vol. 3 (Sorting
> and Searching) and he shows that it doesn't impact
> performance that much. See section 6.4, Algorithm R (IIRC).

I don't have TAOCP (I bought it in grad school, but
it disappeared ...), but I assume you're talking about
some variant of the following:

   To remove an entry e at index i, after clearing tab[i]:
   j = next_empty_slot(i);
   for (int k = i;;)
    {
      k=(k+1)%tab.length;
      if (k == j) break;
      e = tab[k];
      tab[k] = null;
      if (e != null)
        quick_insert(e);
     }

quick_insert(e) inserts e in the hash-table, but doesn't
need to increment any count or check whether we need to resize,
or whether the item already exists.

The idea is we check if any following entries would get
hashed differently due to the now-evailable empty slot.
Assuming the table isn't too full that shouldn't take
long.

> BTW, why is our default capacity 21 and not a prime
> like (say) 19 or 23? Primes are much better for hash
> table sizes (again, refer to TAOCP Vol. 3).

I seem to remember that powers-of-2 is not unreasonable.
(You have to make sure using a probe increment is relatively
prime - which 1 is.)  At least on some embedded chips
a modulo operations could be fairly expensive.  On modern
desktop-and-above processors what we care about is cache
misses, so division doesn't matter.  (Note that linear
probing is likely to work better for caches.)
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/



More information about the Java mailing list