This is the mail archive of the
java@gcc.gnu.org
mailing list for the Java project.
Re: 4.0.0 JNI/reflection invocation 2/4/8 times as slow as in 3.3.3
- From: Ranjit Mathew <rmathew at gmail dot com>
- To: Per Bothner <per at bothner dot com>
- Cc: tromey at redhat dot com, java at gcc dot gnu dot org
- Date: Mon, 7 Feb 2005 16:55:27 +0530
- Subject: Re: 4.0.0 JNI/reflection invocation 2/4/8 times as slow as in 3.3.3
- References: <20050123191147.29455.qmail@web60102.mail.yahoo.com> <m3r7kbhkli.fsf@localhost.localdomain> <1106568719.19552.64.camel@diego.intern.de> <m3psztxns5.fsf@localhost.localdomain> <41F9CEF2.8000200@gmail.com> <41FBDCF0.7010401@bothner.com>
- Reply-to: Ranjit Mathew <rmathew at gmail dot com>
On Sat, 29 Jan 2005 10:58:56 -0800, Per Bothner <per@bothner.com> wrote:
> Ranjit Mathew wrote:
> > Instead of changing over to chaining, the current open
> > addressing implementation can perhaps be redeemed by
> > using a compacting remove(), eliminating the need for
> > tombstones.
> >
> > This is also described in Knuth's TAOCP Vol. 3 (Sorting
> > and Searching) and he shows that it doesn't impact
> > performance that much. See section 6.4, Algorithm R (IIRC).
>
> I don't have TAOCP (I bought it in grad school, but
> it disappeared ...), but I assume you're talking about
> some variant of the following:
>
> To remove an entry e at index i, after clearing tab[i]:
> j = next_empty_slot(i);
> for (int k = i;;)
> {
> k=(k+1)%tab.length;
> if (k == j) break;
> e = tab[k];
> tab[k] = null;
> if (e != null)
> quick_insert(e);
> }
>
> quick_insert(e) inserts e in the hash-table, but doesn't
> need to increment any count or check whether we need to resize,
> or whether the item already exists.
>
> The idea is we check if any following entries would get
> hashed differently due to the now-evailable empty slot.
> Assuming the table isn't too full that shouldn't take
> long.
Knuth's Algorithm R is a bit different (see pages 533,534
in the second edition of TAOCP Volume 3: "Sorting and
Searching", along with the _important_ errata for this
volume).
Suppose we have determined that the i-th
element in the table needs to go. Suppose the
table has a total of M slots. Then assuming we
use backward linear probing:
R1. Delete table[i]. Set j := i
R2. Decrement i. if (i < 0) then i := i + M
R3. If table[i] is empty, terminate. Otherwise, set
r := hashCode(key[i]). If (i <= r < j) or (r < j < i)
or (j < i <= r) (that is, if r is cyclically between
j and i), go to R2.
R4. Set table[j] := table[i]. Go to R1.
(Note: Without the errata, the last line has
"Go to R2" in the published volume, which
is wrong.)
FWIW, I implemented this (perhaps naively) for
the current IdentityHashMap implementation
and did not see any improvement in Jost's
posted test - if anything, I saw a slight
degradation in performance. (BTW, "gij"
fared far better than "java" from JDK 1.4.2_03
in Jost's testcase, with or without this.)
So for this particular test at least, Algorithm
R might not give us any improvement, though
for other tests, it might.
Note that for the case of repeated insertion/deletion
of the same key, we do reuse a tombstone slot as
soon as we determine that the key is not present.
We also expand and rehash stuff as soon as
we reach the acceptable threshold (0.75) of
total table capacity.
It does get rid of tombstones (for the key) and
*should* make the lookups faster, though I did
not write a benchmark test for this.
(I retained tombstones for the value slot to
determine if a key that otherwise existed has
been deleted.)
Ranjit.
--
Ranjit Mathew Email: rmathew AT gmail DOT com
Bangalore, INDIA. Web: http://ranjitmathew.hostingzero.com/