This is the mail archive of the
java-patches@gcc.gnu.org
mailing list for the Java project.
patch Java string interning problems
- To: java-patches at gcc dot gnu dot org
- Subject: patch Java string interning problems
- From: Per Bothner <per at bothner dot com>
- Date: 08 Apr 2001 19:52:28 -0700
This seems to boostrap and pass the tests, but there is a Kawa regression
which may or may not be related. I'd like to understand/fix that before
I check this in; once I do, it should go in both the branch and the trunk.
The first patch is the critical one - _Jv_NewStringUtf8Const was entering
a String into strhash (the intern table) without register a finalizer.
Oops.
The second patch fixes a performance bug - the chaining was using the
same low-order bits of the hash code as the primary index. This means
that all strings that mapped into the same initial slot in strhash
would step through the same sequence of alternatives slots, which leads
to the bunching we wanted to avoid. Another ooops. Using the high-order
bits seems to make sense.
Finally, it seemed better to rehash when 2/3 full rather than waiting
to rehash until we are 3/4 full. This hashing scheme works best when
the table is somewhat sparse, and we can afford to spend a little more
space, given that we don't have the overhead of chaining with links.
If someone feels inspired to do some measurements to see how these fixes
perform (i.e. how much probing is done), that would be really cool.
2001-04-08 Per Bothner <per@bothner.com>
* java/lang/natString.cc (_Jv_NewStringUtf8Const): Register finalizer.
* java/lang/natString.cc (_Jv_StringFindSlot, rehash): Use high-order
bits of hash to calculate step for chaining.
* java/lang/natString.cc (intern, _Jv_NewStringUtf8Const): Rehash
when 2/3 full, rather than 3/4 full.
Index: natString.cc
===================================================================
RCS file: /cvs/gcc/gcc/libjava/java/lang/natString.cc,v
retrieving revision 1.16.4.2
diff -u -p -r1.16.4.2 natString.cc
--- natString.cc 2001/04/01 21:49:48 1.16.4.2
+++ natString.cc 2001/04/08 18:10:16
@@ -62,7 +62,7 @@ _Jv_StringFindSlot (jchar* data, jint le
int index = start_index;
/* step must be non-zero, and relatively prime with strhash_size. */
- int step = 8 * hash + 7;
+ jint step = (hash ^ (hash >> 16)) | 1;
for (;;)
{
jstring* ptr = &strhash[index];
@@ -145,7 +145,7 @@ java::lang::String::rehash()
jstring val = (jstring) UNMASK_PTR (*ptr);
jint hash = val->hashCode();
jint index = hash & (nsize - 1);
- jint step = 8 * hash + 7;
+ jint step = (hash ^ (hash >> 16)) | 1;
for (;;)
{
if (next[index] == NULL)
@@ -166,7 +166,7 @@ jstring
java::lang::String::intern()
{
JvSynchronize sync (&StringClass);
- if (4 * strhash_count >= 3 * strhash_size)
+ if (3 * strhash_count >= 2 * strhash_size)
rehash();
jstring* ptr = _Jv_StringGetSlot(this);
if (*ptr != NULL && *ptr != DELETED_STRING)
@@ -270,7 +270,7 @@ _Jv_NewStringUtf8Const (Utf8Const* str)
chrs -= length;
JvSynchronize sync (&StringClass);
- if (4 * strhash_count >= 3 * strhash_size)
+ if (3 * strhash_count >= 2 * strhash_size)
java::lang::String::rehash();
int hash = str->hash;
jstring* ptr = _Jv_StringFindSlot (chrs, length, hash);
@@ -285,6 +285,8 @@ _Jv_NewStringUtf8Const (Utf8Const* str)
}
*ptr = jstr;
SET_STRING_IS_INTERNED(jstr);
+ // When string is GC'd, clear the slot in the hash table.
+ _Jv_RegisterFinalizer ((void *) str, unintern);
return jstr;
}
--
--Per Bothner
per@bothner.com http://www.bothner.com/~per/