This is the mail archive of the java-patches@gcc.gnu.org mailing list for the Java project.



Re: patch Java string interning problems


Per Bothner <per@bothner.com> writes:

> This seems to bootstrap and pass the tests, but there is a Kawa regression
> which may or may not be related.  I'd like to understand/fix that before
> I check this in; once I do, it should go in both the branch and the trunk.

Yes, it did turn out to be related:  _Jv_NewStringUtf8Const was taking the
16-bit hashcode from the Utf8Const and passing it to _Jv_StringFindSlot.
Previously that was usually OK, since _Jv_StringFindSlot only used the
log2(strhash_size) low-order bits of the hash.  However, there was a
potential problem if the strhash table grew large enough to use more than
16 bits of hash.  Now that _Jv_StringFindSlot and rehash use the
high-order bits to set the step value for successive probes, a mismatch
between the 16-bit hash and the full hash became much more likely to
break things.  And it did, in Kawa.
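
To make the failure mode concrete, here is a minimal standalone sketch (not
the libgcj code).  It assumes the Utf8Const hash is simply the Java-style
31*h + ch hash cut down to its low 16 bits, and it uses unsigned arithmetic
where the patch uses jint.  The helper names full_hash, truncated_hash,
probe_step and start_index are hypothetical.

```cpp
#include <cstdint>
#include <string>

// Java-style string hash: h = 31*h + ch, wrapping in 32 bits.
std::uint32_t full_hash(const std::u16string& s) {
    std::uint32_t h = 0;
    for (char16_t ch : s)
        h = 31u * h + ch;
    return h;
}

// Assumption: the constant-pool hash is the same value with only
// the low 16 bits kept.
std::uint32_t truncated_hash(const std::u16string& s) {
    return full_hash(s) & 0xFFFFu;
}

// Step in the style of the patch: fold the high half into the low
// half, then force the result to be odd.
std::uint32_t probe_step(std::uint32_t hash) {
    return (hash ^ (hash >> 16)) | 1u;
}

// Starting slot for a power-of-two table size.
std::uint32_t start_index(std::uint32_t hash, std::uint32_t table_size) {
    return hash & (table_size - 1);
}
```

For a table of 1024 slots, the full and truncated hashes of "hello" give the
same starting slot but different steps, so a lookup keyed on one hash can walk
a different probe sequence than an insert keyed on the other.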

I've checked this into both the trunk and the branch.  I've tested the
branch (bootstrap + make check in libjava + make all check in Kawa).

I'm still unable to check the trunk because I *still* can't build the
trunk - gcc crashes while compiling libstdc++-v3.

2001-04-08  Per Bothner  <per@bothner.com>

	* java/lang/natString.cc (_Jv_NewStringUtf8Const):  Register finalizer.
	Recalculate hash, since Utf8Const's hash is only 16 bits.

	* java/lang/natString.cc (_Jv_StringFindSlot, rehash):  Use high-order
	bits of hash to calculate step for chaining.

	* java/lang/natString.cc (intern, _Jv_NewStringUtf8Const):  Rehash
	when 2/3 full, rather than 3/4 full.
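
As an aside on the step calculation: both the old step (8 * hash + 7) and the
new one (... | 1) are odd, which keeps the step relatively prime to a
power-of-two table size.  The sketch below, with a hypothetical helper
slots_visited and unsigned arithmetic assumed for simplicity, counts how many
distinct slots a probe sequence reaches; it is an illustration, not code from
natString.cc.

```cpp
#include <cstdint>
#include <vector>

// Count the distinct slots reached by table_size probes, starting at
// `start` and advancing by `step` each time.  table_size must be a
// power of two.
std::uint32_t slots_visited(std::uint32_t start, std::uint32_t step,
                            std::uint32_t table_size) {
    std::vector<bool> seen(table_size, false);
    std::uint32_t index = start & (table_size - 1);
    std::uint32_t count = 0;
    for (std::uint32_t i = 0; i < table_size; ++i) {
        if (!seen[index]) {
            seen[index] = true;
            ++count;
        }
        index = (index + step) & (table_size - 1);
    }
    return count;
}
```

An odd step such as 7 reaches all 16 slots of a 16-entry table, while an even
step such as 6 reaches only 8 of them, so a probe with an even step could miss
a free slot that exists.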

Index: java/lang/natString.cc
===================================================================
RCS file: /cvs/gcc/gcc/libjava/java/lang/natString.cc,v
retrieving revision 1.16.4.2
diff -u -p -r1.16.4.2 natString.cc
--- natString.cc	2001/04/01 21:49:48	1.16.4.2
+++ natString.cc	2001/04/10 22:00:17
@@ -62,7 +62,7 @@ _Jv_StringFindSlot (jchar* data, jint le
 
   int index = start_index;
   /* step must be non-zero, and relatively prime with strhash_size. */
-  int step = 8 * hash + 7;
+  jint step = (hash ^ (hash >> 16)) | 1;
   for (;;)
     {
       jstring* ptr = &strhash[index];
@@ -145,7 +145,7 @@ java::lang::String::rehash()
 	  jstring val = (jstring) UNMASK_PTR (*ptr);
 	  jint hash = val->hashCode();
 	  jint index = hash & (nsize - 1);
-	  jint step = 8 * hash + 7;
+	  jint step = (hash ^ (hash >> 16)) | 1;
 	  for (;;)
 	    {
 	      if (next[index] == NULL)
@@ -166,7 +166,7 @@ jstring
 java::lang::String::intern()
 {
   JvSynchronize sync (&StringClass);
-  if (4 * strhash_count >= 3 * strhash_size)
+  if (3 * strhash_count >= 2 * strhash_size)
     rehash();
   jstring* ptr = _Jv_StringGetSlot(this);
   if (*ptr != NULL && *ptr != DELETED_STRING)
@@ -265,14 +265,18 @@ _Jv_NewStringUtf8Const (Utf8Const* str)
       chrs = JvGetStringChars(jstr);
     }
 
+  jint hash = 0;
   while (data < limit)
-    *chrs++ = UTF8_GET(data, limit);
+    {
+      jchar ch = UTF8_GET(data, limit);
+      hash = (31 * hash) + ch;
+      *chrs++ = ch;
+    }
   chrs -= length;
 
   JvSynchronize sync (&StringClass);
-  if (4 * strhash_count >= 3 * strhash_size)
+  if (3 * strhash_count >= 2 * strhash_size)
     java::lang::String::rehash();
-  int hash = str->hash;
   jstring* ptr = _Jv_StringFindSlot (chrs, length, hash);
   if (*ptr != NULL && *ptr != DELETED_STRING)
     return (jstring) UNMASK_PTR (*ptr);
@@ -285,6 +289,8 @@ _Jv_NewStringUtf8Const (Utf8Const* str)
     }
   *ptr = jstr;
   SET_STRING_IS_INTERNED(jstr);
+  // When string is GC'd, clear the slot in the hash table.
+  _Jv_RegisterFinalizer ((void *) jstr, unintern);
   return jstr;
 }
 

-- 
	--Per Bothner
per@bothner.com   http://www.bothner.com/~per/

