This is the mail archive of the
java@gcc.gnu.org
mailing list for the Java project.
Re: String hashCode
- To: Per Bothner <per at bothner dot com>
- Subject: Re: String hashCode
- From: Jeff Sturm <jeff dot sturm at commerceone dot com>
- Date: Wed, 31 Jan 2001 19:38:02 -0500
- CC: Bryce McKinlay <bryce at albatross dot co dot nz>, java at gcc dot gnu dot org
- Organization: Commerce One
- References: <3A789EDF.50A39A6C@albatross.co.nz> <m2puh3z39r.fsf@kelso.bothner.com>
Per Bothner wrote:
> Bryce McKinlay <bryce@albatross.co.nz> writes:
> > This would speed up String.equals() by being able to return false
> > quickly in the common case, as well as speeding up things like
> > Hashtables of Strings and string interning.
>
> Its a reasonable thing to do - but there are some costs.
The interpreter benchmarks[*] I've run on libgcj exercise String and Hashtable a
great deal. I have /never/ seen String.hashCode or String.equals rise above my
measurement threshold of 1%. I have no doubt there is code in which
String.equals is a bottleneck, but I suspect it is rare.
Keep in mind that java.lang.String exhibits good spatial locality since the
String object is allocated consecutively with its char[] buffer. Small String
objects may even fit in one cache block. Once you take away the possibility of
cache misses, the actual hashCode computation will be very fast on any modern
(>500MHz) processor.
One alternative could be optimizing the collection classes instead, since that
is often where String.equals is called. For example, a HashMap could store the
hashCode for each of its entries, since the hashCode is always computed when a
new entry is created anyhow. There is still the tradeoff of time vs. space.
[*] - Here is a sample of cycle counts from my Jacl tests. Hashtable.get does
appear, though not a single String method. Interestingly, overall time is
almost perfectly balanced between the application, libgcj and libgcjgc:
Sample Image Total
Name Count Pct Pct
---- ----- --- ---
/home/jsturm/bin/jaclsh 5469 2.5 *
eh_context_specific 3210 58.7 1.4
/opt/gcj/lib/libgcjgc.so.1.0.1 61535 27.6 *************
GC_malloc 13089 21.3 5.9 **
GC_mark_from_mark_stack 36383 59.1 16.3 ********
GC_reclaim_clear 3574 5.8 1.6
GC_reclaim_clear4 2726 4.4 1.2
/home/jsturm/lib/libjacl.so 62228 27.9 *************
_ZN3tcl4lang10Expression7ExprLexEPNS0_ 3757 6.0 1.7
_ZN3tcl4lang6Parser12parseCommandEPNS0 6632 10.7 3.0 *
_ZN3tcl4lang6Parser10evalTokensEPNS0_6 3421 5.5 1.5
_ZN3tcl4lang6Parser5eval2EPNS0_6Interp 4789 7.7 2.1 *
_ZN3tcl4lang6Parser8parseVarEPNS0_6Int 2741 4.4 1.2
_ZN3tcl4lang6Parser11parseTokensEP6JAr 2238 3.6 1.0
_ZN3tcl4lang8TclParse8getTokenEi 4342 7.0 1.9
_ZN3tcl4lang8TclParse7releaseEv 2565 4.1 1.2
_ZN3tcl4lang3Var6getVarEPNS0_6InterpEP 3989 6.4 1.8
_ZN3tcl4lang3Var6setVarEPNS0_6InterpEP 2489 4.0 1.1
/opt/gcj/lib/libgcj.so.1.0.0 62114 27.8 *************
_Jv_AllocObject 6423 10.3 2.9 *
_Jv_NewPrimArray 2710 4.4 1.2
_Z12table_searchPKA2_wiw 4011 6.5 1.8
_Jv_CheckArrayStore 6169 9.9 2.8 *
_Jv_InitClass 3740 6.0 1.7
_ZN4java4util9Hashtable3getEPNS_4lang6 3147 5.1 1.4
_Z12_Jv_AllocObjiPN4java4lang5ClassE 3385 5.4 1.5
/lib/libpthread-0.8.so 7573 3.4 *
__pthread_getspecific 3301 43.6 1.5
/lib/libc-2.1.3.so 18084 8.1 ****
memset_loop 10212 56.5 4.6 **
__divqu 5037 27.9 2.3 *
/lib/libnss_files-2.1.3.so 4787 2.1 *
__do_global_ctors_aux 4785 100.0 2.1 *
--
Jeff Sturm
jeff.sturm@commerceone.com