This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: std::collate hash() still broken


In a message dated 10/02/2002 04:33:17 a.m. Pacific Standard Time, pcarlini@unitus.it writes:
>Personally, I would like to see cited in a comment a >relevant piece of literature 

I pulled up this url.. the site is unfortunately dead :(  (it was one of the best of hashing on the web) but it is in Google's cache:
http://www.google.com/search?q=cache:lTJRondwlQgC:burtleburtle.net/bob/hash/doobs.html+hash+hashing+rotate&hl=en

It uses a form of the rotary hash using 4 bit rotate.. Scroll to the bottom for a nice comparison table.

I also checked Knuth3 and it didnt say much.. it mentioned a kind of multiplicative hash which is similar to this.. but unfortunately i think runs much slower.

I think the reason it has to be a rotary shift is that if it wasnt an even rotate, we would either:
1) discard bits: this is obviously bad.. "randomess" would be discarded for no apparent reason
2) compacting bits: this is also unfortunately somehow..
there is some stuff about it that i do not understand at: http://www.google.com/search?q=cache:wFLjFdON2xEC:burtleburtle.net/bob/hash/funnels.html+&hl=en
also, if bits are compacted, each bit is affecting one less bit per bit of input.  you want each bit of input to affect as many bits of output.

however, again, i know little of the real theory of this stuff.  on the other hand, the first url i pasted leads me to beleive the rotary hash function would be the most suitable for the collate facet: 1) it is reasonably uniform 2) it is very fast

Thanks,

AaronWL


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]