Java Collections are generally expected to be safe in the presence of concurrent reads.
However, many operations that read a WeakHashMap process the reference queue as part of the operation, and may thus delete elements from the map. As a result, if two threads concurrently call e.g. size(), both may end up deleting elements from the WeakHashMap at the same time, resulting in a damaged data structure.
(This is based on code inspection; the failure has not been observed. But an actual failure would be irreproducible, and hence would probably not result in a bug report.)
An off-line discussion concluded that the right way to fix this is probably to include enough internal locking to ensure that two concurrent readers cannot interfere.
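Until the map locks internally, a caller can get the same guarantee from outside by wrapping it. The following is only a sketch of that caller-side workaround, not the internal fix discussed above: Collections.synchronizedMap is the standard JDK wrapper, while the class name GuardedWeakMapSketch and the helper readConcurrently are made up for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

public class GuardedWeakMapSketch {
    // Every call on the returned map, including "read-only" ones such as
    // size() and get(), runs while holding the wrapper's monitor, so two
    // readers can never process the reference queue at the same time.
    static <K, V> Map<K, V> newGuardedWeakMap() {
        return Collections.synchronizedMap(new WeakHashMap<K, V>());
    }

    // Hypothetical driver: several threads call size() and get() concurrently.
    // Returns the size observed after all readers have finished.
    static int readConcurrently(final Map<Object, String> map,
                                final Object[] keys, int threadCount) {
        Thread[] readers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            readers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    map.size();                      // may expunge stale entries
                    map.get(keys[i % keys.length]);  // may expunge stale entries
                }
            });
            readers[t].start();
        }
        for (Thread reader : readers) {
            try {
                reader.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return map.size();
    }
}
```

The cost is that all readers are serialized, which is presumably why finer-grained internal locking was preferred in the discussion.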
A corresponding bug report (6425537) was filed against the Sun JVM, based on the same discussion.
I wonder if this is related to PR 18187.
Created attachment 12515
The test case that passes
I wrote a simple test case in which 100 threads read from the same weak hash map while the size of the map gradually decreases as entries are garbage collected. With both Sun's implementation and ours, the test appears to pass (either null or the correct entry is returned). Hence I cannot reproduce this bug and would suggest closing it as unreproducible.
I strongly disagree with closing this. This is a threading bug. It's nasty precisely because it is not systematically reproducible. That's no reason to close it.
The problem is obviously still there. Various readers call cleanQueue, which calls internalRemove, which updates the data structure, all without synchronization.
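To make the call chain concrete, here is a deliberately tiny toy model. It is not the real WeakHashMap code: only the method names cleanQueue and internalRemove are taken from the description above, and everything else (the class ToyWeakSet, its fields, the add method) is invented for illustration. It shows a "read" method that writes to shared state with no locking.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Set;

// Toy model only; a hypothetical stand-in for the structure described above.
public class ToyWeakSet {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private final Set<Reference<?>> entries = new HashSet<>();
    private int size;

    // Registers a key through a weak reference tied to the queue.
    Reference<Object> add(Object key) {
        Reference<Object> ref = new WeakReference<>(key, queue);
        entries.add(ref);
        size++;
        return ref;
    }

    // Looks like a pure read, but drains the queue first.
    int size() {
        cleanQueue();
        return size;
    }

    // Removes every entry whose reference has been enqueued.
    private void cleanQueue() {
        Reference<?> stale;
        while ((stale = queue.poll()) != null) {
            internalRemove(stale);
        }
    }

    // Unsynchronized write to both the set and the counter. Two threads
    // inside size() at once can race here.
    private void internalRemove(Reference<?> stale) {
        if (entries.remove(stale)) {
            size--;
        }
    }
}
```

In this model, two threads calling size() at the same moment both mutate entries and size, which is the kind of interference the report is about.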
The test case may have failed to catch it either because it doesn't do a good job of testing for the kind of corruption that may occur here (lost deletions, size decrements), or because it was run on too few processors to make the failure likely, or because a failure in a hash table this large is probably unlikely anyway. A test case that actually reproduces an obscure threading bug like this is valuable; in my opinion, the fact that a test case doesn't fail doesn't mean much in cases like this.
It makes sense to close unreproducible bugs only if we can't track them down as a result. We already understand the problem here.
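A test aimed at this kind of corruption would check invariants rather than just individual lookups. The sketch below is one possible shape, under assumptions of my own: the class name WeakMapStressSketch, the stress method and its parameters are all hypothetical. It keeps strong references to a subset of keys, lets the rest become collectable, and has readers check that size() stays within bounds and that every surviving key is still mapped to its value. Passing a plain WeakHashMap is the interesting experiment, but a clean run there still proves nothing, for the reasons given above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class WeakMapStressSketch {
    // Returns the number of inconsistencies observed; 0 means none detected.
    static int stress(final Map<Object, Integer> map, final int total,
                      final int keepEvery, int threadCount) {
        final List<Object> survivors = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            Object key = new Object();
            map.put(key, i);
            if (i % keepEvery == 0) {
                survivors.add(key);  // strongly held, must never disappear
            }
        }
        final AtomicInteger problems = new AtomicInteger();
        Thread[] readers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            readers[t] = new Thread(() -> {
                try {
                    for (int round = 0; round < 200; round++) {
                        if (id == 0 && round % 50 == 0) {
                            System.gc();  // only a hint to the collector
                        }
                        int s = map.size();
                        if (s < survivors.size() || s > total) {
                            problems.incrementAndGet();  // bad size counter
                        }
                        for (Object key : survivors) {
                            if (map.get(key) == null) {
                                problems.incrementAndGet();  // lost live entry
                            }
                        }
                    }
                } catch (RuntimeException e) {
                    problems.incrementAndGet();  // structure threw while reading
                }
            });
            readers[t].start();
        }
        for (Thread reader : readers) {
            try {
                reader.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        // Quiescent check: survivor j was inserted with value j * keepEvery.
        for (int j = 0; j < survivors.size(); j++) {
            Integer v = map.get(survivors.get(j));
            if (v == null || v != j * keepEvery) {
                problems.incrementAndGet();
            }
        }
        return problems.get();
    }
}
```

Running it on a machine with many cores and with a small map would presumably raise the odds of hitting the race.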