This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.


Debugging "Leaks" With Boehm-GC


Hello Everyone:

We are currently experiencing some dire difficulties with one of GCJ's most opaque aspects - garbage collection.

The application exhibiting "leaks" is an XML browser of our own creation. The use case that "leaks" is one where our browser software works its way through interpreting some page elements and then throws an "event". The event is implemented by throwing a Java exception, which something further up the stack then catches. The software works beautifully, except it "leaks".

I am using quotes around "leaks" because I do not think this problem is a garbage collection fault. I believe it's a problem caused by our application. The test case that shows the "leaking" is a VXML browser test which repeatedly preprocesses and then interprets the same cached page. What we can see with GC_STDOUT/GC_PRINT_STATS enabled is that Boehm GC is unable to garbage-collect instances of classes from our own browser implementation that it SHOULD BE ABLE to collect. The inescapable conclusion is that our Java XML browser implementation, although straightforward, well-written, clean, etc., is somehow able to "fake out" Boehm GC, making it think that class instances are still in use although they actually are not.

We have inferred through GC statistics that these browser class instances are accumulating and are not deallocated, because the GC heap grows without bound, as does the time that GC takes to "mark". We believe these boundless increases of space/time for GC mean that it's encountering CRAPLOADS of our browser class instances that it tries to mark for deletion but determines it cannot, traversing these "baby bird" objects over and over again, as they continue to multiply.
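As a minimal sketch of how the heap growth described above could also be measured from inside the Java test itself, the following samples the used heap after each iteration. The class `HeapTrend`, its method names, and the stand-in workload `runOneIteration` are hypothetical and not part of the browser; only `Runtime.totalMemory()`, `Runtime.freeMemory()` and `System.gc()` are standard library calls. Since `System.gc()` is only a request to the collector, the figures should be read as approximate.

```java
// Hypothetical helper (not part of the browser): records the used heap after
// each test iteration so growth is visible without reading GC log output.
public class HeapTrend {

    // Request a collection, then report the used heap in bytes.
    // System.gc() is only a hint, so the result is approximate.
    public static long usedAfterGc() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // True if there are at least two samples and each one is strictly
    // larger than the one before it.
    public static boolean growsMonotonically(long[] samples) {
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] <= samples[i - 1]) {
                return false;
            }
        }
        return samples.length > 1;
    }

    // Stand-in workload (assumption); replace with one preprocess-and-interpret
    // pass over the cached page.
    static void runOneIteration() {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < 1000; i++) {
            sb.append(i);
        }
    }

    public static void main(String[] args) {
        int iterations = 10;
        long[] samples = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            runOneIteration();
            samples[i] = usedAfterGc();
            System.out.println("iteration " + i + ": " + samples[i] + " bytes in use");
        }
        System.out.println("monotonic growth: " + growsMonotonically(samples));
    }
}
```

A used-heap figure that keeps climbing across many iterations of the same cached page would be consistent with objects being retained; a figure that levels off would point toward ordinary warm-up allocation instead.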

This problem is giving me fits because both sides of the equation are so opaque. On one side, you have our reasonably complicated VXML browser implementation, whose allocation patterns are rather hard to understand. On the other side, you have Boehm GC, which is truly opaque, a sort of "glorified malloc without a free". I am caught in the middle, seemingly without anything to look at.

I am open to any suggestions from the GCJ Elite, but it seems like what I need to do next is to come up with at least an INKLING of what is actually being leaked. The easiest way to do that, it would seem, would be to have GCJ tell me what objects it can't deallocate, after initial startup stuff is done.
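One application-level way to approximate "which objects are not being deallocated" is to register suspect instances with weak references and count how many are still uncleared after a collection. The sketch below is only an illustration under stated assumptions: `SurvivorCounter` and its methods are hypothetical names, the runtime is assumed to provide `java.lang.ref.WeakReference`, and the compiler is assumed to accept generics (on a compiler without generics, drop the type parameters). A survivor here only means the collector did not reclaim the object; with a conservative collector that can also happen because a stale word on the stack looks like a pointer.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical diagnostic (not part of the browser or of GCJ): tracks objects
// through weak references and reports how many have not been collected.
public class SurvivorCounter {

    private final List<WeakReference<Object>> tracked =
            new ArrayList<WeakReference<Object>>();

    // Register an object; a weak reference does not keep it alive.
    public void track(Object o) {
        tracked.add(new WeakReference<Object>(o));
    }

    // Request a collection, then count tracked objects whose weak reference
    // has not been cleared. System.gc() is only a hint.
    public int countSurvivors() {
        System.gc();
        int alive = 0;
        for (Iterator<WeakReference<Object>> it = tracked.iterator(); it.hasNext();) {
            if (it.next().get() != null) {
                alive++;
            }
        }
        return alive;
    }

    public static void main(String[] args) {
        SurvivorCounter counter = new SurvivorCounter();
        Object kept = new Object();      // strongly referenced below
        counter.track(kept);
        counter.track(new Object());     // no strong reference remains
        System.out.println("survivors: " + counter.countSurvivors());
        // Use 'kept' afterwards so it stays reachable during the collection.
        System.out.println("kept hash: " + kept.hashCode());
    }
}
```

Calling `track` from the constructors of a few suspect browser classes and printing `countSurvivors()` after each test iteration would show which of them accumulate, at the cost of the tracking list itself growing by one small reference object per instance.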

Is there some way to do this? I need to know everything there is to know about "GCJ leak detection", and I need to know it yesterday. Someone - anyone - please give me some clues about what to learn and how to approach this. This problem is truly killing us, and I am going to have to move aggressively to fix it. Documentation, Boehm GC debugging tips, slaps in the face, anything at all would be appreciated.

I should mention that we've tried GCJ 3.3, 3.3.2, 3.3.6, and 4.0.2 on arm-wince-pe, ARM-Linux and X86 Linux platforms, and the GC behavior running this failing testcase is IDENTICAL across all platforms/versions.

In closing, let me reiterate: I do not think that this is a "GC problem". I think the only way we'll ever fix this is by making our application run in such a way that it does not create objects that Boehm thinks are ineligible for collection. What I need for the moment is just some ideas on how to go about debugging this kind of problem.

Thanks a million - in advance.

Best Regards,
craig vanderborgh
voxware incorporated

