GC leaks debugging

Boehm, Hans hans.boehm@hp.com
Tue Apr 5 04:44:00 GMT 2011


I'm still concerned about the amount of blacklisting here.  Can you track down some of those messages or calls to GC_add_to_black_list_normal, and find out where those bogus pointer-like bit patterns are coming from?
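
As a rough illustration of what a "bogus pointer-like bit pattern" is, here is a minimal sketch in C. It is not the collector's code: the function name count_pointer_like_words and the heap bounds passed to it are hypothetical, chosen only for this example. It walks a region of static data word by word and counts the words whose value happens to fall inside an assumed heap address range, which is the kind of word a conservative scan has to treat as a possible pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the collector: count the words in
   [start, start + nwords) whose value lies inside [heap_lo, heap_hi).
   A conservative scan cannot tell such a word from a real pointer. */
size_t count_pointer_like_words(const uintptr_t *start, size_t nwords,
                                uintptr_t heap_lo, uintptr_t heap_hi)
{
    size_t hits = 0;
    for (size_t i = 0; i < nwords; i++) {
        uintptr_t w = start[i];
        /* Half-open range check: heap_hi itself is outside the heap. */
        if (w >= heap_lo && w < heap_hi)
            hits++;
    }
    return hits;
}
```

Under these assumptions, running such a count over a static data area and then looking at which words match would be one way to see where the pointer-like values come from.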

Is there any way to get the reflection information (and exception information? or was that fixed?) into read-only segments, so that the collector can know not to scan them?

Hans

> -----Original Message-----
> From: java-owner@gcc.gnu.org [mailto:java-owner@gcc.gnu.org] On Behalf
> Of Andrew Haley
> Sent: Monday, April 04, 2011 2:48 AM
> To: java@gcc.gnu.org
> Subject: Re: GC leaks debugging
> 
> On 04/04/2011 09:52 AM, Erik Groeneveld wrote:
> > On Mon, Apr 4, 2011 at 10:13 AM, Andrew Haley <aph@redhat.com> wrote:
> >> On 03/04/11 18:59, Erik Groeneveld wrote:
> >>> On Sun, Apr 3, 2011 at 7:14 PM, Erik Groeneveld <erik@cq2.nl> wrote:
> >>>> On Sat, Apr 2, 2011 at 11:38 AM, Erik Groeneveld <erik@cq2.nl> wrote:
> >>>>>
> >>>>>> Note that in the information you posted, the GC was scanning
> >>>>>> around 7.5MB of roots conservatively.  It might be worth
> >>>>>> checking what those regions are.
> >>> [...]
> >>>> So I am now off into JvCreateJavaVM,
> >>>
> >>> and I found that the 7.5 MB roots are the static data area of
> >>> libgcj itself.  The GC calls back -- the last arg being the size:
> >>>
> >>> _Jv_GC_has_static_roots(../gccinstall/lib/libgcj.so.12, 0xb704f000, 7544028)
> >>>
> >>> and since libgcj is in 'the store' (_Jv_print_gc_store() prints
> >>> "../gccinstall/lib/libgcj.so.12"), it tells the GC to scan its
> >>> static data area conservatively.
> >>>
> >>> As of yet I don't understand why this static area is so big, and
> >>> what could be on it, but when I lay myself to rest, the little gray
> >>> cells will sing to me (free after Hercule Poirot ;-).
> >>
> >> It'll mostly be introspection data.  Every class and every method
> >> has this, and it can get to be quite large.
> >
> > I saw the (old) patch of yours that moves static Java objects onto
> > the heap, so that they are no longer scanned conservatively, so I
> > couldn't think of anything else that would be in the static data area
> > of libgcj other than Java pointers into the object heap.  Now I see
> > that there is still a lot more data that must be scanned
> > conservatively, so couldn't there be similar problems as back then?
> > Couldn't it be an idea to try to move this introspection data to the
> > heap as well?
> 
> It's certainly possible, but you can't move all of it with our current
> design because it doesn't play nicely with CNI.  Therefore, a lot of
> libgcj is compiled with static introspection data that must be
> conservatively scanned.
> 
> >> I doubt it's the cause of
> >> your memory leak unless there's a bug elsewhere.
> >
> > Probably there is no clear bug, or clear leak, perhaps just a matter
> > of pushing the GC to the limits?
> 
> I doubt that very much.  This has come up several times in the past,
> and the problem has never been the garbage collector recognizing false
> positives.  It's almost certainly a real memory leak caused by a
> pointer somewhere not being nulled.
> 
> > Some code is running quite well for long times, other code isn't.  In
> > all cases, the heap grows very fast, with lots of blacklisting
> > messages, and sometimes the GC just seems to manage, sometimes it
> > doesn't and things explode while issuing the famous "need to
> > allocate large block etc" repeatedly.  From what Hans suggested and
> > from what I see in the logs, the GC is under very heavy stress,
> > right from the beginning.  It doesn't get a fair chance so to say.
> 
> I don't think so.
> 
> > My minimal program is now this:
> >
> > int main(int argc, char *argv[]) {
> >     _Jv_InitGC();
> > }
> >
> > It starts out with:
> >
> > roots: 7,072 kB
> > heap: 64 kB
> > free: 64 kB
> > blacklisted: 15/16
> > blacklist messages: 991
> >
> > Any real program produces so many blacklist messages that it hardly
> > runs.  I'd like to investigate this, or am I on the wrong track
> > completely?
> 
> I think you are.  The heap is small in this simple test case, so there
> are no real problems.
> 
> You need to find out what the real problem is.  Find just one of those
> "need to allocate large block" messages, and find out why it is being
> issued.  I suspect that there is an actual bug that is causing the
> explosion and it can be found.  Forget about 991 blacklist messages:
> not useful.
> 
> I'd have a look myself, but there is no way to duplicate your problem.
> 
> BTW, is this on a 32-bit or 64-bit platform?
> 
> Andrew.

