GCJ 3.4.3 and 3.3 classloading problem
Fri Sep 11 16:48:00 GMT 2009
Hello Hans, thanks for helping us out with this.
> You mean the crash happens in the same GC cycle in which the heap is grown? Or in the next cycle after that? Either way it sounds strange to me. It may be that there is some large object allocation that causes the heap expansion and occurs near the failure. But I'm not sure.
Yes, this is what I'm trying to say.
> What happens if you set the GC_IGNORE_GCJ_INFO environment variable? Can you run far enough with GC_DONT_GC?
With GC_IGNORE_GCJ_INFO set, the application runs for about 3
garbage collection cycles, then crashes with a
STATUS_ILLEGAL_INSTRUCTION exception (0xc000001d). With GC_DONT_GC
set, the application runs for a long time without any trouble at all,
until it can't expand the heap anymore.
I should mention that I retrofitted boehm-gc 6.2.6 from our port of
gcj-3.3 to our port of gcj 3.4.3. Recall that gc 6.3.1 is the version
used "out of the box" for libgcj 3.4.1, 3.4.3, 3.4.4, 3.4.5, 3.4.6,
and for the 3.5 snapshots. The reason I did this was it seemed like a
lot less work if I could reuse our gc 6.2.6 port. But maybe this was
not the best choice, since both our gcj 3.3 and 3.4.3 ports exhibit
this problem (although to a much lesser degree in 3.3). And maybe
something was "lost in translation" when I made the needed changes to
put gc 6.2.6 into gcj 3.4.3. Regardless, gc 6.3.1 is probably the
"most tested" version in the gcj 3.x series.
It seems unlikely to me that this problem could have existed in the
more vanilla gcj 3.4.x series, so I am tempted to restore a port of gc
6.3.1 to my libgcj build and start there. Might this be the best way
to proceed? Can you remember if there are significant differences
between gc 6.2.6 and 6.3.1?
> In the end, this may require brute force debugging. Find the object that was corrupted/collected early, and then follow the chain of objects from a root checking which ones are marked, so that you can identify where a link wasn't followed correctly. This is unfortunately much easier if you can get the process to loop at the point of failure, so that you can call GC_is_marked() from the debugger.
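To make sure I understand the brute-force approach, here is a minimal
sketch of the walk in C. It is only a toy model, not boehm-gc code: the
struct node type, its marked and visited fields, MAX_CHILDREN and
find_broken_link() are all hypothetical names invented for illustration.
The marked field stands in for what GC_is_marked() would report in the
debugger, and the caller is assumed to pass a root that is itself marked.

```c
#include <stddef.h>

#define MAX_CHILDREN 4   /* hypothetical fan-out limit for the toy model */

/* Toy stand-in for a heap object; NOT a real boehm-gc structure. */
struct node {
    const char *name;
    int marked;                        /* stand-in for GC_is_marked(obj) */
    int visited;                       /* guards against cycles in the graph */
    struct node *child[MAX_CHILDREN];  /* outgoing pointers; NULL = unused */
};

/*
 * Depth-first walk starting at a marked object. Returns the first marked
 * object found that points to an unmarked one and stores that unmarked
 * child in *lost. Returns NULL if every object reachable from obj is marked.
 */
static struct node *find_broken_link(struct node *obj, struct node **lost)
{
    size_t i;

    if (obj == NULL || obj->visited)
        return NULL;
    obj->visited = 1;

    for (i = 0; i < MAX_CHILDREN; i++) {
        struct node *c = obj->child[i];
        struct node *hit;

        if (c == NULL)
            continue;
        if (!c->marked) {              /* the link obj -> c was not followed */
            *lost = c;
            return obj;
        }
        hit = find_broken_link(c, lost);
        if (hit != NULL)
            return hit;
    }
    return NULL;
}
```

In a real session the marked test would presumably be replaced by calling
GC_is_marked() on each object by hand from the debugger, and the returned
parent/child pair would be the place to start looking at how the parent's
layout is described to the collector.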