This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.



Re: segfault in sysdep/i386/backtrace.h


Marco Trudel wrote:
Andrew Haley wrote:
Marco Trudel writes:
> Marco Trudel wrote:
> > Andrew Haley wrote:
> >> Marco Trudel writes:
> >> > The segfault happens on reading scan_bytes[x]. I assume that there is no "pushl %ebp; movl %esp, %ebp" function prologue in certain cases and thus we go reading protected areas below the function.
> >>
> >> Why don't you have a look, and tell us what is there?
> > Because I don't know how and what these hex values mean (how to interpret them) when doing the backtrace...
> > Ok, learnt it...
> The problem is that the code assumes that there is always a "pushl %ebp; movl %esp, %ebp" function prologue. But, from [1]: "Note that many compilers can optimize these standard sequences away when not needed (often called "no stackframe generation")".
>
> So, when turning on maximum optimization in Microsoft Visual C++, there are no longer "pushl %ebp; movl %esp, %ebp" intros and thus we run into trouble (tried it). I don't know if GCC can do that too... Can it?


It can.

> I checked a couple of DLLs (awt.dll, swt.dll, aBluetoothLib.dll) I had around and they all miss the intro in at least a couple of functions.
>
> So, questions:
> - Is this a sjlj-exception only problem?


Yes.

> Can DW EH do that better?

Yes.

> - Is there another way to reliably recognize the start of a function? I assume this only affects native libs since Java compiled apps will always have the intro?!

Yes. We tell gcj not to optimize away the frame generation.

We either have to write a ton of heuristics to figure this stuff out
or fix DWARF / SEH in Windows.
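To make the failure mode concrete, here is a minimal sketch of the kind of prologue scan being discussed. It is not the libgcj code: it works on a byte buffer instead of raw addresses, and the names is_prologue and find_prologue are made up for illustration. It looks backwards for the byte encoding of "pushl %ebp; movl %esp, %ebp" (0x55 0x89 0xE5, or 0x55 0x8B 0xEC, which is the other valid encoding of the same move).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Two byte encodings of "pushl %ebp; movl %esp, %ebp" on i386.
// The register-to-register move can be assembled either way.
static const std::uint8_t kPrologueA[3] = {0x55, 0x89, 0xE5};
static const std::uint8_t kPrologueB[3] = {0x55, 0x8B, 0xEC};

// True if the three bytes at p match one of the encodings above.
static bool is_prologue(const std::uint8_t *p) {
  bool a = true, b = true;
  for (int i = 0; i < 3; ++i) {
    a = a && (p[i] == kPrologueA[i]);
    b = b && (p[i] == kPrologueB[i]);
  }
  return a || b;
}

// Scan backwards in buf (len bytes) starting near offset `from`,
// stepping back at most max_back bytes from the first offset tested.
// Returns the offset of the nearest prologue, or -1 if none is found.
// (Hypothetical helper, not part of libgcj.)
static long find_prologue(const std::uint8_t *buf, std::size_t len,
                          std::size_t from, std::size_t max_back) {
  assert(from <= len);
  if (len < 3) return -1;
  std::size_t pos = from;
  if (pos > len - 3) pos = len - 3;  // keep the 3-byte read inside buf
  std::size_t stop = (pos > max_back) ? pos - max_back : 0;
  for (;;) {
    if (is_prologue(buf + pos)) return static_cast<long>(pos);
    if (pos == stop) break;
    --pos;
  }
  return -1;
}
```

If a function was compiled without a frame pointer, there is nothing for such a scan to find. A scan over raw memory, with no buffer bounds to stop it, just keeps going until it matches some unrelated bytes or touches memory it may not read.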

Well, I think we should go for DWARF. The last I heard from Danny was that it already worked but was then broken again for building gcc. I have not had an answer from him since.
So, in the meantime we have two options for mingw:
1. Tell users to only use DLLs with the entry sequences.
2. Fix gcj to not rely on them.

Well, actually we have another option. I forgot the most obvious and (IMHO) elegant one:
If the entry point is always there in our code but libraries can't be trusted, why not just ignore libs?
We can easily differentiate our code from libraries because they start in a defined region in the memory image. For Linux this is 0x40000000 and for mingw, this is 0x1000000 (one 0 less)*.
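As a rough sketch of the idea (not the patch itself): classify each return address by comparing it against an assumed library base, and only attempt the prologue scan for addresses below it. The helper names and the sample addresses are made up for illustration, and the boundary value is the unverified mingw figure from above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed boundary: addresses at or above this are treated as library
// code. 0x1000000 is the mingw value proposed in this message and has
// not been verified; other systems would need a different value.
static const std::uintptr_t kAssumedLibraryBase = 0x1000000;

// True if addr falls into the region assumed to hold libraries.
static bool in_assumed_library_region(std::uintptr_t addr,
                                      std::uintptr_t library_base) {
  return addr >= library_base;
}

// Counts the frames for which a prologue scan would still be
// attempted, i.e. those whose return address is below the boundary.
static std::size_t count_scannable(const std::uintptr_t *ret_addrs,
                                   std::size_t n,
                                   std::uintptr_t library_base) {
  assert(ret_addrs != nullptr || n == 0);
  std::size_t count = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (!in_assumed_library_region(ret_addrs[i], library_base))
      ++count;
  }
  return count;
}
```

A single threshold only works if the executable's own code always sits below the boundary and every library sits above it. Whether that holds in general is exactly the open question here.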


I attached a patch that seems to do that for mingw. My application works again with it. But I'm not an expert on this topic, so I could be completely wrong or not know of cases where it might fail.

Also, JNI calls will go through their stub anyway, right? So the trace should still be correct because it contains the stub?!
What about CNI? How does it work?



* Are these values correct? Are there other operating systems to consider? The Linux value is from a computer systems book and the mingw one from [1]. It looks suspicious that they start so much earlier on mingw than on Linux. But my GDBing backs that up.



Marco



[1] http://www.swox.com/list-archives/gmp-discuss/2005-September/001827.html
Index: sysdep/i386/backtrace.h
===================================================================
--- sysdep/i386/backtrace.h	(revision 122072)
+++ sysdep/i386/backtrace.h	(working copy)
@@ -79,6 +79,17 @@
          information based exception handling.  */
       ctx.meth_addr = (_Jv_uintptr_t)NULL;
       _Jv_uintptr_t scan_addr = (ctx.ret_addr & 0xFFFFFFFE) - 2;
+      
+      /* Make a difference between libraries and our own code. Libraries can't
+         be trusted to have the "pushl %ebp; movl %esp, %ebp" function entry
+         point. Since libraries are mapped to a specific region, we can know
+         when we run into one and can just skip it. */
+      if(scan_addr >= 0x1000000)
+      	continue;
+      //FIXME: Is this a safe attempt? Also this is only for mingw.
+      // Other OS can or have to use SJLJ EH too.
+      // What to do for them? Linux should be 0x40000000, right?
+
       _Jv_uintptr_t limit_addr
         = (scan_addr > 1024 * 1024) ? (scan_addr - 1024 * 1024) : 2;
       for ( ; scan_addr >= limit_addr; scan_addr -= 2)
