signals and sockets

Gladish, Jacob Jacob.Gladish@netapp.com
Wed Aug 4 20:59:00 GMT 2004



> -----Original Message-----
> From: Bryce McKinlay [mailto:mckinlay@redhat.com] 
> Sent: Wednesday, August 04, 2004 4:43 PM
> To: Gladish, Jacob
> Cc: java@gcc.gnu.org
> Subject: Re: signals and sockets
> 
> 
> Gladish, Jacob wrote:
> 
> >I've been trying to track down a bug in my code for a few 
> days now and 
> >I was looking at a pretty strange stack. It looked like the 
> >PlainSocketImpl::write would get as far as __libc_write, 
> enter a signal 
> >handler, segv, enter the segv signal handler, then recurse forever 
> >until the process crashed.
> >
> Sounds like write() is segfaulting for some reason, and the segv is 
> being caught by libgcj's segv handler, which converts segv's into 
> NullPointerExceptions.


From this stack trace, it looks to me as though write() gets a SIGPIPE in
frame #9173, then pthread_sighandler segfaults trying to call the signal
handler, which sets off the segv handler in the runtime.

#9164 0x2ac5106e in uw_frame_state_for (context=0x797fef94, fs=0x797feed4) at ../../gcc/gcc/unwind-dw2.c:939
#9165 0x2ac51768 in _Unwind_RaiseException (exc=0x429b75a0) at ../../gcc/gcc/unwind.inc:95
#9166 0x2afd4ab2 in _Jv_Throw (value=0x4008fc30) at ../../../gcc/libjava/exception.cc:100
#9167 0x2afc698a in _Jv_ThrowSignal (throwable=0x4008fc30) at ../../../gcc/libjava/prims.cc:152
#9168 0x2afc69c2 in catch_segv (_dummy=11) at ../../../gcc/libjava/prims.cc:162
#9169 <signal handler called>
#9170 0x00000003 in ?? ()
#9171 0x2b19db2e in pthread_sighandler (signo=13, ctx=
             {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43, __dsh = 0, edi = 157,
              esi = 1095844640, ebp = 2038429524, esp = 2038429476, ebx = 157, edx = 200, ecx = 1095844640,
              eax = 4294967264, trapno = 0, err = 0, eip = 727775108, cs = 35, __csh = 0, eflags = 530,
              esp_at_signal = 2038429476, ss = 43, __ssh = 0, fpstate = 0x797ff4a8, oldmask = 2147483648, cr2 = 0}) at signals.c:97
#9172 <signal handler called>
#9173 0x2b60f784 in __libc_write () from /lib/libc.so.6
#9174 0x2b19f730 in write (fd=157, buf=0x41514320, n=200) at wrapsyscall.c:178


> 
> > It was pretty confusing because the runtime sets an
> >ignore on SIGPIPE and the args going into the write looked fine from 
> >gdb. After some thought, I realized that since the app uses 
> a good deal 
> >of native (CNI) code, it's possible someone else is setting a 
> >sighandler for SIGPIPE. It turns out that the syslog call 
> will do just 
> >that in glibc. It installs a temporary sighandler for PIPE, then 
> >restores whatever was there before returning to the caller. 
> So there's 
> >a race condition that exists between a thread writing to a socket in 
> >the java runtime and a thread calling syslog. The only problem I see 
> >here is that the socket writing thread may inadvertently close the 
> >syslog connection, which would then be re-established the 
> next time someone calls syslog.
> >
> >Would it be safer to use the send(...) call instead of the 
> write(...) 
> >in the socket code?
> >
> I don't think so. send() can potentially generate a SIGPIPE just like 
> write(), according to my glibc docs.
> 

The send() manpage mentions a MSG_NOSIGNAL option. I was under the
impression that it pretty much meant no SIGPIPEs. It's not very clear
whether that's the case, or whether there are other cases, not covered
by the option, that may still generate the signal.


> >I still have not discovered why the application has crashed. 
> But here's 
> >a snippet of the stack. This is 3.3.1
> >
> >-jake
> >
> >
> >(gdb) bt -40
> >#9158 0x2ac5106e in uw_frame_state_for (context=0x797fea3c,
> >fs=0x797fe97c) at ../../gcc/gcc/unwind-dw2.c:939
> >  
> >
> 
> It looks like it is crashing in the libgcc unwinder itself (trying to 
> throw the NullPointerException), hence the infinite 
> recursion. A problem 
> with the dwarf2 unwind info on this platform, perhaps? Do the various 
> exceptions tests in the libjava test suite run ok for you?
> 
> Regards

I'll run the test suite and see if that produces anything strange.


> 
> Bryce
> 
> 


