This is the mail archive of the java-patches@gcc.gnu.org mailing list for the Java project.



Re: RFA PATCH: Fix PosixProcess some more...


Bryce McKinlay wrote:
> Hi David,
> 
> Nice catch. The GC uses signals internally to suspend other threads 
> while the GC is running (see boehm-gc/pthread_stop_world.c).  The 
> symptom you describe would happen if one thread did not enter its signal 
> handler because that signal was blocked at the time the GC sent it. I 
> guess this happens because sigwait() blocks any signals that aren't in 
> the set you pass to it, and/or doesn't run their regular handlers.

It is unclear to me what sigwait() is supposed to do, as it behaves
differently for me on different platforms.  My Red Hat Linux man
pages specifically say that it does not run the signal handlers.  This
is a problem for boehm-gc, as it needs the signal handler to be run.

My first version of the patch (which I didn't post) did a
kill(thisthread, whateversignal) after the sigwait() so that the GC's
signal handler would get run.  However, this technique resulted in other
failures on the NPTL system.  Hence the sigsuspend() version, which
seems to work on both linuxthreads and NPTL.
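
To illustrate why sigsuspend() suits the GC here, this is a minimal
sketch (not code from the patch) showing that a signal left pending
while blocked has its regular handler run once sigsuspend() unblocks it.
The function and handler names are made up for the example, and SIGUSR1
merely stands in for whatever signal the GC actually uses.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for the GC's suspend handler: it only counts calls. */
static volatile sig_atomic_t handler_runs = 0;

static void count_signal(int signo)
{
    (void)signo;
    handler_runs++;
}

/* Returns how many times the handler ran across one sigsuspend() call. */
int handler_runs_during_sigsuspend(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_signal;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, &old_sa);
    assert(rc == 0);

    /* Block SIGUSR1 so that it stays pending instead of being delivered. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    rc = sigprocmask(SIG_BLOCK, &block, &old_mask);
    assert(rc == 0);
    (void)rc;

    handler_runs = 0;
    raise(SIGUSR1);                 /* pending now, handler not yet run */

    /* Wait with SIGUSR1 unblocked: the handler runs, then sigsuspend()
       returns and the previous (blocking) mask is back in effect. */
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGUSR1);
    sigsuspend(&wait_mask);

    /* Put the caller's mask and handler back. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGUSR1, &old_sa, NULL);
    return handler_runs;
}
```

A sigwait() in the same position would consume the pending signal and
return its number to the caller, without the handler being invoked.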

> 
> The change is fine, but shouldn't waitForSignal() call sigsuspend() in a 
> loop, and only return if the signal it got is a SIGCHLD? Otherwise, it 
> seems that the GC's (or any other) signals will also wake it up. I guess 
> the run() loop will just call waitForSignal() again, but it might be a 
> bit inefficient if signals are being frequently delivered for some 
> reason.

The loop is fairly tight; it really just contains sigsuspend() and
waitpid().  Also, when using sigsuspend() you cannot tell which signal
fired, so we have to do the waitpid() anyhow.
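
The shape of such a loop can be sketched as follows.  This is only an
illustration under assumptions, not the PosixProcess code: the function
name wait_for_child_exit() and the throwaway forked child are invented
for the example.  SIGCHLD is blocked before the fork so that it cannot
be lost between the waitpid() check and the sigsuspend().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Empty handler: SIGCHLD only needs to interrupt sigsuspend(). */
static void on_sigchld(int signo)
{
    (void)signo;
}

/* Forks a child that exits with `code`, then waits for it using
   sigsuspend() + waitpid().  Returns the exit status, or -1 on error. */
int wait_for_child_exit(int code)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    int status = 0, result = -1;
    pid_t child, reaped;

    assert(code >= 0 && code <= 255);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old_sa) != 0)
        return -1;

    /* Block SIGCHLD before forking so an early exit stays pending. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) != 0) {
        sigaction(SIGCHLD, &old_sa, NULL);
        return -1;
    }

    child = fork();
    if (child == 0)
        _exit(code);

    if (child > 0) {
        wait_mask = old_mask;
        sigdelset(&wait_mask, SIGCHLD);
        for (;;) {
            /* Any signal may have woken us, so always ask waitpid(). */
            reaped = waitpid(child, &status, WNOHANG);
            if (reaped == child) {
                if (WIFEXITED(status))
                    result = WEXITSTATUS(status);
                break;
            }
            if (reaped < 0 && errno != EINTR)
                break;
            /* Child still running: sleep until some signal arrives. */
            sigsuspend(&wait_mask);
        }
    }

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return result;
}
```

A wakeup caused by some other signal (the GC's, for instance) costs one
extra non-blocking waitpid() call before the loop goes back to sleep.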

> Also, it would be good to add a comment in waitForSignal() 
> noting the interaction with the GC and why sigsuspend() is necessary.

OK.

> 
> I'm not sure the test case is acceptable if it takes 30s to run, 
> however. We want the test case to be as fast and convenient to run as 
> possible for GCC developers, and adding such a long fixed delay is 
> probably not good.
> 

There is a reason that the test case does this.

The main failure mode on linuxthreads was the GC getting blocked.  The
only way I know to test for this is to have the subprocess take a
considerable amount of time, and measure how long a GC takes.  I can
reduce the time a little, but I think it should be at least 10 seconds.

I will prepare new versions of the patches now.

David Daney.


