Bug 33218 - Process.waitFor() and Process.destroy() misbehave for children which do not react to Ctrl+C or SIGQUIT
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: java
Version: 4.1.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
 
Reported: 2007-08-28 09:26 UTC by Alexandre Rusev
Modified: 2009-01-02 06:47 UTC



Attachments
Test case that works. (249 bytes, text/plain)
2007-08-29 00:48 UTC, David Daney
New test.sh (54 bytes, application/x-shellscript)
2007-08-29 00:49 UTC, David Daney
test.java (322 bytes, text/plain)
2007-08-30 04:04 UTC, Alexandre Rusev
script bt_connect.bash (259 bytes, text/plain)
2007-08-30 04:05 UTC, Alexandre Rusev
one more helper script bt_param.bash (107 bytes, text/plain)
2007-08-30 04:06 UTC, Alexandre Rusev

Description Alexandre Rusev 2007-08-28 09:26:27 UTC
When a child process (Process p = ...) does not respond to Ctrl+C,
the behavior of destroy(), waitFor(), or both is incorrect.

A process that blocks or discards the signal sent by Ctrl+C is not killed by destroy().
(IMHO, Process.destroy() is supposed to kill the child process forcibly.)

After destroy() has been called, waitFor() returns immediately instead of
waiting forever for the process (which survived the signal) to complete.

This behavior looks like a discrepancy.


Test case:
I encountered the problem when using the following command line:
rfcomm listen -i hci0 /dev/rfcomm0 6

rfcomm doesn't react to Ctrl+C until an external connection is accepted.

Program code:

Process p = <rfcomm ...>
p.destroy();
p.waitFor();
System.out.println("waitFor completed");





ps ax | grep rfcomm


Workaround for applications:
kill such processes explicitly with "kill -9", or
have shell scripts kill their own subchildren using the shell "trap" command.
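
On a current JDK (Java 8 or later, so not the GCJ 4.1 runtime this report is about), the "kill -9" workaround can be approximated from Java itself. The following is only a minimal sketch under that assumption; ProcessStopper, stop and the grace period are hypothetical names chosen for illustration and do not come from the report.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of GCJ or of the attached tests.
final class ProcessStopper {
    private ProcessStopper() {}

    /**
     * Asks the child to stop, waits up to graceMillis for it to exit,
     * and escalates to a forcible kill if it is still running.
     * Returns true if the forcible step was used.
     */
    static boolean stop(Process p, long graceMillis) {
        p.destroy(); // normal termination request; the child may ignore it
        try {
            if (p.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
                return false; // child exited within the grace period
            }
            p.destroyForcibly(); // Java 8+: forcible termination
            p.waitFor();         // collect the exit status of the killed child
            return true;
        } catch (InterruptedException e) {
            // Preserve the interrupt and still try to get rid of the child.
            Thread.currentThread().interrupt();
            p.destroyForcibly();
            return true;
        }
    }
}
```

For example, ProcessStopper.stop(p, 2000) gives the child two seconds before escalating. destroyForcibly() acts only on the direct child; a process that the child has forked and detached is not affected by it.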


I am not sure whether the issue exists in 4.2.x; I don't have a 4.2.x build for the architectures I use.
Comment 1 David Daney 2007-08-28 10:29:39 UTC
Can you post a fully self contained test case?  If I can easily reproduce it, I will try to fix it.
Comment 2 Alexandre Rusev 2007-08-28 11:13:36 UTC
(In reply to comment #1)
> Can you post a fully self contained test case?  If I can easily reproduce it, I
> will try to fix it.
> 

The test case is given below, but reproducing the problem looks to be a bit tricky :(
gcj (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux) doesn't show this behaviour.
In a few days I'll try this once more with my real scripts on my real hardware platform with that compiler. Then I'll post a more detailed report to this bug.

Maybe the problem is observed only when the script is sleeping in a syscall.
Yet I could kill the script manually, so it might not be a kernel issue.




test.java

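public class test {

    public static void main(String[] args) throws Exception {
        // Start the script, then immediately ask the runtime to kill it.
        String cmd = "./test.sh";
        Process p = Runtime.getRuntime().exec(cmd);
        p.destroy();
        // Expected to block until the child has really exited.
        p.waitFor();
        System.out.println("waitFor completed");
        // Keep the JVM alive (e.g. to inspect the child with ps).
        while (true) {
            Thread.sleep(3000);
        }
    }
}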


test.sh:


#!/bin/bash


trap "" SIGINT
trap "" SIGQUIT



Thanks.



Comment 3 David Daney 2007-08-28 11:26:20 UTC
Looking at the current code, it seems that we may have a problem if we destroy() a process that has already exited.  The kill(2) man page suggests that ESRCH could result, in which case we would throw an InternalError.  Must investigate...
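
The suspected case (calling destroy() on a child that has already exited) can be probed with a few lines of Java. This is a sketch only, not the code under discussion: DestroyAfterExit and run are hypothetical names, and the command "true" is assumed to be available on PATH. On current JDKs destroy() is expected to take no action when the process is no longer alive, so the call should return without throwing.

```java
import java.io.IOException;

// Hypothetical probe for illustration; not one of the attached test cases.
final class DestroyAfterExit {
    private DestroyAfterExit() {}

    /** Runs cmd to completion, then calls destroy() on the finished process. */
    static int run(String... cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).start();
        int status = p.waitFor(); // the child has exited once this returns
        p.destroy();              // should be harmless on an exited child
        return status;
    }

    public static void main(String[] args) throws Exception {
        // "true" is an assumption: any short-lived command will do.
        System.out.println("exit status: " + run("true"));
    }
}
```

If a runtime turned the ESRCH case into an InternalError, this probe would be expected to fail at the destroy() call instead of returning the exit status.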
Comment 4 David Daney 2007-08-29 00:48:54 UTC
Created attachment 14129
Test case that works.

With the new "Test case that works" and attached test.sh and the original test.sh I get no failures on x86_64-pc-linux  (FC6)  with: gcj (GCC) 4.3.0 20070728 (experimental) 

I am inclined to mark the bug as Works-for-me in a few days if it cannot be reproduced on the trunk.
Comment 5 David Daney 2007-08-29 00:49:45 UTC
Created attachment 14130
New test.sh
Comment 6 Alexandre Rusev 2007-08-29 04:46:49 UTC
(In reply to comment #4)
> Created an attachment (id=14129)
> Test case that works.
> 
> With the new "Test case that works" and attached test.sh and the original
> test.sh I get no failures on x86_64-pc-linux  (FC6)  with: gcj (GCC) 4.3.0
> 20070728 (experimental) 
> 
> I am inclined to mark the bug as Works-for-me in a few days if it cannot be
> reproduced on the trunk.
> 
OK, that needs more investigation.
I'll be working on my project again (choosing a workaround) in a few days, and
I'll investigate the situation in more detail when I see the issue on both architectures.

If no means to reproduce it is found by the middle of next week, the bug may be reopened once a good test case has been produced.

Thank you for looking through the current code.
In GCJ 3.x.x we had a couple of problems with zombie children; maybe some nontrivial case has not been swept out yet.

Comment 7 Alexandre Rusev 2007-08-30 04:04:04 UTC
The problem is reproducible, but it probably should be posted to another list.
It looks like the behaviour of the particular utility "rfcomm" is so specific that
it not only ignores some signals but also forks one more child in a detached state.

Here is how I reproduce it:

As root:

1. Attach my USB Bluetooth dongle and bring it up: "hciconfig hci0 up"
2. Run "./test"
3. Run "ps ax | grep rfcomm"
Very soon I see that only one instance of rfcomm is running (its PID does not change), and instances of rfcomm started later exit because interface hci0 is busy.



Some processes still need SIGINT in order to notify their children before exiting, or they must be killed with another signal such as -9.

The bug posted here looks like it should be resolved as INVALID.
Oh, I am terribly sorry.

What do you think?


Comment 8 Alexandre Rusev 2007-08-30 04:04:58 UTC
Created attachment 14139
test.java

test.java to run with bt_connect.bash
Comment 9 Alexandre Rusev 2007-08-30 04:05:52 UTC
Created attachment 14140
script bt_connect.bash

script to use with 14139: test.java
Comment 10 Alexandre Rusev 2007-08-30 04:06:38 UTC
Created attachment 14141
one more helper script bt_param.bash

helper script for  14139: test.java
Comment 11 Alexandre Rusev 2007-08-30 04:11:41 UTC
It looks like I was misled by the fact that rfcomm is killed in some situations when the shell script is destroyed with Process.destroy(), and in other situations is not.
Also, strace shows that rfcomm sleeps inside the accept system call.
Comment 12 David Daney 2007-08-30 12:14:09 UTC
Does GCJ's behavior differ from Sun's in this test?
Comment 13 Alexandre Rusev 2007-08-31 04:12:46 UTC
(In reply to comment #12)
> Does GCJ's behavior differ from Sun's in this test?
> 

Well, I tried that (jdk1.6 i386).
The answer is: at this point, no. So this is "not an issue".

But while performing this test I found a slight difference
in the treatment of the output streams of a process for which waitFor() has returned.

A GCJ-compiled program may still use the output stream of such a process: available(),
read(), readLine(), etc.
In our case readLine() returns null.

A JDK-compiled class running in Sun's JVM (both are version 1.6), by contrast, throws an exception:
Exception in thread "main" java.io.IOException: Stream closed
        at java.io.BufferedInputStream.getInIfOpen(BufferedInputStream.java:134)
        at java.io.BufferedInputStream.available(BufferedInputStream.java:381)
        at java.io.FilterInputStream.available(FilterInputStream.java:142)
        at test.main(test.java:19)


This looks like a slight deviation from the standard in GCJ :(

But that is NOT the problem I stated initially, and it should be dealt with in another bug (IMHO).
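
Code that has to run on both runtimes can avoid depending on either outcome by treating "end of stream" and "stream closed" the same way. The sketch below illustrates that idea and is not the attached test.java; DrainAfterDestroy and probe are hypothetical names, and the returned strings are labels invented for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical helper for illustration; not the attached test.java.
final class DrainAfterDestroy {
    private DrainAfterDestroy() {}

    /**
     * Destroys the child, waits for it, then tries to read its output.
     * Returns "eof" if readLine() reports end of stream, "closed" if the
     * read throws IOException, and "data" if a line was still available.
     */
    static String probe(Process p) {
        p.destroy();
        try {
            p.waitFor();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt
        }
        try {
            BufferedReader r =
                new BufferedReader(new InputStreamReader(p.getInputStream()));
            return r.readLine() == null ? "eof" : "data";
        } catch (IOException e) {
            return "closed"; // the runtime closed the stream on destroy()
        }
    }
}
```

A caller that only needs to know there is nothing more to read can handle "eof" and "closed" identically, which sidesteps the difference between the two runtimes described above.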

When the shell script is run, the rfcomm program replaces it in the process list
(according to POSIX or something like that), so I always erroneously took
the rfcomm that was not killed for a bt_connect.sh that was not killed.

I feel ashamed; I apologize once again ;)

Comment 14 Laurent GUERBY 2009-01-02 06:47:50 UTC
Removing host