This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.


RE: performance problem with process fork in gcj compiled CNI

From: "Ricardo Temporal" <ricardotemporal at hotmail dot com>
To: hans dot boehm at hp dot com, java at gcc dot gnu dot org
Date: Fri, 27 Jan 2006 23:07:20 +0000
Subject: RE: performance problem with process fork in gcj compiled CNI

Hi,

I realized that the performance problem is not in the software; it is actually in the hardware.

In the first scenario I tested on an Intel Hyper-Threading machine with 2 logical CPUs running Linux.

Now I have tested on a Sun SPARC machine with 6 real CPUs running Solaris.

 Intel:
     1 instance of the process took T CPU cycles
     2 instances of the process took ~1.85 T CPU cycles each

 Sun:
     1 instance of the process took T CPU cycles
     2 instances of the process took T CPU cycles each

I still have the fork problem to solve.

I'm thinking about trying a late link with libgcj.so, loading it with dlopen only after the fork.

I don't know whether linking libgcj statically could help me anyway.
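
A minimal sketch of the late-link idea, in plain C, could look like the following. The function name load_after_fork and the library path passed to it are hypothetical, and starting the Java runtime after the load is deliberately left out; this only shows the ordering (fork first, dlopen in the child), not a tested fix for libgcj.

```c
#include <assert.h>   /* for callers that assert on the result */
#include <dlfcn.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork first, then load the shared library in the child only, so the
 * parent never has the library (or any threads it might start) mapped
 * at the time of the fork.
 *
 * Returns 0 if the child loaded the library, 1 if dlopen failed in the
 * child, and -1 if fork or waitpid failed.
 * The name load_after_fork is hypothetical.
 */
int load_after_fork(const char *library)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the library is loaded only here, after the fork. */
        void *handle = dlopen(library, RTLD_NOW | RTLD_LOCAL);
        if (handle == NULL) {
            fprintf(stderr, "dlopen failed: %s\n", dlerror());
            _exit(1);
        }
        /* A real program would look up its entry points with dlsym()
         * and do its work here. */
        dlclose(handle);
        _exit(0);
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling load_after_fork("libgcj.so") would be the case I have in mind. With glibc older than 2.34 this needs -ldl at link time.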

Temporal.

From: "Boehm, Hans" <hans.boehm@hp.com>
To: "Ricardo Temporal" <ricardotemporal@hotmail.com>,<java@gcc.gnu.org>
Subject: RE: performance problem with process fork in gcj compiled CNI
Date: Fri, 27 Jan 2006 11:46:26 -0800

> -----Original Message-----
> From: Ricardo Temporal [mailto:ricardotemporal@hotmail.com]
> Hi,
>
>    I looked at SUSv3 on fork, and indeed the pthread_atfork
> documentation says:
>
> "There are at least two serious problems with the semantics of
> fork() in a multi-threaded program. One problem has to do with
> state (for example, memory) covered by mutexes. Consider the case
> where one thread has a mutex locked and the state covered by that
> mutex is inconsistent while another thread calls fork(). In the
> child, the mutex is in the locked state (locked by a nonexistent
> thread and thus can never be unlocked). Having the child simply
> reinitialize the mutex is unsatisfactory since this approach does
> not resolve the question about how to correct or otherwise deal
> with the inconsistent state in the child."
>
>   The documentation suggests a workaround using fork handlers,
> which would have to be done in libgcj and not in my application.
Things are worse than that.  When you fork a multithreaded process, only
one thread exists in the child.  Thus I strongly suspect that some
system threads needed by libgcj will just no longer exist.  I don't see
any a priori reason that the resulting child process should be at all
healthy.  But it appears you were somehow getting lucky, and it's at
least close.
>
>    So I dropped the fork and launched 2 instances of the program
> from the shell, and I got the same results.
>
>    It seems that the library libgcj.so is shared and synchronized.
>
>    The new version of the program, without any fork, follows.
>
>    Please comment.
I have no good explanation for that.  Only the read-only parts of libgcj
should be shared.  There shouldn't really be any synchronization between
the two processes.  Depending on your platform, there may be memory
bandwidth issues or the like, especially since this application does
nothing but allocate and garbage collect.  The usual next step is to use
a profiler and/or performance counter tools to figure out where the time
is going, and why the time spent in each process is so different in the
two cases.  You might also try running with the GC_PRINT_STATS
environment variable defined to see if the garbage collector is behaving
similarly in both cases.

You are presumably talking about two physical processors, one hardware
thread per processor, not two hardware threads (e.g. Intel's
hyperthreading)?  If this is an Opteron-based or other NUMA system,
there may be memory placement issues, though I'd be surprised if this
had that much of an impact.

Hans

Follow-Ups:
- Re: performance problem with process fork in gcj compiled CNI
  - From: David Daney

References:
- RE: performance problem with process fork in gcj compiled CNI
  - From: Boehm, Hans
