This is the mail archive of the
java@gcc.gnu.org
mailing list for the Java project.
RE: VolanoMark findings
- From: "Boehm, Hans" <hans_boehm at hp dot com>
- To: "'Andrew Haley'" <aph at redhat dot com>, Anthony Green <green at redhat dot com>
- Cc: java at gcc dot gnu dot org
- Date: Wed, 29 Jan 2003 09:08:57 -0800
- Subject: RE: VolanoMark findings
That's very roughly consistent with my (slightly dated) SPECjbb experiments. An Itanium profile suggested the top culprits there were:
1. _Jv_MonitorEnter (about 7%)
2. GC_mark_from (probably unavoidable and no worse than on other JVMs with the same heap size)
3. The B-tree access routine in the benchmark
4. _Jv_MonitorExit (about 5%)
5. _Jv_CheckCast
out of line division (> 5% total, in various routines)
memory allocation (not GC)
a few poorly tuned java.lang.String routines
__gettimeofday (about 2% !?)
The most likely sources of possible generally applicable improvement seemed to be:
1. (Selective or partial?) inlining of MonitorEnter/Exit. Being able to remember some state between the two would help appreciably. (But there's a limit, since the compare-and-swap instructions themselves use up a significant fraction of the time on most platforms. It would be great to avoid that in the MonitorExit case, but I think it's quite hard, given our other constraints.)
2. (Selective?) inlining of division.
3. Improvements to some of the String routines. (I may still have a partial patch or two hanging around, though it may be obsolete. I got sidetracked ...)
4. Further shortening of the allocation path.
5. (Almost certainly the most important, though the hardest) Gcc optimizer improvements.
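
To illustrate item 1, here is a minimal sketch of what "remembering some state between the two" could look like. It is not libgcj's actual _Jv_MonitorEnter/_Jv_MonitorExit code. The class ThinLockSketch, its methods, and the boolean token are hypothetical names chosen for illustration, and the lock simply spins instead of blocking. The uncontended enter costs one compare-and-swap; the token returned by enter() tells exit() whether it is the outermost release, so no recursion counter is needed. Exit uses a volatile store rather than a compare-and-swap only because this toy lock has no waiter queue to hand off to, which is exactly the constraint a real runtime cannot drop so easily.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical thin-lock sketch; not the libgcj implementation.
class ThinLockSketch {
    // Lock word: null means unlocked, otherwise the owning thread.
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    // Returns true if this call took the lock (outermost enter),
    // false if the current thread already held it (re-entrant enter).
    boolean enter() {
        Thread me = Thread.currentThread();
        if (owner.get() == me) {
            return false;              // re-entrant: no atomic operation needed
        }
        while (!owner.compareAndSet(null, me)) {
            Thread.yield();            // simplification: spin instead of blocking
        }
        return true;                   // fast path: a single compare-and-swap
    }

    // The token is the value returned by the matching enter().
    void exit(boolean outermost) {
        if (owner.get() != Thread.currentThread()) {
            throw new IllegalMonitorStateException();
        }
        if (outermost) {
            owner.set(null);           // volatile store; no waiters to wake here
        }
    }

    boolean isHeldByCurrentThread() {
        return owner.get() == Thread.currentThread();
    }

    private static int counter;        // guarded by LOCK

    private static final ThinLockSketch LOCK = new ThinLockSketch();

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                boolean token = LOCK.enter();
                try {
                    counter++;
                } finally {
                    LOCK.exit(token);
                }
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("counter = " + counter);
    }
}
```

A compiler that inlines both halves can keep the token in a register across the synchronized region, which is the kind of saving an out-of-line call pair cannot offer.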
In general, I think these benchmark results are too pessimistic, since other JVMs tend to be tuned for them to a much larger extent than gcj. But they still provide useful information.
Hans
> -----Original Message-----
> From: Andrew Haley [mailto:aph@redhat.com]
> Sent: Wednesday, January 29, 2003 7:34 AM
> To: Anthony Green
> Cc: java@gcc.gnu.org
> Subject: VolanoMark findings
>
>
> Anthony Green writes:
> > I recently retried building the VolanoMark benchmark found here:
> > http://www.volano.com/brenchmarks.html .
> >
> > The good news is that it finally builds, and I closed the case against
> > this problem. I have no idea what the magic fix was. IIRC the compiler
> > couldn't handle the exception regions in the obfuscated class files.
> >
> > The bad news is that IBM's JDK is twice as fast on this benchmark as
> > an optimized gcj build.
>
> That's the same as I measured with Embedded CaffeineMark.
>
> > My 2.3 GHz P4 gives IBM's 1.4 JDK a score of 12058, while we come
> > in at half that: 6040.
> >
> > I'm hoping that this may be mostly accounted for by bugs. Unfortunately,
> > the VolanoMark is only distributed in .class form, so figuring this out
> > may take some doing.
>
> We already know what IBM do to get this performance:
>
> http://www.research.ibm.com/journal/sj/391/suganuma.html
>
> * Method inlining. We do that, but only in special cases.
>
> * Exception check elimination. We don't do that.
>
> * Common subexpression elimination. We do that.
>
> * Removal of initialization checks.
>
> * Removal of synchronization.
>
> Andrew.
>