Mon Dec 6 13:31:00 GMT 2004
On Mon, 2004-12-06 at 12:15 +0000, Andrew Haley wrote:
> Ouch! Compile with -ffast-math and get some speedup.
I get an 18% wall-clock time improvement with gcc 3.4.3:
[mlacage@chronos treegrowth]$ time -p ./native/MainSimulation --max-
I get similar results with gcc 4.0.0 which brings the performance of
code generated by 4.0.0 on par with 3.4.3 (modulo the measurement noise
which seems to be around 0.1s).
> > 10.19 2954.00 663.00 Jv_LookupInterfaceMethodIdx
> It's interesting, and perhaps a little surprising, that interface
> dispatch occupies such a large proportion of your runtime.
> > 7.79 3461.00 507.00 ZN4java4util9ArrayList3getEi
> > 4.98 3785.00 324.00 Z20_Jv_IsAssignableFromPN4java4lang5ClassES2_
> > 4.92 4105.00 320.00 Jv_CheckCast
> You're doing a lot of access to generic containers, so there's a lot
> of cast checks.
I already use -fno-bounds-check -fno-store-check. Would it be possible
to disable more checks for my production builds?
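
For what it is worth, the cast checks mentioned above can also be
avoided at the source level rather than with a compiler flag. The
following is only a sketch: Node, NodeList and TypedListSketch are
hypothetical names, not taken from my simulation code. The idea is that
a container whose element type is fixed at compile time hands back
elements without a cast, and its get() is an ordinary method call on a
final class rather than a call through the List interface.

```java
// Hypothetical element type; stands in for whatever the simulation stores.
final class Node {
    final double weight;

    Node(double weight) {
        this.weight = weight;
    }
}

// A minimal growable list specialised to Node (hypothetical helper).
// Because the backing array is a Node[], get() needs no cast.
final class NodeList {
    private Node[] items = new Node[16];
    private int size = 0;

    void add(Node n) {
        if (size == items.length) {
            // Double the capacity when the backing array is full.
            Node[] bigger = new Node[items.length * 2];
            System.arraycopy(items, 0, bigger, 0, size);
            items = bigger;
        }
        items[size++] = n;
    }

    // Note: no range check against size here, only the array's own bounds.
    Node get(int i) {
        return items[i];
    }

    int size() {
        return size;
    }
}

public class TypedListSketch {
    // Sums the weights with an indexed loop and no casts.
    static double totalWeight(NodeList nodes) {
        double total = 0.0;
        for (int i = 0; i < nodes.size(); i++) {
            total += nodes.get(i).weight;
        }
        return total;
    }

    public static void main(String[] args) {
        NodeList nodes = new NodeList();
        for (int i = 1; i <= 100; i++) {
            nodes.add(new Node(i));
        }
        System.out.println(totalWeight(nodes));
    }
}
```

Whether this is worth the loss of generality depends on how many
distinct element types the program stores; it would need to be measured
against the profile above.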
> > 4.33 4387.00 282.00 frame_dummy
> > 3.37 4606.00 219.00 ZN4java4util14AbstractList$17hasNextEv
> > 3.30 4821.00 215.00 ZN4java4util14AbstractList$18checkModEv
> > 3.01 5017.00 196.00 GC_local_gcj_malloc
> > 2.47 5178.00 161.00 ZN4java4util14AbstractList$14nextEv
> > 2.34 5330.00 152.00 ZN4java4util9ArrayList3addEPNS_4lang6ObjectE
> > 2.08 5465.00 135.00 ZN4java4util9ArrayList19checkBoundExclusiveEi
> > 1.48 5561.00 96.00 Jv_AllocObjectNoFinalizer
> > 1.48 5657.00 96.00 Jv_CheckArrayStore
> > 1.38 5747.00 90.00 init
> > 1.14 5821.00 74.00 ZN4java4lang4Math3logEd
> > 1.08 5891.00 70.00 ZN4java4util12AbstractList8iteratorEv
> > Ok, so, my application does a lot of calls to the log function, no
> > surprise here. Now, I expected the GC to be pretty high here and, well,
> > it seems to be with GC_mark_from. However, I must say I am pretty
> > surprised to see Jv_LookupInterfaceMethodIdx which I don't know anything
> > about.
> > Could someone tell me what this function really does? Is it expected to
> > be so high in an application profile? If not, what could I do to
> > reduce its usage?
> You're doing a great many interface calls. I don't think gcj's
> interface dispatch is particularly slow, so interface dispatch might
> be just as significant a drain on runtime on other systems. You might
> get better performance by using ArrayList in your code instead of
I do use ArrayList already. ZN4java4util14AbstractList$17hasNextEv and
ZN4java4util14AbstractList$14nextEv look like the iterator methods
hasNext and next.
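
To make the difference concrete, here is a small sketch of the two loop
styles. Sample and IterationSketch are hypothetical names, and the code
is written without generics to match what gcj accepts today. The first
loop calls hasNext() and next() through the Iterator interface, which
is where the AbstractList$1 entries in the profile come from; the
second calls size() and get() directly on the ArrayList. Both still
need the cast on each element.

```java
import java.util.ArrayList;
import java.util.Iterator;

// Hypothetical element type used only for this illustration.
final class Sample {
    final double value;

    Sample(double value) {
        this.value = value;
    }
}

public class IterationSketch {
    // Iterator-based loop: hasNext() and next() are interface calls.
    static double sumWithIterator(ArrayList samples) {
        double sum = 0.0;
        for (Iterator it = samples.iterator(); it.hasNext();) {
            sum += ((Sample) it.next()).value;
        }
        return sum;
    }

    // Indexed loop: size() and get() are called on the concrete ArrayList
    // type, so no Iterator object is allocated and no interface is involved.
    static double sumWithIndex(ArrayList samples) {
        double sum = 0.0;
        for (int i = 0, n = samples.size(); i < n; i++) {
            sum += ((Sample) samples.get(i)).value;
        }
        return sum;
    }

    public static void main(String[] args) {
        ArrayList samples = new ArrayList();
        for (int i = 1; i <= 4; i++) {
            samples.add(new Sample(i * 0.5));
        }
        System.out.println(sumWithIterator(samples));
        System.out.println(sumWithIndex(samples));
    }
}
```

I have not measured how much of the interface dispatch time the indexed
form would actually remove in my application.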
> As far as I'm aware the Boehm gc is fine, but we aren't taking
> advantage of the opportunity quickly to recycle very short-lived
> objects. This is something that we can improve. But to have gc use
> only 10% in an application really isn't tragically bad.
Well, this is for really short simulations: the overhead of the GC
increases with the length of the simulations because I build a data
structure whose size always increases. I have measured a (~2/3)%
wall-clock time usage for the GC in similar situations with the Sun JDK
1.5.0 and a rather naive profiler I wrote. (It should be noted that the
results of the 1.5.0 JDK are not as good as those of the 1.4.3 JDK, and
that they are close to the results of gcc 3.4.3 without -ffast-math.)
An interesting optimization which is clearly impossible with the Sun JDK
would be to mark this huge growing data structure as outside of the
scope of the GC. I have no idea whether or not this is even remotely
possible with gcj (hint: I am willing to follow any advice on how to do
this, even advice that involves hacking gcj itself).
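
Short of that, one source-level approximation I can think of is to keep
the growing structure in a few primitive arrays instead of one object
per node. This is a different technique from excluding memory from the
GC: the arrays are still collected objects, but they contain no
references, so a tracing collector has nothing to follow inside them.
Whether gcj's collector actually skips the contents of primitive arrays
is an assumption I have not verified. FlatTreeSketch and its
parent/length layout are hypothetical and only illustrate the idea.

```java
// Hypothetical struct-of-arrays store for a tree that only ever grows.
// Nodes are identified by their index; -1 marks "no parent" (the root).
public class FlatTreeSketch {
    private int[] parent;     // parent[i] is the index of node i's parent
    private double[] length;  // length[i] is the branch length of node i
    private int count = 0;

    public FlatTreeSketch(int initialCapacity) {
        parent = new int[initialCapacity];
        length = new double[initialCapacity];
    }

    // Appends a node and returns its index.
    public int addNode(int parentIndex, double branchLength) {
        if (count == parent.length) {
            grow();
        }
        parent[count] = parentIndex;
        length[count] = branchLength;
        return count++;
    }

    // Doubles the capacity of both arrays, copying the existing nodes.
    private void grow() {
        int newCapacity = Math.max(1, parent.length * 2);
        int[] newParent = new int[newCapacity];
        double[] newLength = new double[newCapacity];
        System.arraycopy(parent, 0, newParent, 0, count);
        System.arraycopy(length, 0, newLength, 0, count);
        parent = newParent;
        length = newLength;
    }

    // Sums branch lengths from the given node up to and including the root.
    public double pathLengthToRoot(int node) {
        double total = 0.0;
        for (int i = node; i != -1; i = parent[i]) {
            total += length[i];
        }
        return total;
    }

    public int size() {
        return count;
    }

    public static void main(String[] args) {
        FlatTreeSketch tree = new FlatTreeSketch(2);
        int root = tree.addNode(-1, 1.0);
        int child = tree.addNode(root, 2.0);
        int leaf = tree.addNode(child, 3.5);
        System.out.println(tree.pathLengthToRoot(leaf));
    }
}
```

The cost is that the code stops looking like ordinary object-oriented
Java, and each growth step copies the whole array, so the initial
capacity matters for long simulations.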