need to focus on java performance?

Wed May 24 15:42:00 GMT 2006

Andrew Haley wrote:
> Bryce McKinlay writes:
>  > Andrew Haley wrote:
>  > > That would be nice.  It's fairly easy to patch call sites (that's what
>  > > ld.so does) to remove all the _Jv_InitClass calls but hard to do it
>  > > portably because of problems with locking.
>  >
>  > ld.so doesn't patch call sites themselves, just the PLT entries.
>
> Of course not, because the text section isn't usually writable.
> However, the technique of overwriting instructions is indeed used by
> ld.so.
>
>  > It should be possible for libgcj's linker to make atable entries
>  > initially point to trampolines. The trampoline would call
>  > _Jv_InitClass, and _Jv_InitClass would update all the atable
>  > entries for that class, replacing the trampolines with the real
>  > functions. Because class initialization acquires the class lock,
>  > there shouldn't be any locking issues except perhaps on
>  > architectures with very weak memory ordering.
>
> When you jumped to a static method for the first time, you'd hit the
> stub.  That stub would jump to a routine for initializing the class
> and then rewriting all the stubs for that class.  At that point, there
> are potentially several threads in progress.  One of these threads
> would "win", and the others would have to wait until the class
> initialization was complete.  So far, so good.
>
> Once the class was initialized, you'd need to rewrite the instructions
> in the trampolines to point to the static methods that don't do class
> initialization.  While all this was going on, there still would be
> threads jumping to these trampolines.  However, these trampolines are
> not themselves protected by locks, and there's no guarantee that you
> can rewrite the instructions atomically.  So, you'd run the risk of a
> partially rewritten instruction being executed.  If you can rewrite
> the trampolines atomically then I think you might be OK.
>
>   
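To make the distinction concrete: if what gets rewritten is a
pointer-sized atable entry rather than an instruction, the store itself
can be made atomic.  A minimal sketch in C++, assuming a single slot and
a per-class lock.  Every name below is hypothetical (init_class only
stands in for _Jv_InitClass), and this is not libgcj's actual layout:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical names throughout; not libgcj's real data structures.
using Method = int (*)(int);

int real_method(int x) { return x + 1; }  // the static method proper
int trampoline(int x);                    // initial target of the slot

// One "atable" entry.  It starts out pointing at the trampoline.
std::atomic<Method> atable_slot{trampoline};

std::mutex class_lock;          // stand-in for the class lock
bool class_initialized = false; // guarded by class_lock
int init_runs = 0;              // guarded by class_lock

// Stand-in for _Jv_InitClass: initializes once, then repoints the slot.
void init_class() {
  std::lock_guard<std::mutex> guard(class_lock);
  if (!class_initialized) {
    ++init_runs;                // static initializers would run here
    class_initialized = true;
  }
  // A single atomic store: callers see either the old or the new
  // pointer, never a mixture of the two.
  atable_slot.store(real_method, std::memory_order_release);
}

// First call lands here; losers of the race block on class_lock.
int trampoline(int x) {
  init_class();
  return real_method(x);
}

// What a call site would do: load the slot, then call through it.
int call_static(int x) {
  Method m = atable_slot.load(std::memory_order_acquire);
  return m(x);
}
```

A caller that loaded the old pointer just before the store still ends
up in the trampoline, which is harmless here because init_class does
nothing the second time.  Note that this only covers rewriting a data
word; it says nothing about rewriting instructions in place.
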
One thing that just occurred to me was that one could queue up the 
trampoline rewrites and do them all as a batch when the GC has stopped 
the world.  In this case all relevant threads would be parked safely in 
their stop-the-world signal handlers.  The trampolines would have to be 
structured so that only a single machine instruction is changed, or so 
that the rewrite is otherwise atomic with respect to the GC's signals.
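
Roughly, the batching could look like the sketch below (C++).  All names
are made up for illustration, and apply_patches_world_stopped is an
assumed hook that the collector would call once every other thread is
parked; it is not an existing entry point:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical names throughout; a sketch, not libgcj or GC code.
using Method = int (*)(int);

struct PendingPatch {
  Method *slot;   // location to overwrite (e.g. a trampoline's target)
  Method target;  // value to store there
};

std::mutex queue_lock;
std::vector<PendingPatch> pending;  // guarded by queue_lock

int slow_calls = 0;
int fast_entry(int x) { return x + 1; }  // the method itself
int slow_entry(int x) {                  // stands in for the trampoline
  ++slow_calls;                          // (would check initialization)
  return fast_entry(x);
}

// Called after class initialization: record the rewrite, don't do it.
void queue_patch(Method *slot, Method target) {
  std::lock_guard<std::mutex> guard(queue_lock);
  pending.push_back({slot, target});
}

// Assumed to run only while all other threads are stopped.  A parked
// thread might be holding queue_lock, so only try to take it; if that
// fails, the batch simply waits for the next collection.
std::size_t apply_patches_world_stopped() {
  std::unique_lock<std::mutex> guard(queue_lock, std::try_to_lock);
  if (!guard.owns_lock())
    return 0;
  std::size_t applied = pending.size();
  for (const PendingPatch &p : pending)
    *p.slot = p.target;  // plain store: nobody else is running
  pending.clear();
  return applied;
}
```

Until the next collection, calls keep going through the slow entry, so
that path has to stay correct for an already-initialized class.  The
sketch also assumes that restarting the world makes the stores visible
to the resumed threads.
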

> I'm happy to maintain that this is hard to do portably.
>   
Most likely impossible to do portably.  But it is not unprecedented to 
have non-portable code in GCC's runtime support.  Look at 
md_fallback_frame_state or the locking code for example.

David Daney