Language extensions?

Robin Garner robin.garner@iname.com
Fri Mar 28 05:14:00 GMT 2003


> >>>>> "Robin" == Robin Garner <robin.garner@iname.com> writes:
> 
> Robin> I'd like to ask whether the gcj team is amenable at all to the
> Robin> idea of adding some language extensions that would make system
> Robin> programming in java possible/easier.
> 
> Tom> The GCC experience with language extensions in C and C++ seems to be
> Tom> that they have to be very well defined and very useful to be worth the
> Tom> maintenance effort.  I think any extension would have to have a really
> Tom> compelling reason, plus documentation explaining it very carefully --
> Tom> another C/C++ lesson is that underspecified extensions bite us later
> Tom> on.

I guess this is what I hoped you'd say.  For anyone writing a JVM or garbage collector, the extensions I'm after are probably extremely valuable, but I'll know the exact performance cost in a couple of months.

I agree entirely that any extension must be very well defined and documented.  There are two good examples to work from, so we wouldn't be working in the dark, but great care is clearly required.  The Jikes RVM and OpenVM JVMs both have similar "magic", so the set of useful/essential operations is fairly well defined.

> Robin> Jikes RVM provides a "magic" mechanism, specifically there is a
> Robin> VM_Address class, which looks from a java source point of view
> Robin> as an object with a single private int field, with methods for
> Robin> doing arithmetic, comparison etc.  At runtime, the jit compiler
> Robin> intercepts object instances and methods and directly replaces
> Robin> them with the appropriate machine code for dealing directly
> Robin> with addresses.
> 
> Tom> How does this interact with the security model?

It breaks it, naturally.  Jikes RVM only does this transformation when it knows it is compiling the JVM itself, so the potential for damage is limited.  I don't see this as a big problem, because the only alternative is to use native methods, which can do anything they damn well please and break the security model in any case.

The Modula-3 and C# approach is to identify certain modules/classes as 'unsafe', and unsafe modules are allowed to use unsafe features, and break the safety/security model, but must provide a 'safe' interface.
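
To make the "unsafe inside, safe outside" idea concrete, here is a rough sketch in plain Java.  It is not Modula-3 or C# syntax and not anything gcj provides; SafeBuffer, Raw and HEAP are names I've made up, and an int array stands in for raw memory.  The unchecked operations are confined to a private nested class, and the only way in is through a range-checked interface.

```java
// Hypothetical sketch: all names here are invented for illustration.
final class SafeBuffer {
  // Simulated memory shared by every buffer; stands in for a raw address space.
  private static final int[] HEAP = new int[64];

  // The "unsafe" part: no check that addr belongs to the calling buffer.
  private static final class Raw {
    static int load(int addr) { return HEAP[addr]; }
    static void store(int addr, int value) { HEAP[addr] = value; }
  }

  private final int base;
  private final int length;

  SafeBuffer(int base, int length) {
    if (base < 0 || length < 0 || length > HEAP.length
        || base > HEAP.length - length) {
      throw new IllegalArgumentException("region outside heap");
    }
    this.base = base;
    this.length = length;
  }

  // The "safe" interface: every access is checked against this buffer's region.
  int get(int index) {
    check(index);
    return Raw.load(base + index);
  }

  void set(int index, int value) {
    check(index);
    Raw.store(base + index, value);
  }

  private void check(int index) {
    if (index < 0 || index >= length) {
      throw new IndexOutOfBoundsException("index " + index);
    }
  }
}
```

Two buffers created over adjacent regions cannot touch each other's cells through get/set, even though Raw itself could reach any cell in HEAP.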

> Tom> I suppose it would be possible to extend RawData to act as a generic
> Tom> pointer.  But it seems like a lot of work, and special cases all over
> Tom> the place (e.g., the interpreter).  Plus, given the existence of CNI,
> Tom> the gain seems pretty limited.

The gain for a garbage collector is mostly in the write barrier.  For a generational collector, the compiler must emit additional code at every pointer store, to detect whether this is an old-to-new generation pointer, generally using a mask and/or comparison on the value of the pointer.

A basic write barrier looks something like this:

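  // Called on every pointer store: remember src if an old-generation
  // object is being made to point into the nursery.
  private final void writeBarrier(VM_Address src, VM_Address tgt)
    throws VM_PragmaInline {
    if (src.LT(NURSERY_START) && tgt.GE(NURSERY_START)) {
      remset.insert(src);
    }
  }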

where NURSERY_START is a VM_Address constant.  The bogus exception is Jikes RVM's method of forcing the compiler to inline.
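
For anyone who wants to play with the logic outside a VM, here is a minimal stand-alone sketch in ordinary Java.  It is not the Jikes RVM API: Address is a hypothetical stand-in for VM_Address (a plain wrapper around an int, with no compiler magic), the remset is just a list, and I've assumed for illustration that the nursery sits at 0x80000000 and above.  It shows both the comparison form used above and the equivalent mask form.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for VM_Address: an ordinary immutable wrapper.
// A real VM would replace these calls with raw machine instructions.
final class Address {
  private final int value;
  Address(int value) { this.value = value; }
  // Addresses are unsigned quantities, so compare them as unsigned.
  boolean LT(Address other) { return Integer.compareUnsigned(value, other.value) < 0; }
  boolean GE(Address other) { return Integer.compareUnsigned(value, other.value) >= 0; }
  int toInt() { return value; }
}

final class WriteBarrierSketch {
  // Assumed layout: the nursery occupies the top half of a 32-bit address space.
  static final Address NURSERY_START = new Address(0x8000_0000);
  final List<Address> remset = new ArrayList<>();

  // Comparison form, mirroring the barrier above.
  void writeBarrier(Address src, Address tgt) {
    if (src.LT(NURSERY_START) && tgt.GE(NURSERY_START)) {
      remset.add(src);
    }
  }

  // Mask form: under the assumed layout the top bit alone separates old from new.
  static boolean isOldToNew(Address src, Address tgt) {
    int mask = 0x8000_0000;
    return (src.toInt() & mask) == 0 && (tgt.toInt() & mask) != 0;
  }

  public static void main(String[] args) {
    WriteBarrierSketch gc = new WriteBarrierSketch();
    Address oldObj = new Address(0x1000_0000);
    Address newObj = new Address(0x9000_0000);
    gc.writeBarrier(oldObj, newObj); // old-to-new: recorded
    gc.writeBarrier(newObj, oldObj); // new-to-old: ignored
    gc.writeBarrier(oldObj, oldObj); // old-to-old: ignored
    System.out.println(gc.remset.size()); // prints 1
  }
}
```

Written like this, every call allocates nothing but still goes through real method calls on a real object, which is exactly the overhead the magic is meant to remove.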

Given the frequency of pointer updates (I haven't the exact figure to hand, but I think it's a static frequency of around 3%), this code needs to be as few instructions as possible.  A method call of any kind is just not going to be efficient here, and there's a noticeable (>10%) performance improvement from inlining "writeBarrier" while ensuring that "remset.insert" isn't inlined.

> Robin> I guess another question is whether gcj would be interested in
> Robin> using JMTk for memory management ?  At the moment there isn't a
> Robin> conservative collector (like boehm) in gcj, but is that
> Robin> absolutely necessary ?
> 
> Tom> We still need something that can conservatively scan the stack, since
> Tom> we don't have any way to annotate the stack (or registers) with type
> Tom> information.  This would require nontrivial compiler changes.  Copying
> Tom> collectors also face a big challenge here.
> 
> Tom> A mostly-copying collector may work, though I don't think anybody has
> Tom> tried that with gcj.

Glasgow Haskell has the same issue (it uses the gcc back end); I can't remember the exact solution they found, but maybe it'll be applicable.  I expect to know it intimately in a couple of months, so I might be able to contribute here.

cheers

-- Robin