This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.



Re: URLClassloader and native objects


Andrew Haley wrote:
Sal writes:
> Andrew Haley wrote:
>
> > There's no code to allow a jarfile to be separately compiled to a .so
> > and loaded automagically, though. It's quite tricky to figure out how
> > to do it.
>
> This is one of the things I would like to get to work, once an elegant
> solution is found. In a way it's a simpler issue in that the code is
> already compiled; we just need a way to tell gcj to grab hold of it
> from within a custom classloader.


Tom Tromey suggested that we should create a database of mappings from
checksum->shared object.  So, ClassLoader.defineClass() generates a
checksum and then finds the appropriate shared library and loads it.
Most times, the shared library will already be loaded, so it's only
necessary to return a pointer to the class.  The other tool we need is
one that compiles the jar file and creates the database of mappings.
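
A minimal sketch of the checksum lookup described above, written against the standard Java library rather than libgcj. The class name SharedObjectIndex, its methods, and the choice of SHA-256 as the checksum are assumptions made for illustration; the thread does not say which checksum or storage format would be used.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names exist in libgcj.
class SharedObjectIndex {
    // Maps a hex checksum of the class bytes to a shared library path.
    private final Map<String, String> checksumToLibrary = new HashMap<>();

    // Computes a hex digest of the bytes handed to defineClass().
    // SHA-256 is an assumption; any stable checksum would do.
    static String checksum(byte[] classBytes) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(classBytes)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // What the jar-compiling tool would record for each class it compiles.
    void register(byte[] classBytes, String libraryPath) {
        checksumToLibrary.put(checksum(classBytes), libraryPath);
    }

    // What defineClass() would consult; null means no native version is known.
    String lookup(byte[] classBytes) {
        return checksumToLibrary.get(checksum(classBytes));
    }
}
```

In this sketch a null result would mean falling back to the ordinary bytecode path; loading the shared object itself is left out because it depends on runtime internals.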

Another approach that bypasses the checksumming is to attach an
attribute to a jarfile that points to a shared object file that is the
compiled jarfile.  I'm not quite sure what form that attribute would
take, but JFFS2 has xattr().

This sounds agreeable. The only caveat I can think of is the situation where the .class file, or the .jar containing the .class files, is not available... let's say you want to use GCJ to build the entire application natively. Then there isn't a bytecoded .class file to load and compute the checksum from.


In these cases we'll need to find a way to have a custom class loader still reference the native objects. We could 'force' users to package JVM bytecode .class files along with their executable so that the checksumming will work, but I think there may be a more elegant solution.

For the majority of cases, where we are trying to get existing Java apps to run in a GCJ environment, the checksum would work great... as you have the .class / .jar files on hand already. Just drop GCJ in place of the Sun JVM and the app would run using all native objects. But in the situation where the user wants to use GCJ to build a self-contained executable, there are these unique problems.

I think one solution may be to modify the GCJ bytecode verification system to accept native code, or references to statically linked code. It may be a radical idea, since most typical Java platforms to date will *only* accept Java bytecode (from defineClass). But I think allowing this opens the door to a native platform without breaking any compatibility with typical Java apps. With a system such as this we could gain access to native objects without having the code duplicated as JVM bytecode.

Basically the end result would be that you could (theoretically speaking; in reality you would probably use a different naming scheme) rename a .so to .class and run the application with GCJ... and the bytecode verifier would identify the native code and allow it. Or, if the .so wasn't present (the object is statically linked), the .class file could contain a symbol that references a statically linked object. On the other hand, if you ran the app under Sun's VM with 'real' .class files it would still work, even in the case of custom classloaders.
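
As a rough illustration of how data passed to defineClass could be told apart, the sketch below inspects the leading magic bytes. Java class files start with 0xCAFEBABE and ELF objects start with 0x7F 'E' 'L' 'F'; the ImageSniffer class and its Kind enum are hypothetical names, and assuming the shared object is ELF is also an assumption. It does not show how GCJ's verifier works.

```java
// Hypothetical helper; not part of GCJ or libgcj.
class ImageSniffer {
    enum Kind { JAVA_CLASS, ELF_OBJECT, UNKNOWN }

    // Classifies a byte array by its first four bytes only.
    static Kind sniff(byte[] data) {
        if (data == null || data.length < 4) {
            return Kind.UNKNOWN;
        }
        // Java class files begin with the magic number 0xCAFEBABE.
        if ((data[0] & 0xFF) == 0xCA && (data[1] & 0xFF) == 0xFE
                && (data[2] & 0xFF) == 0xBA && (data[3] & 0xFF) == 0xBE) {
            return Kind.JAVA_CLASS;
        }
        // ELF objects (assumed format of the .so) begin with 0x7F 'E' 'L' 'F'.
        if (data[0] == 0x7F && data[1] == 'E' && data[2] == 'L' && data[3] == 'F') {
            return Kind.ELF_OBJECT;
        }
        return Kind.UNKNOWN;
    }
}
```

A loader built along these lines would send JAVA_CLASS data through normal verification and treat ELF_OBJECT data as a request for native code; the statically linked case would still need some separate marker format.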

Would a modification like this to GCJ be acceptable? Or maybe a configurable option?


> A variation of the issue I'm having is when the compiled code already
> exists but is statically linked into my executable. Everything works
> great, and I can even use Class.forName() to grab the object. But my
> CCLs are unable to pull this object out, nor load it from the disk,
> because it is combined into the executable. Do you have any insight on
> how I may be able to get around this?


I don't really know where the problem lies.  If your custom class
loader inherits from VMClassLoader, when it calls LoadClass it will do
the right thing.

Your previous suggestion would address the problem... but let me explain my issue just in case there is a way around it that I missed.


The class will get loaded, but the classloader instance that actually loads a class gets associated with that class. Whenever that class then instantiates another object, the request goes to the system classloader instead of your custom classloader.

For example, given a custom classloader 'MyLoader' and classes A, X (in a .class file), Y (in a .class file) and Z (statically linked shared object), this is the situation when trying to load each via a custom classloader:

MyLoader.loadClass("X"); //Custom class loader interprets the request for X
class X { void someMtd(){ ... new Y(); } } // 'MyLoader' gets requests for Y also


class X { ... void someMtd2() { ... new Z(); } } // 'MyLoader' gets the request for Z, but delegates it to the system classloader because it's a native object

class Z { void someMtd() { ... new A(); } } // The request for 'A' bypasses the custom classloader completely, because the calling class Z is 'tagged' as having been loaded by the system classloader. The Java security model says that future requests from Z go to Z's classloader.

The result is that A and Z both run under the parent/system classloader, and any requests made by them (via 'new') are handled by the system classloader. The problem here is that any security checking/validation/filtering done by the CCL is bypassed.
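
The effect described above can be observed on any standard Java runtime with a small sketch. RecordingLoader stands in for 'MyLoader' and DelegationDemo is a helper; both names are made up for this example, and java.util.ArrayList plays the part of a class that only the parent loader can supply.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for 'MyLoader': it sees every request but delegates.
class RecordingLoader extends ClassLoader {
    final List<String> requests = new ArrayList<>();

    RecordingLoader() {
        super(ClassLoader.getSystemClassLoader());
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        requests.add(name);                    // the CCL sees the request...
        return super.loadClass(name, resolve); // ...but a parent defines the class
    }
}

class DelegationDemo {
    // True if 'loader' was asked for the class but is not its defining loader.
    static boolean delegated(RecordingLoader loader, String name) {
        try {
            Class<?> c = loader.loadClass(name);
            return loader.requests.contains(name) && c.getClassLoader() != loader;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(delegated(new RecordingLoader(), "java.util.ArrayList"));
    }
}
```

Because the returned class reports a defining loader other than the RecordingLoader, any classes it goes on to reference are resolved without the custom loader being consulted, which is the bypass in question.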

This is proper behavior because you are delegating to the system classloader... the issue is that delegating to the system classloader is the only way to access those objects from within your CCL. So assuming that objects A and Z were native objects, and you had no .class file to load them from, your CCL becomes inoperative. If the entire application depends on the functionality of the CCL, then you cannot use GCJ to compile the objects natively, you'll have to run the app in non-native/JIT mode (I can't think of the proper term).

I hope it makes sense; I know it's all a bit messy.

Hence the need to be able to loadClass("X"), and from within a CCL return the native equivalent without delegating the request. The checksum matching/attribute bypass solutions you suggest should make this possible, in situations where the .class/.jar is available, and where it isn't, the other solution I proposed could work.

> An ugly hack I can think of would be to have defineClass return some
> custom data, such that it doesn't define a class with VM bytecode, but
> returns some string that references a native object
> (ugly_native_hack://foo.bar.classname). We'd have to bypass any
> standard bytecode verification in these special cases and pull that
> statically linked image out of storage instead. (Of course this is just
> speculation, I don't know enough about GCJ architecture to say it would
> be possible.) Has this already been accounted for in some design work
> previously?

This doesn't sound like the right thing to do.

If you could consider some of the previous points... while I agree embedding a text URL like that is probably a bad idea, being able to defineClass using some form of native data seems like the only way to work around some issues. I am open to any ideas for alternative solutions, of course.


> A third solution might be to have some external override. Maybe a
> directory with SOs or a configuration file that will list objects. Any
> objects in this list, when referenced via defineClass, will 'thunk' down
> to use a statically compiled version, or a native .so, regardless of what
> data the application is trying to define the class with.

Yeah, that's more or less Tromey's idea.  That's what we'll go with, I
expect.

Sounds good.
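
One possible shape for such an override list, assuming a plain properties-style file that maps class names to shared object paths. The NativeOverrides class, the file format, and the example path used with it are all hypothetical; nothing in the thread fixes these details.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch of an override table; not an existing GCJ facility.
class NativeOverrides {
    private final Properties table = new Properties();

    // Parses lines of the assumed form "fully.qualified.ClassName=/path/to/lib.so".
    static NativeOverrides parse(String configText) {
        NativeOverrides overrides = new NativeOverrides();
        try {
            overrides.table.load(new StringReader(configText));
        } catch (IOException e) {
            // A StringReader should not fail, but load() declares IOException.
            throw new IllegalStateException(e);
        }
        return overrides;
    }

    // Returns the shared object listed for the class, or null if none is listed.
    String libraryFor(String className) {
        return table.getProperty(className);
    }
}
```

A defineClass implementation following this idea would check libraryFor(name) first and only fall back to the supplied bytecode when it returns null.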


> I'll start digging through code. GCC is still a bit daunting as I'm new
> to it, so it may take a little while before I'll be able to
> contribute... if you know of some reading offhand (online or off) to
> bring me up to speed that would be great.

Don't worry about the compiler.  The library (gcc/libjava) is mostly
Java code.

BTW, there are some legal niceties that we'll need to talk about
before you contribute anything.

Basically, the FSF owns everything, right? :) If so, this isn't an issue. If you need to talk specifics, feel free to drop me a line. (gcj at svf dot dreamhost dot com)


- Sal

