This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.



Re: thoughts on new reflection data


Tom Tromey wrote:
Please comment on this.

I've looked into representing new reflection data a little bit.
(Some of the "new" reflection data is actually old stuff we haven't
supported, like getDeclaringClass.)

The full list of reflection data we need to support is:

* Inner class info (for getDeclaredClasses and getDeclaringClass)
* Enclosing method info
* Generic signature (for classes, fields, and methods)
* Annotations (likewise)
* Parameter annotations (for methods and constructors)
* Annotation defaults (for annotation methods)

The naive approach of course would be to add new fields to
_Jv_Method and _Jv_Field to hold things like the generic signature.

However using jcf-dump I looked at the contents of libgcj.  We have
74354 instances of "Descriptor:" (meaning a class, field, or method
descriptor) but only 1722 generic signatures.  So, simply adding
fields would involve a lot of bloat.

I think we can get by with much less overhead -- say 2 pointers per
class, plus the size of the reflection data itself.

First we would have a byte array that holds the reflection data
encoded in a .class-like way.  I say ".class-like" because we would
want to hold all reflection data in a single array, so the
method/field/etc to which a piece of data corresponds would have to be
encoded in the array itself.  One nice property here is that the
resulting data is pointer-free.

As a simple example, where in the JVM spec the Signature attribute
looks like:

{ u2 attribute_name_index;
  u4 attribute_length;
  u2 signature_index;
}

... in our encoding it would look like:

{ u1 object;           // field, method, class, or 'end of table'
  u1 kind;             // from an enum that lists all the attributes we care about;
                       // the compiler or class reader will simply ignore
                       // attributes not on this list
  u2 attribute_length;
  u2 object_index;     // index into the field or method table;
                       // we can omit this for 'class' object
  u2 signature_index;  // index into the constant pool
}
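
To make the record layout above concrete, here is a rough sketch of how a reader might scan such a byte array for the Signature record of one method or field. The tag values, the big-endian byte order (borrowed from .class files), the meaning of attribute_length (bytes following the length field), and the function names are all assumptions for illustration; none of them come from the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag values; the proposal does not define them. */
enum refl_object { REFL_END = 0, REFL_CLASS = 1, REFL_FIELD = 2, REFL_METHOD = 3 };
enum refl_kind { REFL_KIND_SIGNATURE = 1 };

/* Read a big-endian u2, as .class files store them (assumed here). */
static unsigned read_u2(const uint8_t *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

/* Scan the table for the Signature record of the given object.
   Returns the constant pool index, or -1 if there is no such record.
   Assumes attribute_length counts the bytes after the length field
   and that the table ends with a REFL_END byte. */
static int find_signature(const uint8_t *data, uint8_t object,
                          unsigned object_index)
{
    const uint8_t *p = data;

    assert(data != NULL);
    while (p[0] != REFL_END) {
        uint8_t obj = p[0];
        uint8_t kind = p[1];
        unsigned len = read_u2(p + 2);
        const uint8_t *body = p + 4;

        if (obj == object && kind == REFL_KIND_SIGNATURE) {
            if (obj == REFL_CLASS)
                return (int)read_u2(body);      /* object_index omitted */
            if (read_u2(body) == object_index)
                return (int)read_u2(body + 2);  /* signature_index */
        }
        p = body + len;                         /* skip to the next record */
    }
    return -1;
}
```

Because every record carries its own length, a reader can skip kinds it does not understand, and the whole table stays free of pointers.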



Second we would have a single hashmap per class which maps from Method/Field/Class to the reflection data for that object. This hashmap would be created and filled in lazily, so that we only incur overhead when asked. In most cases it would remain NULL over the class' lifetime.
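
As a rough illustration of the lazy scheme, the sketch below keeps two pointers per class and allocates the lookup structure only on the first reflective query. To stay short, a flat array indexed by method number stands in for the hashmap. The struct, the field names, and the decoder callback are hypothetical, and the sketch ignores locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-class record. Only refl_data and refl_cache are new;
   method_count is assumed to exist in the class already. */
struct klass {
    const unsigned char *refl_data;  /* encoded, pointer-free reflection bytes */
    const void **refl_cache;         /* NULL until the first reflective query */
    size_t method_count;
};

typedef const void *(*decode_fn)(const unsigned char *data, size_t method_index);

/* Return the decoded reflection data for one method, creating the cache
   and decoding the entry only when first asked. Not thread-safe. */
static const void *get_method_reflection(struct klass *k, size_t method_index,
                                         decode_fn decode)
{
    assert(method_index < k->method_count);
    if (k->refl_cache == NULL) {
        k->refl_cache = calloc(k->method_count, sizeof *k->refl_cache);
        if (k->refl_cache == NULL)
            return NULL;             /* out of memory */
    }
    /* A NULL result is not cached, so it is decoded again next time. */
    if (k->refl_cache[method_index] == NULL)
        k->refl_cache[method_index] = decode(k->refl_data, method_index);
    return k->refl_cache[method_index];
}

/* Stand-in decoder for demonstration: counts its calls and returns a
   pointer into the encoded data instead of building a real object. */
static int decode_calls;

static const void *demo_decode(const unsigned char *data, size_t method_index)
{
    decode_calls++;
    return data + method_index;
}
```

A class that is never queried reflectively pays only for the two pointers; the cache allocation and the decoding cost appear on first use.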

The expense of instantiating annotations is why I think that caching
is in order.  Also, an annotation instance is immutable, so sharing it
makes sense.  As this is an implementation detail we can play with it
a bit -- disable caching, or have the cache use SoftReferences,
whatever.

This sounds reasonable, but I have a couple of questions and/or suggestions:


Would it be possible or useful to move the name and signature members of _Jv_Field and _Jv_Method into these attributes?

Can the signature_index be an offset from the beginning of all the utf8*? I think the linker combines identical utf8 constants and groups them all together. It would probably have to be type u4 instead of u2.
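
Something like the following is what I have in mind for the offset lookup. This is only a sketch: the pooled layout (a u2 length followed by the bytes) is a simplified stand-in rather than the real utf8 constant layout, and utf8_at is a made-up name. The offset is a u4 because a u2 could address only the first 64K of a merged pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Look up a utf8 constant by its byte offset from the start of the pool.
   Assumed entry layout: big-endian u2 length, then that many bytes.
   Returns a pointer to the bytes and stores the length in *length_out. */
static const uint8_t *utf8_at(const uint8_t *pool_base, uint32_t offset,
                              unsigned *length_out)
{
    const uint8_t *entry;

    assert(pool_base != NULL);
    entry = pool_base + offset;
    *length_out = ((unsigned)entry[0] << 8) | entry[1];
    return entry + 2;
}
```

With this, the reflection data stays pointer-free and needs no relocations, at the cost of one addition per lookup.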

It might be nice to compress all the signatures/names, but that would make lookup more difficult and would probably require linker support.

Since most reflection data is never used (with the exception of that needed for interface dispatch table construction), compressing it might be a win.

David Daney

