This is the mail archive of the
java@gcc.gnu.org
mailing list for the Java project.
Re: class metadata (was Re: GCJ information)
Bryce McKinlay wrote:
> Per Bothner wrote:
>> I've done a little thinking about this. I suggest *not* decompressing
>> during initialization, but deferring it until we actually need it,
>> for reflection, JNI, or serialization.
>
> For my binary compatibility patch, they must be decompressed
> before/during class initialization. Also the interpreter needs them, as
> does the interface table layout code (though that could/should be moved
> to compile time for the --no-binary-compatibility case).
I'm thinking that the compressed format should be readable without
decompression. It would just be less efficient. I.e. the
format would be OK for sequential reading and class initialization,
but not necessarily for 'GetField'.
I think a primary goal of any compression is to reduce start-up
overhead. That among other things means reducing relocation.
I think we can get some of the advantages of compression by just making
_Jv_Field and _Jv_Method variable-sized, and moving any pointers out.
How about something like this:
static Utf8Const *_NameTable[] = { &_Utf_foo, &_Utf_bar, ... };
static void *_TypeTable[] = { &_Utf_java_lang_String + 1, ... };
These are both per-compilation unit. Each element of _NameTable
points to a Utf8Const, which can be shared using existing linker
magic. Each element of _TypeTable initially points to the Utf8Const,
but has the high bit set; after type resolution, it points to a Class.
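
A minimal sketch of how a tagged _TypeTable entry could be resolved
lazily. Everything here is an assumption made for illustration: the
Utf8Const and Class structs are stand-ins for the real runtime types,
and make_unresolved, is_unresolved, resolve_type and the Resolver
callback are hypothetical names. The text above says "high bit", while
the initializer uses "+ 1"; this sketch tags the low bit, which assumes
the pointed-to objects are at least 2-byte aligned.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the runtime's real types.
struct Utf8Const { const char *data; };
struct Class { const char *name; };

// Tag bit marking an entry that still points at a type name.
// Assumption: the low bit is free because both structs are pointer-aligned.
constexpr std::uintptr_t kUnresolvedTag = 1;

inline void *make_unresolved(const Utf8Const *name) {
    return reinterpret_cast<void *>(
        reinterpret_cast<std::uintptr_t>(name) | kUnresolvedTag);
}

inline bool is_unresolved(const void *entry) {
    return (reinterpret_cast<std::uintptr_t>(entry) & kUnresolvedTag) != 0;
}

// Hypothetical resolver callback: maps a type name to its Class.
using Resolver = Class *(*)(const Utf8Const *);

// Return the Class for type_table[index], resolving and caching on first use.
inline Class *resolve_type(void **type_table, unsigned index, Resolver find_class) {
    void *entry = type_table[index];
    if (!is_unresolved(entry))
        return static_cast<Class *>(entry);

    // Strip the tag to recover the Utf8Const pointer, then look the type up.
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(entry) & ~kUnresolvedTag;
    const Utf8Const *name = reinterpret_cast<const Utf8Const *>(bits);
    Class *cls = find_class(name);
    assert(cls != nullptr && !is_unresolved(cls));

    type_table[index] = cls;  // overwrite the tagged name with the resolved Class
    return cls;
}
```

Only the _TypeTable slot is written during resolution, so the records
that refer to it by index can stay in read-only memory.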
Each Class contains a pointer to its _NameTable, its _TypeTable,
and as before a _Jv_Field and _Jv_Method array. But those arrays
are actually just byte-arrays, in read-only pointer-free memory.
Method/field names/types are indexes into the class's _NameTable
or _TypeTable. In most cases these offsets can be two bytes.
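
To illustrate "readable without decompression", here is a rough sketch
of reading such a byte-array sequentially. The exact encoding is an
assumption of this sketch, not something specified above: an index
below 0x80 takes one byte, a larger one (up to 0x7FFF) takes two bytes
with the top bit of the first byte set, followed by one byte of offset
and one byte of flags. FieldRecord, read_index, read_field and
nth_field are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decoded view of one field record (hypothetical layout).
struct FieldRecord {
    unsigned type_index;  // index into the class's _TypeTable
    unsigned name_index;  // index into the class's _NameTable
    std::uint8_t offset;
    std::uint8_t flags;
};

// Read a 1- or 2-byte index and advance p past it.
inline unsigned read_index(const std::uint8_t *&p) {
    unsigned first = *p++;
    if ((first & 0x80) == 0)
        return first;             // one-byte form
    unsigned second = *p++;
    return ((first & 0x7F) << 8) | second;  // two-byte form
}

// Decode the record at p and advance p to the start of the next record.
inline FieldRecord read_field(const std::uint8_t *&p) {
    FieldRecord f;
    f.type_index = read_index(p);
    f.name_index = read_index(p);
    f.offset = *p++;
    f.flags = *p++;
    return f;
}

// Find the n-th record (0-based) by scanning from the start of the array.
// Records are variable-sized, so there is no constant-time indexing.
inline FieldRecord nth_field(const std::uint8_t *fields, std::size_t n) {
    const std::uint8_t *p = fields;
    FieldRecord f = read_field(p);
    for (std::size_t i = 0; i < n; ++i)
        f = read_field(p);
    return f;
}
```

The linear scan in nth_field is the cost referred to earlier: fine for
walking all fields during class initialization, slower for random
access by name.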
So a _Jv_Field would usually be 1-2 bytes for type index + 1-2
bytes for name index + 1 byte for offset + 1 byte for flags
i.e. ~5 bytes. If the name and type are only used once, we need
to add 4+4 bytes for entries in _NameTable and _TypeTable for a
total of 13 bytes. Contrast this with 16 bytes for the current
implementation. The savings are more substantial if the
_NameTable or _TypeTable entries can be shared: not just the
4 bytes per shared entry, but also one fewer relocation.
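
The arithmetic can be written out as a small sketch, assuming 4-byte
pointers (a 32-bit target) and the 16-byte figure quoted above for the
current _Jv_Field. record_size and compressed_total are hypothetical
helper names; the "~5 bytes" estimate sits between the 4-byte and
6-byte cases computed here.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 32-bit target, so each table entry is a 4-byte pointer.
constexpr std::size_t kPointerSize = 4;
// Size of the current _Jv_Field, as quoted in the text.
constexpr std::size_t kCurrentFieldSize = 16;

// Bytes for one compressed record: type index + name index + offset + flags.
constexpr std::size_t record_size(std::size_t type_idx_bytes,
                                  std::size_t name_idx_bytes) {
    return type_idx_bytes + name_idx_bytes + 1 + 1;
}

// Total bytes for n_fields records plus the table entries they refer to.
// `names` and `types` count the distinct _NameTable/_TypeTable entries.
constexpr std::size_t compressed_total(std::size_t n_fields, std::size_t names,
                                       std::size_t types, std::size_t idx_bytes) {
    return n_fields * record_size(idx_bytes, idx_bytes)
         + (names + types) * kPointerSize;
}

// Unshared worst case for a single field: 12 or 14 bytes, versus 16 today.
static_assert(compressed_total(1, 1, 1, 1) == 12, "1-byte indexes");
static_assert(compressed_total(1, 1, 1, 2) == 14, "2-byte indexes");
```

For example, ten fields with distinct names that all share one type
entry would come to compressed_total(10, 10, 1, 1) bytes, against
10 * kCurrentFieldSize for the current layout.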
Everything you can do with the current format you can also do
with the compressed format; *however*, you also have to pass in
the "context" Class in some cases where you don't have to with
the current encoding.
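
A rough illustration of the extra "context" argument. The struct
layouts and the names OldField, CompactField, field_name and
field_has_name are hypothetical, chosen only to show why an index-based
record cannot be interpreted without its owning Class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for the runtime's real types.
struct Utf8Const { const char *data; };

struct Class {
    const Utf8Const *const *name_table;  // the per-compilation-unit _NameTable
    std::size_t name_count;
};

// Current style: the record carries its own pointer, so no context is needed.
struct OldField { const Utf8Const *name; };

inline const char *field_name(const OldField &f) {
    return f.name->data;
}

// Compressed style: the record only holds an index into the _NameTable.
struct CompactField { std::uint16_t name_index; };

// The owning Class must be passed in to turn the index into a name.
inline const char *field_name(const Class &context, const CompactField &f) {
    assert(f.name_index < context.name_count);
    return context.name_table[f.name_index]->data;
}

inline bool field_has_name(const Class &context, const CompactField &f,
                           const char *wanted) {
    return std::strcmp(field_name(context, f), wanted) == 0;
}
```

Callers that currently pass around a bare field pointer would need to
carry the Class alongside it.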
> I suspect that being able to get rid of unneeded
> classes will save far more space than compressed metadata!
Yes, that seems likely.
> Other reasons why compressed metadata might be bad:
>
> _Jv_Methods etc are smaller than mangled C++ symbols because the
> compiler/linker merge duplicate constants. Thus the same method name in
> different classes gets merged to the same Utf8Const for example. So, by
> trading mangled symbols for _Jv_Methods, binaries become smaller.
My suggestion above, using a _NameTable and _TypeTable, is compatible
with this sharing.
>> typedef java::lang::reflect::Field _Jv_Field;
>> typedef java::lang::reflect::Method _Jv_Method;
>>
> Well, a java.lang.reflect.Method is only needed for reflection, so I'm
> not sure that's a win either (hmm... maybe for a JIT written in
> Java...?). It adds a relocation unless you want to set the vtable
> pointer lazily.
If we did this, the assumption would be that most of the time
you don't actually need the _Jv_Method.
--
--Per Bothner
per@bothner.com http://www.bothner.com/per/