This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.



Re: class metadata (was Re: GCJ information)


Bryce McKinlay wrote:

> Per Bothner wrote:
>> I've done a little thinking about this.  I suggest *not* decompressing
>> during initialization, but deferring it until we actually need it,
>> for reflection, JNI, or serialization.  
>  
> For my binary compatibility patch, they must be decompressed 
> before/during class initialization. Also the interpreter needs them, as 
> does the interface table layout code (though that could/should be moved 
> to compile time for the --no-binary-compatibility case).


I'm thinking that the compressed format should be readable without
de-compression.  It just would be less efficient.  I.e., the
format would be ok for sequential reading, and class initialization,
but not necessarily 'GetField'.

I think a primary goal of any compression is to reduce start-up
overhead.  That among other things means reducing relocation.

I think we can get some of the advantages of compression by just making
_Jv_Field and _Jv_Method variable-sized, and moving any pointers out.

How about something like this:

static Utf8Const *_NameTable[] = { &_Utf_foo, &_Utf_bar, ... };
static void *_TypeTable[] = { &_Utf_java_lang_String + 1, ... };

These are both per-compilation unit.  Each element of _NameTable
points to a Utf8Const, which can be shared using existing linker
magic. Each element of _TypeTable initially points to the Utf8Const,
but has the high bit set; after type resolution, it points to a Class.
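
A minimal sketch of how such a tagged _TypeTable entry could be
resolved lazily.  Everything here is illustrative: the struct layouts,
the helper names (make_unresolved, is_resolved, resolve_type) and the
Resolver callback are hypothetical, not libgcj code.  The sketch tags
the low bit, on the assumption that Utf8Const and Class objects are at
least 2-byte aligned; a real implementation would pick whichever bit
is known to be free on the target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the real runtime types.
struct Utf8Const { const char *data; };
struct Class { const char *name; };

// Tag bit marking an entry that still refers to a Utf8Const.
// Assumption: the low bit is free because both types are aligned.
constexpr std::uintptr_t kUnresolved = 1;

inline void *make_unresolved(const Utf8Const *name) {
    return reinterpret_cast<void *>(
        reinterpret_cast<std::uintptr_t>(name) | kUnresolved);
}

inline bool is_resolved(const void *entry) {
    return (reinterpret_cast<std::uintptr_t>(entry) & kUnresolved) == 0;
}

inline const Utf8Const *unresolved_name(const void *entry) {
    return reinterpret_cast<const Utf8Const *>(
        reinterpret_cast<std::uintptr_t>(entry) & ~kUnresolved);
}

// Hypothetical hook that maps a type name to its Class.
using Resolver = Class *(*)(const Utf8Const *);

// Return the Class for a table slot, resolving it on first use and
// overwriting the slot so later lookups are a plain pointer load.
inline Class *resolve_type(void **type_table, unsigned index,
                           Resolver find_class) {
    void *entry = type_table[index];
    if (!is_resolved(entry)) {
        entry = find_class(unresolved_name(entry));
        type_table[index] = entry;
    }
    return static_cast<Class *>(entry);
}
```

Only the table slot is written during resolution, so the per-field
and per-method records that index into it can stay read-only.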

Each Class contains a pointer to its _NameTable, its _TypeTable,
and as before a _Jv_Field and _Jv_Method array.  But those arrays
are actually just byte-arrays, in read-only pointer-free memory.
Method/field names/types are indexes into the class's _NameTable
or _TypeTable.  In most cases these offsets can be two bytes.
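
To make the byte-array idea concrete, here is a sketch of a
sequential reader for such field records.  The exact encoding is an
assumption made for illustration: an index below 128 takes one byte,
a larger one takes two bytes with the top bit of the first byte set.
FieldInfo, read_index, read_field and nth_field are hypothetical
names.

```cpp
#include <cassert>
#include <cstdint>

// Decoded view of one field record (hypothetical layout).
struct FieldInfo {
    unsigned type_index;  // index into the class's _TypeTable
    unsigned name_index;  // index into the class's _NameTable
    unsigned offset;
    unsigned flags;
};

// Assumed encoding: index < 128 is one byte; otherwise two bytes,
// 0x80 | (high 7 bits) followed by the low 8 bits.
inline unsigned read_index(const std::uint8_t *&p) {
    unsigned first = *p++;
    if (first < 0x80)
        return first;
    unsigned high = first & 0x7F;
    unsigned low = *p++;
    return (high << 8) | low;
}

// Decode one record and advance p past it.
inline FieldInfo read_field(const std::uint8_t *&p) {
    FieldInfo f;
    f.type_index = read_index(p);
    f.name_index = read_index(p);
    f.offset = *p++;
    f.flags = *p++;
    return f;
}

// Records are variable-sized, so reaching the n-th one means
// decoding every record before it.
inline FieldInfo nth_field(const std::uint8_t *fields, unsigned n) {
    const std::uint8_t *p = fields;
    FieldInfo f = read_field(p);
    for (unsigned i = 0; i < n; ++i)
        f = read_field(p);
    return f;
}
```

This is the trade-off mentioned earlier: a front-to-back walk is
cheap, while random access to a single field costs a scan.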

So a _Jv_Field would usually be 1-2 bytes for type index + 1-2
bytes for name index + 1 byte for offset + 1 byte for flags
i.e. ~5 bytes.  If the name and type are only used once, we need
to add 4+4 bytes for entries in _NameTable and _TypeTable for a
total of 13 bytes. Contrast this with 16 bytes for the current
implementation.  The savings are more substantial if the
_NameTable or _TypeTable entries can be shared:  Not just the
4 bytes per shared entry, but also removing one relocation.
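
The arithmetic above can be written down as a small sketch.  The
function names are hypothetical; the constants are the figures from
this message (4-byte pointers, 16 bytes per field in the current
layout, 1 byte each for offset and flags).

```cpp
#include <cassert>

constexpr unsigned kPointerSize = 4;        // 32-bit target assumed
constexpr unsigned kCurrentFieldSize = 16;  // current _Jv_Field

// Size of one compact record: two indexes plus offset and flags.
constexpr unsigned compact_record_size(unsigned type_index_bytes,
                                       unsigned name_index_bytes) {
    return type_index_bytes + name_index_bytes + 1 + 1;
}

// Total cost for a set of fields: the records themselves plus one
// pointer for every distinct _NameTable or _TypeTable entry they use.
constexpr unsigned compact_total(unsigned fields, unsigned record_bytes,
                                 unsigned table_entries) {
    return fields * record_bytes + table_entries * kPointerSize;
}

// One field, nothing shared: 5 + 4 + 4 = 13 bytes versus 16.
static_assert(compact_record_size(1, 2) == 5, "about 5 bytes per record");
static_assert(compact_total(1, 5, 2) == 13, "worst case from the text");

// Ten fields with ten distinct names but only three distinct types:
// 10*5 + 13*4 = 102 bytes versus 10*16 = 160.
static_assert(compact_total(10, 5, 13) == 102, "sharing types helps");
```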

Everything you can do with the current format you can also do
with the compressed format.  *However*, you also have to pass in
the "context" Class in some cases where you don't have to with
the current encoding.
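
A sketch of what "passing in the context Class" might look like.
For brevity it assumes fixed 4-byte records with 1-byte indexes
(type index, name index, offset, flags); the Class layout, FieldRef,
field_name and find_field are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct Utf8Const { const char *data; };  // hypothetical stand-in

// Hypothetical class descriptor: the field records are raw bytes and
// only make sense together with this class's name table.
struct Class {
    const Utf8Const *const *name_table;
    const std::uint8_t *fields;  // 4 bytes each: type, name, offset, flags
    unsigned field_count;
};

// A record holds no pointers, so a field handle has to carry the
// class that owns it.
struct FieldRef {
    const Class *context;
    const std::uint8_t *record;
};

inline const char *field_name(const FieldRef &f) {
    return f.context->name_table[f.record[1]]->data;
}

// Linear search by name; returns false if the class has no such field.
inline bool find_field(const Class &cls, const char *name, FieldRef &out) {
    for (unsigned i = 0; i < cls.field_count; ++i) {
        const std::uint8_t *rec = cls.fields + 4 * i;
        if (std::strcmp(cls.name_table[rec[1]]->data, name) == 0) {
            out = {&cls, rec};
            return true;
        }
    }
    return false;
}
```

With the current pointer-based layout a field record alone is enough
to recover its name; here the record must travel with its Class.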

> I suspect that being able to get rid of unneeded 
> classes will save far more space than compressed metadata!


Yes, that seems likely.


> Other reasons why compressed metadata might be bad:
> 
> _Jv_Methods etc are smaller than mangled C++ symbols because the 
> compiler/linker merge duplicate constants. Thus the same method name in 
> different classes gets merged to the same Utf8Const for example. So, by 
> trading mangled symbols for _Jv_Methods, binaries become smaller.


My suggestion above, using a _NameTable and _TypeTable, is compatible
with this sharing.


>> typedef java::lang::reflect::Field _Jv_Field;
>> typedef java::lang::reflect::Method _Jv_Method;
>>
> Well, a java.lang.reflect.Method is only needed for reflection, so I'm 
> not sure that's a win either (hmm... maybe for a JIT written in 
> Java...?). It adds a relocation unless you want to set the vtable 
> pointer lazily.


If we did this, the assumption would be that most of the time
you don't actually need the _Jv_Method.
-- 
	--Per Bothner
per@bothner.com   http://www.bothner.com/per/

