This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.



Re: progress on method-gc / mangling questions


Adam Megacz wrote:

>Okay, questions first, then an explanation of why I'm asking them...
>
>  - Is there any documentation on the java symbol mangling procedure?
>    Are there any guarantees about its stability in future releases?
>    (I would guess that there are no guarantees, but I could be wrong).
>
The C++ ABI mangling is documented at 
http://www.codesourcery.com/cxx-abi/abi.html#mangling

Java uses a subset of this. The mangling of function names is not likely 
to change, barring any bugs in our implementation. One special case is 
that int[] is mangled as JArray<int>. It would be nice to have a 
(de)mangler implementation in Java so let us know if you write one ;-)
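
As a starting point for such a mangler, here is a minimal sketch in Java. 
It only covers the simplest case of the C++ ABI scheme: a dotted 
class/method name encoded as a nested name with length-prefixed parts, 
followed by the parameter type codes. The class name MangleSketch and the 
method mangle() are made up for illustration. The primitive type codes 
are the C++ ABI builtin codes, and their mapping to Java types is an 
assumption. Constructors, object parameters, non-ASCII identifiers, and 
the ABI's substitution compression are not handled.

```java
public final class MangleSketch {
    // Map a Java type name to its C++ ABI code. The Java-to-C++ type
    // mapping used here is an assumption made for this sketch.
    private static String typeCode(String javaType) {
        switch (javaType) {
            case "boolean": return "b";
            case "short":   return "s";
            case "int":     return "i";
            case "float":   return "f";
            case "double":  return "d";
            // Special case from the message: int[] is JArray<int>,
            // passed as a pointer to that template instance.
            case "int[]":   return "P6JArrayIiE";
            default:
                throw new IllegalArgumentException("unsupported type: " + javaType);
        }
    }

    // Mangle e.g. "java.lang.Object.hashCode" plus its parameter types.
    public static String mangle(String qualifiedMethod, String... paramTypes) {
        StringBuilder out = new StringBuilder("_ZN");
        for (String part : qualifiedMethod.split("\\.")) {
            // Each name component is emitted as <length><identifier>.
            out.append(part.length()).append(part);
        }
        out.append('E');
        if (paramTypes.length == 0) {
            out.append('v'); // an empty parameter list is encoded as void
        }
        boolean seenArray = false;
        for (String type : paramTypes) {
            if (type.endsWith("[]")) {
                if (seenArray) {
                    // A repeated compound type must be compressed with a
                    // substitution, which this sketch does not implement.
                    throw new UnsupportedOperationException("substitutions not implemented");
                }
                seenArray = true;
            }
            out.append(typeCode(type));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mangle("java.lang.Object.hashCode"));
        System.out.println(mangle("Foo.bar", "int[]"));
    }
}
```

Under these assumptions, java.lang.Object.hashCode() comes out as 
_ZN4java4lang6Object8hashCodeEv, and a hypothetical Foo.bar(int[]) as 
_ZN3Foo3barEP6JArrayIiE.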

>  - If there is no documentation, can anybody tell me the meaning of
>    these symbol prefixes? I was able to guess the others.
>
>       __CT_
>       __IF_
>       __CD_
>       __CT_
>       __FL_
>       __MT_
>

Note that these are private/local symbols that are part of the Class 
data. I doubt you should ever refer to them directly.

>  - My binaries have a lot of sections named __Utf<n> where <n> is
>    some number. What are these sections? String constants?
>
I don't know why they would be in different sections. The __Utf<n> 
symbols are Utf8Consts: UTF-8 strings with a hash code and length 
attached. We use these to store reflection data strings and also 
string constants. We use Utf8Consts instead of pre-initialized String 
objects because a String object would contain a vtable pointer and thus 
wouldn't usually be a true constant (it would also waste a bit of space 
in the binary). They are expanded into java.lang.String objects during 
class initialization, with the String sharing the actual data. The 
__Utf<n> symbols are also private; they are accessed via the class 
constant data.
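
To make the layout concrete, here is a rough Java model of such a 
constant: a hash, a length, and the raw UTF-8 bytes, with a method that 
expands it into a java.lang.String on demand. The names Utf8ConstSketch 
and toJavaString() are invented for illustration, and the hash shown is 
only a placeholder, not the function the runtime actually uses. Unlike 
the real thing, this model copies the data rather than sharing it.

```java
import java.nio.charset.StandardCharsets;

public final class Utf8ConstSketch {
    final int hash;     // placeholder hash (assumption, not the runtime's function)
    final int length;   // number of UTF-8 bytes in data
    final byte[] data;  // the encoded string contents

    Utf8ConstSketch(String s) {
        this.data = s.getBytes(StandardCharsets.UTF_8);
        this.length = data.length;
        this.hash = s.hashCode();
    }

    // Expand the constant into a String, as would happen once at
    // class initialization time in the scheme described above.
    String toJavaString() {
        return new String(data, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Utf8ConstSketch c = new Utf8ConstSketch("hello");
        System.out.println(c.length + " bytes -> " + c.toJavaString());
    }
}
```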

>  - How about __methods<n>? My guess would be that this is the data
>    used for Method.invoke() and Class.getMethods()...
>

Actually __MT_ (methods table) is the Class.methods[] field which 
contains the reflection data. I don't know what __methods is!

>I've got "hello world" fully statically linked, into 330kb (gzipped)
>using this technique. Previously it was around 1.5MB (gzipped).
>
Cool!

>Obviously the ideal long-term solution is to get libjava.so /
>libjava.dll to be a standard part of OS distros (just like libc), and
>to use Bryce's indirect dispatch work to allow binary compatibility
>across library versions. Method-gc is just a short term hack until
>that happens (which sadly may be "never" on win32).
>
I don't think it would be unreasonable, in general, to expect developers 
to distribute a libgcj.dll or whatever with or alongside their 
applications, at least once binary compatibility is fully implemented. 
It isn't much different from an application which requires the Visual 
Basic DLLs, msvcrt.dll, or cygwin.dll. Still, this stuff sounds 
very useful for embedded and other specialized applications!

regards

Bryce.


