This is the mail archive of the
java@gcc.gnu.org
mailing list for the Java project.
Re: progress on method-gc / mangling questions
- From: Bryce McKinlay <bryce at waitaki dot otago dot ac dot nz>
- To: Adam Megacz <gcj at lists dot megacz dot com>
- Cc: java at gcc dot gnu dot org
- Date: Tue, 22 Jan 2002 14:10:34 +1300
- Subject: Re: progress on method-gc / mangling questions
- References: <86lmeruye6.fsf@megacz.com>
Adam Megacz wrote:
>Okay, questions first, then an explanation of why I'm asking them...
>
> - Is there any documentation on the java symbol mangling procedure?
> Are there any guarantees about its stability in future releases?
> (I would guess that there are no guarantees, but I could be wrong).
>
The C++ ABI mangling is documented at
http://www.codesourcery.com/cxx-abi/abi.html#mangling
Java uses a subset of this. The mangling of function names is not likely
to change, barring any bugs in our implementation. One special case is
that int[] is mangled as JArray<int>. It would be nice to have a
(de)mangler implementation in Java so let us know if you write one ;-)
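As a rough illustration of the scheme described above, here is a minimal sketch of a mangler in Java for the simplest case only: a non-constructor method whose parameters are all primitives. The class name GcjMangleSketch and its mangle() helper are hypothetical, and the primitive-to-code table is an assumption based on the builtin-type codes in the C++ ABI document. Reference types, arrays such as JArray<int>, constructors, and non-ASCII identifiers are deliberately left out, since they involve the ABI's substitution rules and other special cases.

```java
import java.util.Map;

final class GcjMangleSketch {
    // Builtin-type codes from the C++ ABI. Which code each Java primitive
    // maps to is an assumption (based on its size), not taken from gcj itself.
    private static final Map<String, String> PRIMITIVES = Map.of(
        "boolean", "b", "byte", "c", "char", "w", "short", "s",
        "int", "i", "long", "x", "float", "f", "double", "d");

    // Mangle qualifiedClass.method(paramTypes...) as a nested name:
    // _ZN <len><part>... <len><method> E <parameter codes>
    // Only ASCII identifiers and primitive parameters are handled.
    static String mangle(String qualifiedClass, String method, String... paramTypes) {
        StringBuilder sb = new StringBuilder("_ZN");
        for (String part : qualifiedClass.split("\\.")) {
            sb.append(part.length()).append(part);
        }
        sb.append(method.length()).append(method).append('E');
        if (paramTypes.length == 0) {
            sb.append('v'); // an empty parameter list is encoded as void
        }
        for (String p : paramTypes) {
            String code = PRIMITIVES.get(p);
            if (code == null) {
                throw new IllegalArgumentException("unsupported type in this sketch: " + p);
            }
            sb.append(code);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(mangle("java.lang.String", "charAt", "int"));
        System.out.println(mangle("java.lang.Object", "hashCode"));
    }
}
```

Under these assumptions, String.charAt(int) comes out as _ZN4java4lang6String6charAtEi, which can be compared against the output of nm on a gcj-built object file.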
>- If there is no documentation, can anybody tell me the meaning of
> these symbol prefixes? I was able to guess the others.
>
> __CT_
> __IF_
> __CD_
> __CT_
> __FL_
> __MT_
>
Note that these are private/local symbols that are part of the
Class data; I doubt you should ever refer to them directly.
>- My binaries have a lot of sections named __Utf<n> where <n> is
> some number. What are these sections? String constants?
>
I don't know why they would be in different sections. The __Utf<n>
symbols are Utf8Constants. These are Utf8 strings with hashcode and
length attached. We use these to store reflection data strings and also
string constants. We use Utf8Consts instead of pre-initialized String
objects because a string object would contain a vtable pointer and thus
wouldn't usually be a true constant (as well as wasting a bit of space
in the binary). They are expanded into java.lang.String objects during
class initialization, with the String sharing the actual data. The
__Utf<n> symbols are also private; they are accessed via the class
constant data.
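To make the layout concrete, here is a small Java model of the idea: UTF-8 bytes stored together with a precomputed hash and length, and expanded into a java.lang.String on demand. This is only a sketch. The class name Utf8ConstSketch, the 16-bit hash width, and the hash function are all assumptions made for illustration, and unlike the real runtime this model copies the data rather than sharing it.

```java
import java.nio.charset.StandardCharsets;

final class Utf8ConstSketch {
    final int hash;     // precomputed hash; the 16-bit width is an assumption
    final int length;   // length of the UTF-8 data in bytes
    final byte[] data;  // the UTF-8 encoded characters

    Utf8ConstSketch(String s) {
        this.data = s.getBytes(StandardCharsets.UTF_8);
        this.length = data.length;
        // Placeholder hash: not necessarily the function the runtime uses.
        this.hash = s.hashCode() & 0xFFFF;
    }

    // Expand the constant into a String, as would happen at class
    // initialization. This model decodes a copy instead of sharing storage.
    String toJavaString() {
        return new String(data, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Utf8ConstSketch c = new Utf8ConstSketch("hello");
        System.out.println(c.length + " bytes, hash " + c.hash + ": " + c.toJavaString());
    }
}
```

Because the record holds only plain data and no vtable pointer, a compiler can emit it as a true constant, which is the point made above.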
>- How about __methods<n>? My guess would be that this is the data
> used for Method.invoke() and Class.getMethods()...
>
Actually __MT_ (methods table) is the Class.methods[] field which
contains the reflection data. I don't know what __methods is!
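For context, this is the Java-level API that such reflection data serves: looking a method up by name at run time and invoking it. The sketch below uses only the standard java.lang.reflect API. The class name ReflectionSketch and the callNoArg() helper are hypothetical, and nothing here shows how libgcj lays out or reads __MT_ internally.

```java
import java.lang.reflect.Method;

final class ReflectionSketch {
    // Look up a public no-argument method by name and invoke it on target.
    static Object callNoArg(Object target, String name) {
        try {
            Method m = target.getClass().getMethod(name);
            return m.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed: " + name, e);
        }
    }

    public static void main(String[] args) {
        // The method is chosen by a run-time string, so the runtime needs
        // per-class method metadata to resolve it.
        System.out.println(callNoArg("hello", "length"));
    }
}
```

Since the method name is only known at run time here, any scheme that discards unused methods has to take calls like this into account.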
>I've got "hello world" fully statically linked, into 330kb (gzipped)
>using this technique. Previously it was around 1.5MB (gzipped).
>
Cool!
>Obviously the ideal long-term solution is to get libjava.so /
>libjava.dll to be a standard part of OS distros (just like libc), and
>to use Bryce's indirect dispatch work to allow binary compatibility
>across library versions. Method-gc is just a short term hack until
>that happens (which sadly may be "never" on win32).
>
I don't think it would be unreasonable, in general, to expect developers
to distribute a libgcj.dll or whatever with or alongside their
applications, at least once binary compatibility is fully implemented.
It isn't much different from an application which requires the Visual
Basic DLLs, msvcrt.dll, or cygwin.dll. Still, this stuff sounds
very useful for embedded and other specialized applications!
regards
Bryce.