Investigating the internal dependencies and structure of libgcj

This pages documents a fairly ad-hoc and empirical study I did of the internal structure of libgcj, with the aim of discovering if the objects it contains could easily be partitioned into a number of individual subgroups without cyclic interdependencies.

The motivation for this research was to try and make a shared libgcj DLL buildable on Windows and other PE object format platforms. The PE file format has an inherent limitation that restricts DLLs to 64k exported symbols, and libgcj has (at time of writing) 82717 exported data and function symbols. If a DLL is built with more than that number of symbols, those above the 64k ordinal are inacessible; any attempt by another executable or libaray to import them is redirected to the symbol at the equivalent ordinal modulo 64k. This causes the executable to crash, generally in very early startup.

At present, the only working option for Windows users is to resort to static compilation of all java applications. Unfortunately the PE binutils toolchain doesn't do garbage-collection, and with static libgcj weighing in at around 90 megabytes, applications often pull in many tens of megabytes of code and can easily take five minutes and a gigabyte of memory in the final link stage. So this makes a shared library (DLL) highly desirable, and that leads to a requirement not to exceed the maximum legal number of exports.

One option for implementation would be to divide libgcj into a number of inter-dependent sublibraries. When doing this it is necessary to avoid cyclic dependencies among the resulting libraries, as this is not something that many OS loaders are likely to be entirely happy with, and could lead to upredictable and/or hard-to-diagnose breakage. So this research was carried out to gain an understanding of the shape of libgcj's internal dependencies, in order to evaluate whether the problem of partitioning libgcj is soluble, and if it can sufficiently reduce the maximum number of exports from any individual shared library to below the 64k system limit.

Data-collection Methodology

The data about the internal structure of libgcj were gathered on an empirical, rather than theoretical basis; that is to say, by compiling libgcj and analysing the generated binary objects, rather than by (for example) static analysis of the code base. Neither of these is an entirely neutral procedure in the presence of native (C or C++) objects that may, unlike java objects, be compiled with variant sections by use of preprocessor features such as macro defines and conditionals. A static code analysis might end up over-recording dependencies, as for instance it might not know that two macro definitions were mutually exclusive and have to assume both may be present at once; the other side of that coin is that an empirical binary analysis might under-estimate dependencies that would be present under other conditions of compilation or on other platforms. Neither of these conditions will necessarily fatally undermine our conclusions, as we may be lucky enough that any excess or omitted dependencies do not materially affect any decision we can make on potential partitionings of the library.

After bootstrapping the compiler and libgcj, I scanned all the objects (PIC versions only) in the $objdir/$target/libjava directory tree using 'nm' and extracted separate lists of all the unresolved references (imports) and all the externally-visible symbols (exports) for each object, using scripts something like:

find . -name '*.o' | grep '/.libs/' > .pic.o.lst

# Usage:
while read x ;
  echo $x;
  nm $x | grep " [TRDU] " ;
done < .pic.o.lst > .syms.log

# Usage:
grep "^\.\| [DRT] " < .syms.log > .exports.log

# Usage:
grep "^\.\| [U] " < .syms.log > .imports.log

I processed the list of exports into a colon-separated format where each line contains objectfile:sym def line from nm output using this script:

while read x;
  case $x in
      echo "$obj_file:$x"
done < .exports.log > exports.log

Finally I ran this script, which reads the list of undefined references, noting which object's symbols it is processing as it goes, and greps the symbol in the list of exports to find out which object defines it. This information is written to stdout in the dot language used by the Graphviz graph visualisation software.


func_unique ()
  func_unique_result=`echo "$1" | tr ' ' '\012' | sort | uniq | tr '\012' ' '`

func_dump_deps ()
  for func_dump_deps_dep in $2 ;
    echo \""$1"\" '->' \""$func_dump_deps_dep"\"

echo "strict digraph libjava_deps {"

while read inputline;
  case $inputline in
      # flush out old ops.
      if test "x$deps_for_obj" != x ; then
        func_unique "$deps_for_obj"
        echo >&2 "Unique:$deps_for_obj"
        func_dump_deps "$obj_file" "$deps_for_obj"
      echo >&2
      echo >&2
      echo >&2 "New file $inputline"
      findsym=${inputline#U }
      greppit=`grep " [TRD] $findsym\$" exports.log`
      echo >&2 "import:$obj_file:$findsym"
      echo >&2 -n "export...:$greppit: "
      if test -z "$greppit" ;
        echo >&2 "Not found" ;
        echo >&2 "Found in $foundinfile"
        deps_for_obj+=" "$foundinfile" "
        echo >&2 "deps:$deps_for_obj:" ;
done < .imports.log

# flush out old ops.
if test "x$deps_for_obj" != x ; then
  func_unique "$deps_for_obj"
  echo >&2 "Unique:$deps_for_obj"
  func_dump_deps "$obj_file" "$deps_for_obj"
echo >&2
echo >&2

echo "}"

The final result of this chain of scripts is the .dot file containing raw interdependency data for all the binary objects in libgcj which can be seen here.

Data Processing

The file is too dense to easily be processed directly. Attempts to render it with the Graphviz 'dot' command-line utility run seemingly forever, hopelessly attempting to resolve around a hundred-thousand edge crossings. Graphviz includes a utility called sccmap, which extracts strongly-connected components from a graph in dot format and emits a new file grouping each component as a subgraph and supplying an overall map of the relation between components. The resulting file is much simplified, and contains a very obvious "hub" named "cluster_48". For rendering, this file was manually extracted into two separate dot files, one a top-level overview derived from the component map, and one a map of the internal structure of the "cluster_48" component.

Processed Data Analysis

The highlevel overview makes it immediately clear that what I had hoped is indeed the case. There is a very clear partition indicated by cluster_48, which is the only cluster on its rank in the graph.

libjava-highlevel-view.png Danger! Huge (43259x827x16) and potentially browser-crashing .PNG file. In case of problems, disable images in your browser, use the "Download" link on the attachment page, and view in an external viewer or editor; or alternatively, try the PDF version below.

libjava-scc-top.pdf The toplevel view as a PDF, spread across numerous horizontally-contiguous pages (which appear to have been rendered into reverse order in the PDF file for reasons as yet unknown...)

The internal map of cluster_48 is attached just for interest.

libjava-big-ball-of-hair-cluster_48.png Danger! Huge (4593x7163x256) and potentially browser-crashing .PNG file. In case of problems, disable images in your browser, use the "Download" link on the attachment page, and view in an external viewer or editor; or alternatively, try the PDF version below.

cluster_48.pdf PDF version of the above, on one huge page.


The graph clearly pivots around cluster 48, and by taking the object files that were listed in cluster 48 and all the clusters ranked below it on the overview graph as one partition, we can see by visual inspection we have only incoming dependencies to the group, and therefore it is a self-contained partition that can be split off from the whole of libgcj and stand along, without causing unresolved references or circular dependencies. Referring back to the original -scc .dot file, this turns out to be all clusters from 0 up to 48 plus 92 and 93. Filtering only these object files into libgcj in fact resulted in a few unresolved references and a couple of iterations were required to trim the exact set of files to be included, presumably as a result of minor inaccuracies introduced when sccmap partitioned the graph into strongly-connected components. The outcome was this set of object files, which form an interlinked whole with no unresolved dependencies:

        $(propertyo_files) \
        gnu-CORBA.lo \
        gnu-java-awt-dnd-peer-gtk.lo \
        gnu-java-awt-peer-gtk.lo \
        gnu-java-awt-peer-swing.lo \
        gnu-java-lang-management.lo \
        gnu-javax-management.lo \
        gnu-javax-rmi.lo \
        gnu-javax-sound-midi.lo \
        gnu-xml-aelfred2.lo \
        gnu-xml-dom.lo \
        gnu-xml-libxmlj.lo \
        gnu-xml-pipeline.lo \
        gnu-xml-stream.lo \
        gnu-xml-transform.lo \
        gnu-xml-util.lo \
        gnu-xml-validation.lo \
        gnu-xml-xpath.lo \
        java-lang-management.lo \
        javax-imageio.lo \
        javax-rmi.lo \
        jni-libjvm.lo \
        org-omg-CORBA.lo \
        org-omg-CORBA_2_3.lo \
        org-omg-CosNaming.lo \
        org-omg-Dynamic.lo \
        org-omg-DynamicAny.lo \
        org-omg-IOP.lo \
        org-omg-Messaging.lo \
        org-omg-PortableInterceptor.lo \
        org-omg-PortableServer.lo \
        org-omg-SendingContext.lo \
        org-omg-stub.lo \
        org-relaxng.lo \
        org-xml.lo \
        META-INF/services/ \
        META-INF/services/java.util.prefs.PreferencesFactory.lo \
        META-INF/services/javax.sound.midi.spi.MidiDeviceProvider.lo \
        META-INF/services/javax.sound.midi.spi.MidiFileReader.lo \
        META-INF/services/javax.sound.midi.spi.MidiFileWriter.lo \
        META-INF/services/javax.sound.sampled.spi.AudioFileReader.lo \
        classpath/native/jni/classpath/jcl.lo \
        classpath/native/jni/classpath/jnilink.lo \
        classpath/native/jni/java-math/gnu_java_math_GMP.lo \
        classpath/tools/libgcj_tools_la-tools.lo \
        gnu/awt.lo \
        gnu/awt/j2d.lo \
        gnu/gcj/io.lo \
        gnu/gcj/io/natSimpleSHSStream.lo \
        gnu/gcj/io/shs.lo \
        gnu/gcj/tools/gcj_dbtool.lo \
        gnu/gcj/util/natDebug.lo \
        gnu/gcj/util/natGCInfo.lo \
        gnu/java/awt/dnd.lo \
        gnu/java/awt/font.lo \
        gnu/java/awt/image.lo \
        gnu/java/awt/print.lo \
        gnu/java/awt/font/autofit.lo \
        gnu/java/awt/font/opentype.lo \
        gnu/java/awt/font/opentype/truetype.lo \
        gnu/java/lang/management/natVMClassLoadingMXBeanImpl.lo \
        gnu/java/lang/management/natVMCompilationMXBeanImpl.lo \
        gnu/java/lang/management/natVMGarbageCollectorMXBeanImpl.lo \
        gnu/java/lang/management/natVMMemoryMXBeanImpl.lo \
        gnu/java/lang/management/natVMMemoryManagerMXBeanImpl.lo \
        gnu/java/lang/management/natVMMemoryPoolMXBeanImpl.lo \
        gnu/java/lang/management/natVMOperatingSystemMXBeanImpl.lo \
        gnu/java/lang/management/natVMRuntimeMXBeanImpl.lo \
        gnu/java/lang/management/natVMThreadMXBeanImpl.lo \
        gnu/java/net/local.lo \
        gnu/java/net/protocol/ftp.lo \
        gnu/java/net/protocol/gcjlib.lo \
        gnu/java/net/protocol/https.lo \
        gnu/javax/imageio.lo \
        gnu/javax/print.lo \
        gnu/javax/sound.lo \
        gnu/javax/activation/viewers.lo \
        gnu/javax/imageio/bmp.lo \
        gnu/javax/imageio/gif.lo \
        gnu/javax/imageio/jpeg.lo \
        gnu/javax/imageio/png.lo \
        gnu/javax/naming/giop.lo \
        gnu/javax/naming/ictxImpl/trans.lo \
        gnu/javax/naming/jndi/url/corbaname.lo \
        gnu/javax/naming/jndi/url/rmi.lo \
        gnu/javax/print/ipp.lo \
        gnu/javax/print/ipp/attribute.lo \
        gnu/javax/print/ipp/attribute/defaults.lo \
        gnu/javax/print/ipp/attribute/job.lo \
        gnu/javax/print/ipp/attribute/printer.lo \
        gnu/javax/print/ipp/attribute/supported.lo \
        gnu/javax/security/auth/login.lo \
        gnu/javax/sound/sampled/AU.lo \
        gnu/javax/sound/sampled/WAV.lo \
        gnu/javax/swing/plaf/gnu.lo \
        gnu/javax/swing/plaf/metal.lo \
        java/sql.lo \
        java/awt/im.lo \
        java/awt/print.lo \
        java/awt/im/spi.lo \
        java/security/acl.lo \
        javax/activation.lo \
        javax/activity.lo \
        javax/management.lo \
        javax/naming.lo \
        javax/print.lo \
        javax/sql.lo \
        javax/tools.lo \
        javax/transaction.lo \
        javax/management/loading.lo \
        javax/management/openmbean.lo \
        javax/management/remote.lo \
        javax/management/remote/rmi.lo \
        javax/naming/directory.lo \
        javax/naming/event.lo \
        javax/naming/ldap.lo \
        javax/naming/spi.lo \
        javax/print/attribute.lo \
        javax/print/event.lo \
        javax/print/attribute/standard.lo \
        javax/security/cert.lo \
        javax/security/auth/kerberos.lo \
        javax/security/auth/login.lo \
        javax/security/auth/spi.lo \
        javax/sound/midi.lo \
        javax/sound/midi/spi.lo \
        javax/swing/plaf/multi.lo \
        javax/swing/plaf/synth.lo \
        javax/swing/text/rtf.lo \
        javax/swing/text/html/default.css.lo \
        javax/transaction/xa.lo \
        org/ietf/jgss.lo \

I am using these results to develop a patch that builds libgcj as two separate shared libtool libraries on Windows platforms, containing the root partition of the graph at cluster 48 and below, and containing the "above-ground" components from the overview; the only dependencies are from to

This technique has now successfully bootstrapped and created two valid DLLs; cyggcj-11.dll, exporting 56780 symbols, and cyggcj-noncore-11.dll, exporting 25937 symbols. By these terms, the experiment is a success, as merely building the DLLs successfully confirms that we have successfully produced a non-cyclic partitioning.

But wait! There's more!

The overhead of taking several minutes to statically link every executable against a ninety-something megabyte library archive during the testsuite really adds up, whereas linking against a library full of DLL import stubs is comparatively fast - a matter of seconds - and the extra load-time for the DLLs is tiny. The result is that the runtime for the testsuite is cut from around 10 hours (on an AMD64x2 5000+ CPU) to around 3 hours! A similarly pleasing result is obtained for the size of the executables, as seen in these before-and-after examples taken from the left-over executables in the $objdir/$target/libjava/testsuite directory: before ...

$ find . -name '*.exe' | xargs du -h
45M     ./.libs/TestEarlyGC.exe
45M     ./.libs/TestLeak.exe
45M     ./.libs/TestMultiple.exe
45M     ./.libs/TestParent.exe

... and after:

$ find . -name '*.exe' | xargs du -h
92K     ./.libs/TestEarlyGC.exe
52K     ./.libs/TestLeak.exe
48K     ./.libs/TestMultiple.exe
48K     ./.libs/TestParent.exe

On the down side, there are a couple of regressions yet to investigate, but no particular reason to suppose they couldn't be fixed. The testsuite results can be seen here for the unpatched compiler with static linking, and here with libgcj compiled as a pair of DLLs.

None: Internal_dependencies_of_libgcj (last edited 2009-08-22 13:17:51 by DaveK)