Patch to enable libgcj.dll for MinGW

TJ Laurenzo tlaurenzo@gmail.com
Thu Sep 8 09:10:00 GMT 2005


Ok.  I figured it out.  This is an interaction between objects
produced with g++ and gcj.  My original recommendation to take this
optimization out still holds as a result of this test, on the grounds
that what we are asking the compilers to do is not equivalent to
simple inlining in a pure C++ program.  Further, I see no clear way to
make gcj output the "correct" code without some pretty nasty hacks.

The following C++ code produces an object file that would be roughly
equivalent to having gcj compile a class containing a getter that can
be decompiled to an inline method in C++:

myclass.cc - C++ equivalent of a simple Java class
-------------------
class myclass
{
private:
  int someField;
public:
  virtual int getSomeField();
  virtual int compute();	// Native
};


int myclass::getSomeField()
{
  return someField;
}
===end===

myclass.h - gcjh-like header for myclass
-----------------
#pragma interface

class myclass
{
private:
  int someField;
public:
  virtual int getSomeField()
  {
    return someField;
  }
  virtual int compute();
};
===end===

natmyclass.cc - CNI-like implementation of a native method in myclass
------------------
#pragma implementation "myclass.h"

#include "myclass.h"

int myclass::compute()
{
  return getSomeField() + 1;
}

// Throw in a main method so that we can build an executable
int main(int argc, char** argv)
{
  myclass inst;
  return inst.compute();
}
===end===

This may look right, and it even works on systems that implement
inlined virtual methods with weak linkage.  But there is a subtle
problem.  From the "Java" class point of view (myclass.cc), there is
no need to emit a definition with vague linkage, so getSomeField is
emitted as an ordinary strong definition.  From the C++
implementation point of view (natmyclass.cc), the compiler knows that
every source file that includes the class definition will carry a
duplicate definition of getSomeField, because the method is defined
inline.  The compiler therefore emits a definition with vague
linkage.

Now, this happens to work fine with Linux ld and the like because what
we have is a definition with strong linkage set against one with weak
linkage.  The linker discards the one with weak linkage and all is
well.  However, on Windows, what we have is one definition emitted as
a COMDAT and one as an ordinary .text definition.  To the linker these
are distinct.  The only way it would merge the two would be if both
were emitted as COMDATs, as would be the case if this had been a
standard C++ program where myclass.cc included the myclass.h header
along with its inline definitions.

In fact, compiling this on MinGW using the CVS HEAD compiler produces
the following:
-----------------------
M:\tmp\comdatinline\win32>g++ -c -o myclass.o myclass.cc
M:\tmp\comdatinline\win32>g++ -c -o natmyclass.o natmyclass.cc
M:\tmp\comdatinline\win32>g++ -o test.exe myclass.o natmyclass.o
natmyclass.o(.text$_ZN7myclass12getSomeFieldEv[myclass::getSomeField()]+0x0):natmyclass.cc:
multiple definition of `myclass::getSomeField()'
myclass.o(.text+0x0):myclass.cc: first defined here
collect2: ld returned 1 exit status
===end===

This is exactly what happens to natLogger.cc when building libjava.
So, can we just ditch this entire optimization, as my first patch did?
Even on Linux, gcjh is causing the compilers to produce non-standard
code.  It just so happens that it is benign in that case, whereas it is
a show-stopper for PE targets that use COMDAT to implement vague
linkage.
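
For illustration, here is a rough sketch of what the C++ side would
look like with the optimization removed, i.e. if the gcjh-like header
only declared getSomeField instead of defining it inline.  To keep it
self-contained I have collapsed everything into one file, so the
out-of-line getSomeField below merely stands in for the single
definition that would live in myclass.o.  The runDemo helper is
hypothetical and only exists to exercise the code.

```cpp
// Sketch of the header with no inline getter body: C++ translation
// units that include it emit no vague-linkage (COMDAT) copy of
// getSomeField().
class myclass
{
private:
  int someField;
public:
  virtual int getSomeField();   // declaration only, no inline body
  virtual int compute();        // native method
};

// Stand-in for the one strong definition that would come from the
// "Java" object file (myclass.o in the example above).
int myclass::getSomeField()
{
  return someField;
}

// CNI-like native method: the getter is now an ordinary call that
// resolves to the single definition above.
int myclass::compute()
{
  return getSomeField() + 1;
}

// Hypothetical helper, not part of the original example.
int runDemo()
{
  // Value-initialization zeroes someField, since myclass has no
  // user-provided constructor.
  myclass inst = myclass();
  return inst.compute();
}
```

With only one definition of getSomeField in play, the linker has
nothing to merge, so the COMDAT versus .text distinction never comes
up.  The cost is that the call can no longer be inlined from the
header.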

TJ


