compiling multiple source files at once

Per Bothner per@bothner.com
Mon Feb 5 18:09:00 GMT 2001


I've written code so one can specify multiple source files on the gcj
command line, and still specify a -o output file for the combined
output.  (Note I am not planning on checking this in until after the
gcc 3.0 branch!)  For example:

$ gcj -g --CLASSPATH=../.. -c Access.java Attribute.java AttrContainer.java ObjectType.java ArrayType.java ClassType.java ConstantPool.java CpoolClass.java CpoolEntry.java CpoolValue1.java CpoolValue2.java CpoolNameAndType.java CpoolRef.java CpoolString.java CpoolUtf8.java Filter.java Location.java Field.java Label.java IfState.java TryState.java SwitchState.java Method.java CodeAttr.java CodeFragment.java ConstantValueAttr.java LineNumbersAttr.java LocalVarsAttr.java InnerClassesAttr.java MiscAttr.java PrimType.java Scope.java SourceFileAttr.java Type.java Variable.java VarEnumerator.java ZipArchive.java ZipMember.java ZipLoader.java ArrayClassLoader.java ClassFileInput.java ClassTypeWriter.java ExceptionsAttr.java dump.java -o gnu-bytecode.o -v 
$ gcj -o dump --main=gnu.bytecode.dump gnu-bytecode.o

The most obvious advantage of doing it this way is that it substantially
speeds up compilation.  (At least I would assume so - I don't have numbers.)
The reason is that many Java packages have major cross references, so
compiling A.java (or A.class) will usually require reading and analyzing
B.java and C.java (or B.class and C.class) as well.  So if we need to
compile classes A, B, and C, we might as well compile them all in one go.

Of course if only A.java is modified, then it would be more efficient
not to have to recompile B.java and C.java as well.  However, the
savings from re-compiling just A.java are reduced because we still have
to read B.java and C.java.  So if you write your Makefiles to exploit
the new feature, your development re-builds may be slightly slower, but
"production" builds should be much faster - which is of course the
correct tradeoff.  (One could imagine a configure option to select
between the build styles.)
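
As a rough sketch of what selecting between the two build styles might
look like, the shell function below only prints the gcj commands each
style would run; it does not invoke the compiler.  The function name
plan_build, the style names "combined" and "separate", and the output
name abc.o are all made up for illustration.

```shell
# Hypothetical helper: print (do not run) the compile commands for a build style.
#   plan_build combined A.java B.java  -> one gcj invocation for all sources
#   plan_build separate A.java B.java  -> one gcj invocation per source
plan_build() {
    style=$1
    shift
    if [ "$style" = combined ]; then
        # "Production" style: all sources in one go, one combined object file.
        echo "gcj -c -o abc.o $*"
    else
        # "Development" style: one object file per source file.
        for src in "$@"; do
            echo "gcj -c -o ${src%.java}.o $src"
        done
    fi
}

plan_build combined A.java B.java C.java
plan_build separate A.java B.java C.java
```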

I mention this on the general gcc list because the same idea could also
be useful for compiling (say) large C++ projects.  Instead of compiling
A.cc, B.cc, and C.cc separately, each of which includes large header files
A.h, B.h, C.h, and X.h, you would compile them all at once:
        g++ -c -o abc.o A.cc B.cc C.cc
We could require "well-behaved" header files for this, so
compiling A.cc, B.cc, and C.cc would be treated as essentially the same
as abc.cc:
        #include "A.cc"
        #include "B.cc"
        #include "C.cc"

except that some care might be needed to deal with static symbols (and
namespaces).  But doing this should substantially speed up compilations.
It would also create much smaller executables, since there would be only a
*single* copy (in abc.o) of the declarations in the header files.
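
The effect can be approximated by hand by generating a wrapper like the
abc.cc shown above.  A minimal sketch, assuming A.cc, B.cc, and C.cc
sit in the current directory and really are safe to combine into one
translation unit (no clashing static symbols); the g++ step is left as
a comment rather than run:

```shell
# Write a wrapper that pulls each source file into a single translation unit.
printf '#include "%s"\n' A.cc B.cc C.cc > abc.cc
cat abc.cc
# The compile step would then be (not run here):
#   g++ -c -o abc.o abc.cc
```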

There are other advantages besides compilation speed.  For Java, we
can add an extra flag specifying that an entire "package" is being
compiled, which allows valuable optimizations.  In any case, we can
save space due to reduced duplication of things like string literals.

I also added support for:
        gcj -c -o foo.o @foo.list
where foo.list is a file containing a list of input file names.
This feature (which also exists in Sun's javac) is sometimes
convenient, especially if there may be problems with overlong
command lines.  The multiple-input-files-on-the-command-line form
is essentially syntactic sugar for this feature.  I.e.
        gcj -c -o foo.o A.java B.java C.java 
is converted by jvspec to:
        gcj -c -o foo.o @/tmp/tmpfile
where /tmp/tmpfile contains "A.java\nB.java\nC.java\n".
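
The rewrite can be illustrated with a small shell function.  This is
only a sketch of the idea, not the actual jvspec code; the name
make_arg_file is hypothetical, and it assumes mktemp is available.

```shell
# Hypothetical helper: write each source file name on its own line to a
# temporary list file, then print the "@file" argument for the compiler.
make_arg_file() {
    list=$(mktemp) || return 1
    printf '%s\n' "$@" > "$list"
    printf '@%s\n' "$list"
}

arg=$(make_arg_file A.java B.java C.java)
# Show the list file contents: one input file name per line.
cat "${arg#@}"
# The compile step would then be (not run here):
#   gcj -c -o foo.o "$arg"
```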

I also had to fix some problems in jc1 so compiling multiple
input files could work.  (It had some partial but non-working support.)

I'll post patches later, after the branch, or if there is interest.
-- 
	--Per Bothner
per@bothner.com   http://www.bothner.com/~per/

