This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.



Re: compacting _Jv_Utf8Const


Tom Tromey wrote:

Per> $ nm libgcj.so|grep ' _Utf'|awk '{print $1}'>/tmp/bar
Per> $ uniq </tmp/bar|wc
Per>    54395   54395  489555

This doesn't measure duplication of contents, only symbol names.

Yes and no. It was meant to measure "post-duplicate-elimination". The 'print $1' selects out the *addresses*, not names. However, I forgot to sort by address, and uniq only collapses duplicate lines that are adjacent. Sorting can be done by passing -n to nm, or by adding an explicit 'sort'. If we do that, we get:

   24533   24533  220797
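A minimal sketch of why the sort matters, using a handful of made-up addresses rather than real libgcj.so output (the sample_addresses helper and its values are hypothetical, for illustration only):

```shell
# Hypothetical sample of the address column; NOT taken from libgcj.so.
sample_addresses() {
  printf '%s\n' 0000a010 0000a020 0000a010 0000a030 0000a020
}

# uniq alone only collapses adjacent duplicates, so nothing is removed here.
count_unsorted=$(sample_addresses | uniq | wc -l | tr -d ' ')

# Sorting first makes equal addresses adjacent, so uniq can collapse them.
count_sorted=$(sample_addresses | sort | uniq | wc -l | tr -d ' ')

echo "without sort: $count_unsorted"
echo "with sort:    $count_sorted"
```

Against the real library, the corresponding fix is the one described above: pass -n to nm, or put a 'sort' in front of uniq.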

So duplicate elimination is more effective than 10% - it better than
halves the number of symbols, from 60208 to 24533.  It would be
interesting to see how much space is saved - i.e. are shorter
symbols or longer symbols more likely to be duplicated?

If we compiled larger units (packages or libraries, rather than
individual source files that usually contain a single class), then
compile-time duplicate elimination would be more effective, and
link-time duplicate elimination might no longer be worth it.

BTW I looked at the `nm | grep' output a bit closer and there are a
few symbols with the string `_Utf' in their names that aren't
Utf8Consts, eg:

_Z13_Jv_FindClassP13_Jv_Utf8ConstPN4java4lang11ClassLoaderE

grepping for ` _Utf' may be more robust.
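To illustrate the difference between the two patterns, here is a small sketch on made-up nm-style lines. The sample_nm helper, the addresses, the type letters and the `_Utf1`/`_Utf2` names are hypothetical; only the mangled `_Jv_FindClass` name comes from the message above.

```shell
# Hypothetical nm-style lines (address, type, name); NOT real libgcj.so output.
sample_nm() {
  printf '%s\n' \
    '0000a010 r _Utf1' \
    '0000a020 r _Utf2' \
    '0000b000 T _Z13_Jv_FindClassP13_Jv_Utf8ConstPN4java4lang11ClassLoaderE'
}

# Unanchored pattern: also matches "_Utf" in the middle of a mangled name.
loose=$(sample_nm | grep -c '_Utf')

# Leading space: matches only names that begin with "_Utf".
anchored=$(sample_nm | grep -c ' _Utf')

echo "loose matches:    $loose"
echo "anchored matches: $anchored"
```

The leading space works here because nm separates the type letter from the symbol name with a space, so the pattern can only match at the start of a name.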

Yes, I already did that.
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/

