This is the mail archive of the
java@gcc.gnu.org
mailing list for the Java project.
Re: compacting _Jv_Utf8Const
Tom Tromey wrote:
> Per> $ nm libgcj.so|grep ' _Utf'|awk '{print $1}'>/tmp/bar
> Per> $ uniq </tmp/bar|wc
> Per> 54395 54395 489555
> This doesn't measure duplication of contents, only symbol names.
Yes and no. It was meant to measure "post-duplicate-elimination".
The 'print $1' selects out the *addresses*, not names. However,
I forgot to sort by address, which can be done by passing
-n to nm, or by adding an explicit 'sort'. If we do that, we get:
24533 24533 220797
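
The sort matters because uniq only collapses adjacent identical lines. A minimal sketch of the corrected count follows; the function name count_unique_utf_addrs is made up for illustration, and `sort -u` stands in for the separate sort and uniq steps:

```shell
# Count distinct addresses among symbols whose name starts with _Utf.
# Reads nm-style lines ("address type name") on stdin.
# count_unique_utf_addrs is a hypothetical helper name, not part of libgcj.
count_unique_utf_addrs() {
    grep ' _Utf' |      # leading space skips mangled names that merely contain _Utf
    awk '{print $1}' |  # keep only the address column
    sort -u |           # sort, then drop duplicate addresses
    wc -l | tr -d ' '   # count the remaining lines, stripping wc padding
}

# Assumed usage against the library discussed here:
#   nm libgcj.so | count_unique_utf_addrs
```
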
So duplicate elimination is more effective than 10% - it more than
halves the number of symbols, from 60208 to 24533. It would be
interesting to see how much space is saved - i.e. are shorter
symbols or longer symbols more likely to be duplicated?
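
One way to estimate the space question, as a rough sketch only: if nm is asked to print symbol sizes in decimal (GNU nm's -S and -t d options, assuming the _Utf symbols actually carry size information, which is not confirmed here), the sizes can be summed once over all symbols and once over distinct addresses. The function name utf_space_summary is hypothetical.

```shell
# Sum symbol sizes for _Utf symbols, before and after merging duplicates.
# Expects lines of the form "address size type name" with decimal numbers.
# utf_space_summary is a hypothetical helper name.
utf_space_summary() {
    awk 'NF == 4 && $4 ~ /^_Utf/ {
        total += $2                      # bytes counting every symbol
        if (!($1 in seen)) {             # first time this address appears
            seen[$1] = 1
            unique += $2                 # bytes counting each address once
        }
    }
    END { printf "%d %d\n", total, unique }'
}

# Assumed usage (symbols without a size are skipped by the NF == 4 check):
#   nm -S -t d libgcj.so | utf_space_summary
```

The two numbers printed are the total bytes across all matching symbols and the bytes remaining once each address is counted once; their difference approximates what duplicate elimination saves.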
If we compiled larger units (packages or libraries rather than
source files usually with a single class), then compile-time
duplicate elimination would be more effective, and link-time
duplicate elimination might no longer be worth it.
> BTW I looked at the `nm | grep' output a bit closer and there are a
> few symbols with the string `_Utf' in their names that aren't
> Utf8Consts, eg:
> _Z13_Jv_FindClassP13_Jv_Utf8ConstPN4java4lang11ClassLoaderE
> grepping for ` _Utf' may be more robust.
Yes, I already did that.
--
--Per Bothner
per@bothner.com http://per.bothner.com/