This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH, classpath] speed up gen-classlist.sh
- From: Ralf Wildenhues <Ralf dot Wildenhues at gmx dot de>
- To: gcc-patches at gcc dot gnu dot org, classpath-patches at gnu dot org
- Date: Mon, 24 Mar 2008 18:11:06 +0100
- Subject: [PATCH, classpath] speed up gen-classlist.sh
[ I wonder why classpath-patches moderators don't put posters in the
default-allowed list; just one additional click avoids the need to
re-moderate each time; at least that's what we do for libtool lists ]
This patch speeds up gen-classlist.sh, on my system from 2.32s to 0.54s
(best of three, warm cache), and thus also 'make all' in classpath/lib.
The script used a loop:
  for omit; do
    grep -v $omit $list1 >$list2
    mv $list2 $list1
  done
which is bad for three reasons, in this order: it scales with the number
of omissions times the number of list items, it uses regular expressions,
and it forks a lot. A series of greps, as suggested in a FIXME note, will
not solve the scaling, BTW.
However, as most omissions are not regular expressions but plain file
names, we can use an awk hash (similar to what Autoconf 2.62 uses):
  BEGIN {
    omit[""] = 1
    omit["literal/file.java"] = 1
    ...
  }
  {
    if (omit[$3]) next
    if ($3 ~ /file\/regex.*.java$/) next
    ...
    print
  }
to drop all omissions at once, and thus fix all three issues.
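
As a rough, self-contained illustration of the scheme above (not the
generated script itself): the input lines and file names below are made
up, and the only thing taken from the patch is that the file name sits
in the third field.

```shell
#!/bin/sh
# Hypothetical three-field class list; only "file name in $3" is taken
# from the patch, the rest of the layout is an assumption.
result=$(printf '%s\n' \
    'pkg src pkg/Keep.java' \
    'pkg src pkg/Drop.java' \
    'pkg src gen/stub/Foo.java' |
  awk '
    BEGIN {
      omit[""] = 1                  # drop lines without a third field
      omit["pkg/Drop.java"] = 1     # literal omission: one hash lookup
    }
    {
      if (omit[$3]) next
      if ($3 ~ /gen\/stub\/.*\.java$/) next   # regex omission, inlined
      print
    }
  ')
echo "$result"
```

Each input line costs one hash lookup plus one match per remaining
regular expression, in a single awk process.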
Note that old Solaris awk does not understand
  re="regex"
  if ($3 ~ re)
which is why the regular expressions are expanded inline in the main
part of the script. Although it accepts "for (key in hash)", it doesn't
like
  if ($3 in omit)
either. Apart from the side effect of filling the array with every key
that is looked up, using
  if (omit[$3])
works just as well (and costs only a few ms more).
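
The "filling the array" side effect can be seen with a small sketch,
assuming a POSIX-conforming awk (old Solaris awk was not checked here):
merely referencing omit[$3] creates an element for every key looked up.
The input lines are made up.

```shell
#!/bin/sh
# Count the lines kept and the array elements present at the end.
counts=$(printf '%s\n' 'a b one.java' 'a b two.java' 'a b three.java' |
  awk '
    BEGIN { omit["two.java"] = 1 }
    { if (omit[$3]) next; kept++ }        # the lookup creates omit[$3]
    END { n = 0; for (k in omit) n++; print kept, n }
  ')
echo "$counts"    # two lines kept, three array elements
```

The array starts with one element and ends with three, because the two
kept file names were added as empty entries by the lookup.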
This slightly changes the semantics of the *.omit files:
- literal file name detection is heuristic,
- EREs are used instead of BREs.
A casual search of the GCC tree did not turn up any omission files for
which it would matter, though (and if it did, we could anyway fix either
the script to translate the BREs, or the files).
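
To make the BRE/ERE difference concrete, here is a contrived sketch; the
pattern 'a+b' is hypothetical and not taken from any omit file. In a BRE
a bare '+' is a literal character, in an ERE it is a repetition operator,
so the same expression omits different lines.

```shell
#!/bin/sh
input=$(printf '%s\n' 'a+b' 'aab')

# Old behaviour: grep treats the pattern as a BRE, so '+' is literal.
bre_kept=$(printf '%s\n' "$input" | grep -v 'a+b')

# New behaviour: awk treats it as an ERE, so 'a+' means one or more 'a'.
ere_kept=$(printf '%s\n' "$input" | awk '$0 !~ /a+b/')

echo "BRE keeps: $bre_kept"
echo "ERE keeps: $ere_kept"
```

Here grep keeps 'aab' while awk keeps 'a+b', which is the kind of
discrepancy an omit file would need to contain for the change to matter.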
AFAICS classpath so far did not depend on @AWK@ (except for tags rules).
Should I enable the AC_PROG_AWK in configure.ac, even though
AM_INIT_AUTOMAKE already pulls it in?
As an aside, I assume it is on purpose that the genclasses rule in
classpath/lib/Makefile.am is run upon each make invocation, no?
Tested i686-unknown-linux-gnu, and tried the awk script on Solaris 2.6.
OK for trunk?
Thanks for reading this far,
Ralf
libjava/classpath/ChangeLog:
2008-03-24 Ralf Wildenhues <Ralf.Wildenhues@gmx.de>
* lib/gen-classlist.sh.in: Avoid grepping each omission, by
building an awk script with a hash for literal files, and
awk regular expressions for the rest.
diff --git a/libjava/classpath/lib/gen-classlist.sh.in b/libjava/classpath/lib/gen-classlist.sh.in
index 1c70411..1768c15 100755
--- a/libjava/classpath/lib/gen-classlist.sh.in
+++ b/libjava/classpath/lib/gen-classlist.sh.in
@@ -82,26 +82,48 @@ for dir in $vm_dirlist; do
fi
done
-# FIXME: could be more efficient by constructing a series of greps.
-for filexp in `cat tmp.omit`; do
- grep -v ${filexp} < ${top_builddir}/lib/classes.1 > ${top_builddir}/lib/classes.tmp
- mv ${top_builddir}/lib/classes.tmp ${top_builddir}/lib/classes.1
-done
+# Mangle the omit expressions into a script suitable for old awk.
+# Exploit the fact that many omissions are not regular expressions:
+# assume a single file is listed if it does not contain '*', '$',
+# and ends in '.java'.
+sed_omit_hash='
+1i\
+ BEGIN {\
+ omit[""] = 1
+$a\
+ }
+/[*$]/d
+/\.java$/!d
+s|^| omit["|
+s|$|"] = 1|'
+sed_omit_main_loop='
+1i\
+ {\
+ if (omit[$3]) next
+$a\
+ print\
+ }
+/^[^*$]*\.java$/d
+s|/|\\/|g
+s|^| if ($3 ~ /|
+s|$|/) next|'
+sed "$sed_omit_hash" <tmp.omit >tmp.awk
+sed "$sed_omit_main_loop" <tmp.omit >>tmp.awk
+@AWK@ -f tmp.awk < ${top_builddir}/lib/classes.1 > ${top_builddir}/lib/classes.tmp
+mv ${top_builddir}/lib/classes.tmp ${top_builddir}/lib/classes.1
+vm_omitlist=
for dir in $vm_dirlist; do
if test -f $dir/$1.omit; then
- for filexp in `cat $dir/$1.omit`; do
- grep -v $filexp < vm.add > vm.add.1
- mv vm.add.1 vm.add
- done
+ vm_omitlist="$vm_omitlist $dir/$1.omit"
fi
done
-cat vm.add >> classes.1
+cat $vm_omitlist | sed "$sed_omit_hash" > tmp.awk
+cat $vm_omitlist | sed "$sed_omit_main_loop" >> tmp.awk
+@AWK@ -f tmp.awk < vm.add >>${top_builddir}/lib/classes.1
-rm vm.omit
-rm vm.add
-rm tmp.omit
+rm -f vm.omit vm.add tmp.omit tmp.awk
new=
if test -f ${top_builddir}/lib/classes.2; then