This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH, classpath] speed up gen-classlist.sh


[ I wonder why classpath-patches moderators don't put posters in the
  default-allowed list; just one additional click avoids the need to
  re-moderate each time; at least that's what we do for libtool lists ]


This patch speeds up gen-classlist.sh, on my system from 2.32s to 0.54s
(best of three, warm cache), and thus also 'make all' in classpath/lib.

The script used a loop:
  for omit; do
     grep -v $omit $list1 >$list2
     mv $list2 $list1
  done

which is bad for three reasons, in decreasing order of importance: its
cost scales with the number of omissions times the number of list items,
it uses regular expressions, and it forks a lot.  A series of greps, as
suggested in a FIXME note, will not solve the scaling, BTW.

However, as most omissions are not regular expressions but plain file
names, we can use an awk hash (similar to what Autoconf 2.62 uses):

  BEGIN {
    omit[""] = 1
    omit["literal/file.java"] = 1
    ...
  }
  {
    if (omit[$3]) next
    if ($3 ~ /file\/regex.*.java$/) next
    ...
    print
  }

to drop all omissions at once, and thus fix all three issues.
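
As a minimal, self-contained sketch of that filter: the sample list lines
and file names below are made up for illustration, and the only thing
assumed about the class list is that the file name is the third
whitespace-separated field, as the use of $3 implies.

```shell
#!/bin/sh
# Sketch only: filter_classes and all file names are hypothetical.
filter_classes() {
  awk '
  BEGIN {
    # Literal omissions become hash keys: one lookup per input line.
    omit["gnu/foo/Literal.java"] = 1
  }
  {
    if (omit[$3]) next
    # Remaining omissions stay regular expressions, written inline.
    if ($3 ~ /gnu\/bar\/.*\.java$/) next
    print
  }'
}

printf '%s\n' \
  'pkg dir gnu/foo/Literal.java' \
  'pkg dir gnu/foo/Kept.java' \
  'pkg dir gnu/bar/Dropped.java' | filter_classes
```

Only the line for gnu/foo/Kept.java should survive; the other two are
dropped by the hash lookup and by the inline regex, respectively, in a
single pass over the list.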

Note that old Solaris awk does not understand
  re="regex"
  if ($3 ~ re)

which is why the regular expressions are expanded inline in the main
part of the script.  Although it accepts "for (key in hash)", it does
not like
  if ($3 in omit)

either.  Apart from filling the array with an empty entry for each key
that is looked up, using
  if (omit[$3])

works just as well (and costs only a few ms more).
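
A small sketch of that side effect, using made-up file names and a
hypothetical helper lookup_demo: every lookup through omit[...] creates
an entry, so the array ends up holding all keys seen, not just the
omitted ones.

```shell
#!/bin/sh
# Sketch only: lookup_demo and the file names are hypothetical.
lookup_demo() {
  awk '
  BEGIN { omit["a/B.java"] = 1 }
  {
    # An unset element evaluates as false, so this acts as a membership
    # test, but the reference itself creates an empty omit[$1] entry.
    if (omit[$1]) {
      print "omit " $1
    } else {
      print "keep " $1
    }
  }
  END {
    n = 0
    for (key in omit) n++
    print "entries " n
  }'
}

printf '%s\n' a/B.java a/C.java a/D.java | lookup_demo
```

The final count covers all three names although only one was entered in
BEGIN, which is harmless here because the empty entries still test as
false.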

This slightly changes the semantics of the *.omit files:
- literal file name detection is heuristic,
- EREs are used instead of BREs.
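
To illustrate the heuristic, here is a sketch that mirrors the rule from
the patch (an entry is taken as a literal file name if it contains
neither '*' nor '$' and ends in '.java'). The helper classify_omit and
the sample entries are hypothetical and not part of the patch.

```shell
#!/bin/sh
# Sketch only: classify_omit is a hypothetical helper.
classify_omit() {
  # Reads omit entries on stdin and prefixes each with its class.
  while IFS= read -r entry; do
    case $entry in
      *'*'*|*'$'*) echo "regex $entry" ;;    # contains '*' or '$'
      *.java)      echo "literal $entry" ;;  # plain name ending in .java
      *)           echo "regex $entry" ;;    # e.g. a directory prefix
    esac
  done
}

printf '%s\n' 'gnu/foo/Bar.java' 'gnu/foo/.*Peer.java' 'gnu/baz/' |
  classify_omit
```

An entry classed as literal is then compared against the whole file name
field, whereas the old grep treated the same entry as a pattern that
could match anywhere in the line.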

A casual search of the GCC tree did not turn up any omission files for
which it would matter, though (and if it did, we could still fix either
the script, to translate the BREs, or the files).
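
For reference, a sketch of the kind of entry where the BRE/ERE switch
would matter. The pattern and directory names are made up, and grep -E
stands in here for awk's ERE matching: '+' is an ordinary character in a
BRE but a quantifier in an ERE.

```shell
#!/bin/sh
# Sketch only: matches_bre/matches_ere and all names are hypothetical.
# Each returns success if string $2 matches pattern $1.
matches_bre() { printf '%s\n' "$2" | grep -e "$1" >/dev/null; }
matches_ere() { printf '%s\n' "$2" | grep -E -e "$1" >/dev/null; }

pattern='x+y/.*\.java$'
for name in 'gnu/x+y/Foo.java' 'gnu/xxy/Foo.java'; do
  if matches_bre "$pattern" "$name"; then echo "BRE matches $name"; fi
  if matches_ere "$pattern" "$name"; then echo "ERE matches $name"; fi
done
```

With this pattern the BRE matches only the name containing a literal
"x+y", while the ERE matches only the one containing "xxy". The same
applies to '?', '|', '(' and '{' in an omit entry.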

AFAICS classpath so far did not depend on @AWK@ (except for tags rules).
Should I enable the AC_PROG_AWK in configure.ac, even though
AM_INIT_AUTOMAKE already pulls it in?

As an aside, I assume it is on purpose that the genclasses rule in
classpath/lib/Makefile.am is run upon each make invocation, no?

Tested i686-unknown-linux-gnu, and tried the awk script on Solaris 2.6.
OK for trunk?

Thanks for reading this far,
Ralf

libjava/classpath/ChangeLog:
2008-03-24  Ralf Wildenhues  <Ralf.Wildenhues@gmx.de>

	* lib/gen-classlist.sh.in: Avoid grepping each omission, by
	building an awk script with a hash for literal files, and
	awk regular expressions for the rest.

diff --git a/libjava/classpath/lib/gen-classlist.sh.in b/libjava/classpath/lib/gen-classlist.sh.in
index 1c70411..1768c15 100755
--- a/libjava/classpath/lib/gen-classlist.sh.in
+++ b/libjava/classpath/lib/gen-classlist.sh.in
@@ -82,26 +82,48 @@ for dir in $vm_dirlist; do
    fi
 done
 
-# FIXME: could be more efficient by constructing a series of greps.
-for filexp in `cat tmp.omit`; do
-   grep -v ${filexp} < ${top_builddir}/lib/classes.1 > ${top_builddir}/lib/classes.tmp
-   mv ${top_builddir}/lib/classes.tmp ${top_builddir}/lib/classes.1
-done
+# Mangle the omit expressions into a script suitable for old awk.
+# Exploit the fact that many omissions are not regular expressions:
+# assume a single file is listed if it does not contain '*', '$',
+# and ends in '.java'.
+sed_omit_hash='
+1i\
+  BEGIN {\
+    omit[""] = 1
+$a\
+  }
+/[*$]/d
+/\.java$/!d
+s|^|    omit["|
+s|$|"] = 1|'
+sed_omit_main_loop='
+1i\
+  {\
+    if (omit[$3]) next
+$a\
+    print\
+  }
+/^[^*$]*\.java$/d
+s|/|\\/|g
+s|^|  if ($3 ~ /|
+s|$|/) next|'
 
+sed "$sed_omit_hash" <tmp.omit >tmp.awk
+sed "$sed_omit_main_loop" <tmp.omit >>tmp.awk
+@AWK@ -f tmp.awk < ${top_builddir}/lib/classes.1 > ${top_builddir}/lib/classes.tmp
+mv ${top_builddir}/lib/classes.tmp ${top_builddir}/lib/classes.1
 
+vm_omitlist=
 for dir in $vm_dirlist; do
    if test -f $dir/$1.omit; then
-      for filexp in `cat $dir/$1.omit`; do
-	 grep -v $filexp < vm.add > vm.add.1
-	 mv vm.add.1 vm.add
-      done
+      vm_omitlist="$vm_omitlist $dir/$1.omit"
    fi
 done
-cat vm.add >> classes.1
+cat $vm_omitlist | sed "$sed_omit_hash" > tmp.awk
+cat $vm_omitlist | sed "$sed_omit_main_loop" >> tmp.awk
+@AWK@ -f tmp.awk < vm.add >>${top_builddir}/lib/classes.1
 
-rm vm.omit
-rm vm.add
-rm tmp.omit
+rm -f vm.omit vm.add tmp.omit tmp.awk
 
 new=
 if test -f ${top_builddir}/lib/classes.2; then

