This is the mail archive of the
mailing list for the GCC project.
Comments on mudflap
- From: "Doug Graham" <dgraham at nortelnetworks dot com>
- To: gcc at gcc dot gnu dot org
- Date: Mon, 7 Jun 2004 14:07:08 -0400
- Subject: Comments on mudflap
Frank Eigler suggested that I send this to the list. His comments
follow my original note.
I've been experimenting with mudflap from the gcc-3.5-20040530 snapshot,
and I've got a couple of comments that you might be interested in.
I've previously been using Richard Jones' bounds checking patches (the
ones currently maintained by Herman ten Brugge), with some tweaks that
I've added to make them work with VxWorks. However, these don't support
C++, so I was looking into using mudflap instead.
As far as I can tell after reading over your paper from the GCC 2003
summit, mudflap is very similar in operation to the Jones bounds-checker,
at least in its runtime behaviour. Both make calls to a runtime library
to register/unregister objects, and both make calls to a check routine to
validate potentially dangerous operations. The Jones patches appear to
make a lot more check calls to the runtime library, because they check
all pointer arithmetic, which I don't think mudflap currently does.
Mudflap also has an inline lookup cache to reduce the number of calls
required to the runtime; Jones doesn't have this.
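
To make the shared design concrete, here is a minimal sketch of a register/unregister/check runtime with a small lookup cache in front of the slow path. Every name in it (register_object, unregister_object, check_access, slow_path_calls) is hypothetical and is not the libmudflap or Jones API. A linear scan stands in for the tree that a real runtime would search, and the cache here is consulted inside the check routine rather than inlined at the access site.

```c
#include <stddef.h>
#include <stdint.h>

/* All names are hypothetical; this is not the libmudflap or Jones API. */
#define MAX_OBJECTS 64u
#define CACHE_SLOTS 16u   /* must be a power of two */
#define CACHE_SHIFT 4

struct object      { uintptr_t low, high; int live;  };  /* [low, high], inclusive */
struct cache_entry { uintptr_t low, high; int valid; };

struct object objects[MAX_OBJECTS];
struct cache_entry cache[CACHE_SLOTS];
unsigned long slow_path_calls;   /* how often the cache missed */

static size_t cache_slot(uintptr_t addr)
{
    return (size_t)((addr >> CACHE_SHIFT) & (CACHE_SLOTS - 1u));
}

/* Record a new live object; returns 0 on success, -1 on failure. */
int register_object(const void *ptr, size_t size)
{
    if (size == 0)
        return -1;
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (!objects[i].live) {
            objects[i].low = (uintptr_t)ptr;
            objects[i].high = (uintptr_t)ptr + size - 1;
            objects[i].live = 1;
            return 0;
        }
    }
    return -1;   /* table full */
}

/* Forget an object and flush the cache so stale bounds cannot hit. */
int unregister_object(const void *ptr)
{
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (objects[i].live && objects[i].low == (uintptr_t)ptr) {
            objects[i].live = 0;
            for (size_t s = 0; s < CACHE_SLOTS; s++)
                cache[s].valid = 0;
            return 0;
        }
    }
    return -1;   /* not registered */
}

/* Returns 1 if [ptr, ptr + size) lies inside one live object, else 0.
   Zero-sized accesses are treated as invalid in this sketch. */
int check_access(const void *ptr, size_t size)
{
    uintptr_t lo = (uintptr_t)ptr;
    uintptr_t hi = lo + size - 1;
    if (size == 0 || hi < lo)
        return 0;

    struct cache_entry *e = &cache[cache_slot(lo)];
    if (e->valid && e->low <= lo && hi <= e->high)
        return 1;                      /* fast path: cache hit */

    slow_path_calls++;                 /* slow path: search the registry */
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (objects[i].live && objects[i].low <= lo && hi <= objects[i].high) {
            e->low = objects[i].low;   /* remember the bounds for next time */
            e->high = objects[i].high;
            e->valid = 1;
            return 1;
        }
    }
    return 0;                          /* no live object covers the access */
}
```

Repeated accesses to the same object are answered from the cache without touching the registry, so the cost of the slow-path search only matters on a miss.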
Those latter two factors ought to mean that mudflap adds less runtime
overhead than the Jones patches, but I'm finding the opposite on the test
program that I'm using for benchmarking. The program is a STABS parser,
and I'm running it on about 30MB of STABS information. Here's the output
of time(1) for various runs. The test programs are statically linked
and compiled with "-O2 -pg".
gcc3.5:                      0.54s user  0.03s system  100% cpu   0.569 total
gcc3.5 -fmudflap:           39.56s user  0.56s system   99% cpu  40.438 total
gcc3.2.1 -fbounds-checking:  8.76s user  0.17s system  100% cpu   8.930 total
Gprof shows that the mudflap run spent 60% of its time in 565 calls to
__mf_age_tree (excluding recursive calls), and indeed, setting -age-tree
to a very large number causes total CPU usage to drop from 40s to 30s.
Another 25% of the CPU was spent in __mf_find_objects_rec.
Based on these numbers, it looks to me as though the splay tree scheme
used in the Jones patches is quite a bit more efficient than the scheme
used by mudflap, at least for this particular benchmark.
The mudflap -collect-stats option prints this:
calls to __mf_check: 7055903 rot: 3814640/895326
__mf_register: 2478597 [32774B, 17537153B, 57253B, 15945517B, 47484B]
__mf_unregister: 1832560 [15996862B]
__mf_violation: [0, 0, 0, 0, 1]
calls with reentrancy: 748613
lookup cache slots used: 1024 unused: 0 peak-reuse: 330646
number of live objects: 646038
zombie objects: 204
The bounds-checking statistics look like this:
Calls to push, pop, param function: 4849919, 4849918, 2
Calls to add, delete stack: 6355484, 6355481
Calls to add, delete heap: 697272, 51349
Calls to check pointer +/- integer: 2430879
Calls to check array references: 4064610
Calls to check pointer differences: 911088
Calls to check object references: 84148982
Calls to check component references: 3851614
Calls to check truth, falsity of pointers: 3970940, 509519
Calls to check <, >, <=, >= of pointers: 0
Calls to check ==, != of pointers: 7943034
Calls to check p++, ++p, p--, --p: 38543835, 1368, 0, 0
References to unchecked static, stack: 0, 0
If I'm reading this right, this means that there were about 7M
calls to the mudflap check routine __mf_check, about 80M calls to the
bounds-checking object dereference checker, and about 40M calls to the
bounds-checking pointer-increment checker. So there are about 20 times
as many calls to the bounds-checking runtime, yet the mudflap runtime
used about 5 times as much CPU.
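
For reference, the arithmetic behind those rounded ratios can be worked through with the exact counts from the two statistics dumps and the user times reported by time(1). The sketch below charges all user time to the check calls, which is a deliberate simplification: it ignores the program's own work (about 0.54s uninstrumented) and the cost of register/unregister calls, so the per-check figures are upper bounds rather than measurements. The function and constant names are made up for illustration.

```c
#include <stdio.h>

/* Figures copied from the time(1) output and the statistics quoted above. */
const double MUDFLAP_USER_S   = 39.56;
const double BOUNDS_USER_S    = 8.76;
const double MUDFLAP_CHECKS   = 7055903.0;    /* calls to __mf_check */
const double BOUNDS_OBJ_REFS  = 84148982.0;   /* object reference checks */
const double BOUNDS_POST_INCR = 38543835.0;   /* p++ checks */

/* The two dominant bounds-checking counters, added together. */
double bounds_checks(void)
{
    return BOUNDS_OBJ_REFS + BOUNDS_POST_INCR;
}

/* How many bounds-checking calls there were per mudflap check call. */
double call_ratio(void)
{
    return bounds_checks() / MUDFLAP_CHECKS;
}

/* How much more user CPU the mudflap run consumed. */
double cpu_ratio(void)
{
    return MUDFLAP_USER_S / BOUNDS_USER_S;
}

/* User time divided by check calls, in microseconds (an upper bound). */
double user_us_per_check(double user_seconds, double checks)
{
    return user_seconds * 1e6 / checks;
}

void print_summary(void)
{
    printf("call ratio (bounds/mudflap): %.1f\n", call_ratio());
    printf("cpu ratio (mudflap/bounds):  %.1f\n", cpu_ratio());
    printf("mudflap us per check:        %.2f\n",
           user_us_per_check(MUDFLAP_USER_S, MUDFLAP_CHECKS));
    printf("bounds us per check:         %.3f\n",
           user_us_per_check(BOUNDS_USER_S, bounds_checks()));
}
```

With these inputs the call ratio comes out a little over 17 and the CPU ratio at about 4.5, so under this crude accounting each mudflap check costs roughly 5.6 microseconds of user time against roughly 0.07 microseconds for each bounds-checking call.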
One other minor thing. On the Fedora Core 1 Linux system that I'm
using for testing (with GLIBC 2.3.2), I get a mudflap violation every
time I use one of the ctype routines (isdigit, isupper, islower, etc.).
I assume that's because there is some array in the library that is
accessed by the ctype macros, but which hasn't been registered with
the mudflap runtime. For the tests above, I've been using my own ctype
replacement so as to avoid this problem.
From: "Frank Ch. Eigler" <firstname.lastname@example.org>
Hi, Doug -
Thanks for your comments. Please feel free to resend your note and my
reply to email@example.com for the benefit of other developers.
Indeed, the libmudflap runtime is not well optimized yet, and help or
suggestions are welcome. On the instrumentation side at least, mudflap
is bound to perform better, and also has more room to improve.
With respect to ctype, yes, these types of static arrays used by
libc macros pose a problem. One solution would require adding a few
autoconf-sensitive lines to mf-runtime.c to register these during
initialization, much as environment variables and std* FILE objects are.
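
As a rough illustration of that idea, and not the actual mf-runtime.c change, the sketch below registers the glibc ctype tables at startup. It assumes glibc's layout, in which __ctype_b_loc(), __ctype_tolower_loc() and __ctype_toupper_loc() each point into a 384-entry table that can be indexed from -128 to 255. The register_static function and the regions array are hypothetical stand-ins for the runtime's own registration call, and the __GLIBC__ test stands in for a proper autoconf check; on other C libraries the function registers nothing.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the checker's object registry. */
#define MAX_REGIONS 8
struct region { const void *base; size_t size; const char *name; };
struct region regions[MAX_REGIONS];
int region_count;

/* Hypothetical stand-in for the runtime's "register a static object" call. */
void register_static(const void *base, size_t size, const char *name)
{
    if (region_count < MAX_REGIONS) {
        regions[region_count].base = base;
        regions[region_count].size = size;
        regions[region_count].name = name;
        region_count++;
    }
}

/* Register the tables behind the ctype macros; returns how many were added. */
int register_ctype_tables(void)
{
    int before = region_count;
#if defined(__GLIBC__)   /* assumption: a real build would use an autoconf test */
    enum { LOWEST_INDEX = -128, ENTRIES = 384 };
    const unsigned short *b  = *__ctype_b_loc();
    const int32_t *lower     = *__ctype_tolower_loc();
    const int32_t *upper     = *__ctype_toupper_loc();

    /* Each pointer refers to index 0; the table itself starts 128 entries earlier. */
    register_static(b + LOWEST_INDEX,     ENTRIES * sizeof *b,     "ctype_b");
    register_static(lower + LOWEST_INDEX, ENTRIES * sizeof *lower, "ctype_tolower");
    register_static(upper + LOWEST_INDEX, ENTRIES * sizeof *upper, "ctype_toupper");
#endif
    return region_count - before;
}
```

Because glibc keeps these table pointers per thread and can change them when the locale changes, a one-time registration at initialization is only a first approximation.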