This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Comments on mudflap

From: "Doug Graham" <dgraham at nortelnetworks dot com>
To: gcc at gcc dot gnu dot org
Date: Mon, 7 Jun 2004 14:07:08 -0400
Subject: Comments on mudflap

Frank Eigler suggested that I send this to the list.  His comments
follow mine.

I've been experimenting with mudflap from the gcc-3.5-20040530 snapshot,
and I've got a couple of comments that you might be interested in.
I've previously been using Richard Jones' bounds checking patches (the
ones currently maintained by Herman ten Brugge), with some tweaks that
I've added to make them work with VxWorks.  However, these don't support
C++, so I was looking into using mudflap instead.

As far as I can tell after reading over your paper from the GCC 2003
summit, mudflap is very similar in operation to the Jones bounds-checker,
at least in its runtime behaviour.  Both make calls to a runtime library
to register/unregister objects, and both make calls to a check routine to
validate potentially dangerous operations.  The Jones patches appear to
make a lot more check calls to the runtime library, because they check
all pointer arithmetic, which I don't think mudflap currently does.
Mudflap also has an inline lookup cache to reduce the number of calls
required to the runtime; Jones doesn't have this.
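
To illustrate the register/check/lookup-cache arrangement described above,
here is a minimal sketch.  It is not libmudflap's code: every name in it
(register_object, check_access, the cache layout, the linear registry that
stands in for the real object tree) is an assumption made for illustration.
A check first probes a small direct-mapped cache and only falls back to the
slower registry search on a miss.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch only; none of these names come from libmudflap. */
#define CACHE_SLOTS 1024   /* power of two, so masking selects a slot */
#define CACHE_SHIFT 4      /* ignore the low address bits when hashing */
#define MAX_OBJECTS 64

struct region { uintptr_t low, high; };              /* inclusive bounds */
struct cache_entry { uintptr_t low, high; int valid; };

static struct region objects[MAX_OBJECTS];  /* toy registry (linear scan) */
static size_t n_objects;
static struct cache_entry cache[CACHE_SLOTS];
static unsigned long slow_path_calls;       /* counts cache misses */

/* Record a live object so later accesses to it can be validated. */
static void register_object(const void *ptr, size_t size)
{
    assert(size > 0 && n_objects < MAX_OBJECTS);
    objects[n_objects].low = (uintptr_t)ptr;
    objects[n_objects].high = (uintptr_t)ptr + size - 1;
    n_objects++;
}

/* Forget an object and drop any cache entries that still describe it. */
static void unregister_object(const void *ptr)
{
    uintptr_t low = (uintptr_t)ptr;
    for (size_t i = 0; i < n_objects; i++) {
        if (objects[i].low == low) {
            objects[i] = objects[--n_objects];
            break;
        }
    }
    for (size_t s = 0; s < CACHE_SLOTS; s++)
        if (cache[s].valid && cache[s].low == low)
            cache[s].valid = 0;
}

/* Slow path: search the registry and refill the cache slot on success. */
static int check_slow(uintptr_t low, uintptr_t high, size_t slot)
{
    slow_path_calls++;
    for (size_t i = 0; i < n_objects; i++) {
        if (objects[i].low <= low && high <= objects[i].high) {
            cache[slot].low = objects[i].low;
            cache[slot].high = objects[i].high;
            cache[slot].valid = 1;
            return 1;
        }
    }
    return 0;   /* no registered object covers the access */
}

/* Fast path: the part an instrumenting compiler could emit inline. */
static int check_access(const void *ptr, size_t size)
{
    uintptr_t low = (uintptr_t)ptr;
    uintptr_t high = low + size - 1;
    size_t slot = (low >> CACHE_SHIFT) & (CACHE_SLOTS - 1);
    const struct cache_entry *e = &cache[slot];

    if (e->valid && e->low <= low && high <= e->high)
        return 1;                        /* hit: no runtime call needed */
    return check_slow(low, high, slot);  /* miss: consult the registry */
}
```

In this sketch, repeated accesses to the same object cost only the inline
comparison, so the cost of the slow-path lookup matters mainly when the
cache misses often.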

Those latter two factors ought to mean that mudflap adds less runtime
overhead than the Jones patches, but I'm finding the opposite on the test
program that I'm using for benchmarking.  The program is a STABS parser,
and I'm running it on about 30MB of STABS information.  Here's the output
of time(1) for various runs.  The test programs are statically linked
and compiled with "-O2 -pg".

gcc3.5:                      0.54s user 0.03s system 100% cpu  0.569 total
gcc3.5   -fmudflap:         39.56s user 0.56s system  99% cpu 40.438 total
gcc3.2.1 -fbounds-checking:  8.76s user 0.17s system 100% cpu  8.930 total

Gprof shows that the mudflap run spent 60% of its time in 565 calls to
__mf_age_tree (excluding recursive calls), and indeed, setting -age-tree
to a very large number causes total CPU usage to drop from 40s to 30s.
Another 25% of the CPU was spent in __mf_find_objects_rec.

Based on these numbers, it looks to me as though the splay tree scheme
used in the Jones patches is quite a bit more efficient than the scheme
used by mudflap, at least for this particular benchmark.

The mudflap -collect-stats option prints this:

  calls to __mf_check: 7055903 rot: 3814640/895326
         __mf_register: 2478597 [32774B, 17537153B, 57253B, 15945517B, 47484B]
         __mf_unregister: 1832560 [15996862B]
         __mf_violation: [0, 0, 0, 0, 1]
  calls with reentrancy: 748613
  lookup cache slots used: 1024  unused: 0  peak-reuse: 330646
  number of live objects: 646038
          zombie objects: 204

The bounds-checking statistics look like this:

  Calls to push, pop, param function:        4849919, 4849918, 2
  Calls to add, delete stack:                6355484, 6355481
  Calls to add, delete heap:                 697272, 51349
  Calls to check pointer +/- integer:        2430879
  Calls to check array references:           4064610
  Calls to check pointer differences:        911088
  Calls to check object references:          84148982
  Calls to check component references:       3851614
  Calls to check truth, falsity of pointers: 3970940, 509519
  Calls to check <, >, <=, >= of pointers:   0
  Calls to check ==, != of pointers:         7943034
  Calls to check p++, ++p, p--, --p:         38543835, 1368, 0, 0
  References to unchecked static, stack:     0, 0

If I'm reading this right, this means that there were about 7M
calls to the mudflap check routine __mf_check, about 80M calls to the
bounds-checking object dereference checker, and about 40M calls to the
bounds-checking pointer-increment checker.  So there are about 20 times
as many calls to the bounds-checking runtime, yet the mudflap runtime
used about 5 times as much CPU.
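
The arithmetic behind that comparison can be sketched as below, using the
user times and call counts quoted above.  This is a rough estimate under a
loud assumption: it charges all user CPU time to the counted runtime calls,
ignoring the program's own work and the other kinds of calls in the
statistics.  The struct and function names are illustrative only.

```c
#include <assert.h>

/* One benchmark run: user CPU seconds and the runtime calls counted. */
struct run {
    double user_seconds;
    double runtime_calls;
};

/* Figures taken from the time(1) output and statistics in this message. */
static const struct run mudflap_run = {
    39.56,
    7055903.0                           /* calls to __mf_check */
};
static const struct run bounds_run = {
    8.76,
    84148982.0 + 38543835.0 + 1368.0    /* object refs + p++ and ++p */
};

static double ratio(double a, double b)
{
    assert(b > 0.0);
    return a / b;
}

/* How many times more calls the bounds-checking runtime received. */
static double call_ratio(void)
{
    return ratio(bounds_run.runtime_calls, mudflap_run.runtime_calls);
}

/* How many times more user CPU the mudflap build consumed. */
static double cpu_ratio(void)
{
    return ratio(mudflap_run.user_seconds, bounds_run.user_seconds);
}

/* Crude per-call cost of mudflap relative to bounds checking. */
static double per_call_cost_ratio(void)
{
    double mudflap_cost = ratio(mudflap_run.user_seconds,
                                mudflap_run.runtime_calls);
    double bounds_cost = ratio(bounds_run.user_seconds,
                               bounds_run.runtime_calls);
    return ratio(mudflap_cost, bounds_cost);
}
```

Under that assumption the per-call ratio is simply the product of the other
two, which is why a modest difference in total CPU hides a much larger
difference in the cost of each runtime call.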

One other minor thing.  On the Fedora Core 1 Linux system that I'm
using for testing (with GLIBC 2.3.2), I get a mudflap violation every
time I use one of the ctype routines (isdigit, isupper, islower, etc.).
I assume that's because there is some array in the library that is
accessed by the ctype macros, but which hasn't been registered with
the mudflap runtime.  For the tests above, I've been using my own ctype
replacement so as to avoid this problem.
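
A minimal sketch of the kind of code involved follows.  The function name
count_digits is hypothetical, and whether it actually triggers a violation
is an assumption that depends on the libc and on how isdigit is implemented
there.  On the system described above, a caller built with -fmudflap and
linked with -lmudflap would be expected to report a violation at the
isdigit call.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Count the decimal digits in a NUL-terminated string.  Where isdigit is
   a macro, each call may index a table that lives inside libc. */
static size_t count_digits(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* The cast avoids passing a negative value to isdigit. */
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}
```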

Regards,
Doug.

--------------------

From: "Frank Ch. Eigler" <fche@redhat.com>

Hi, Doug -

Thanks for your comments.  Please feel free to resend your note and my
reply to gcc@gcc.gnu.org for the benefit of other developers.

Indeed, the libmudflap runtime is not well optimized yet, and help or
suggestions are welcome.  On the instrumentation side at least, mudflap
is bound to perform better, and also has more room to improve.

With respect to ctype, yes, these types of static arrays used by
libc macros pose a problem.  One solution would require adding a few
autoconf-sensitive lines to mf-runtime.c to register these during
initialization much as environment variables and std* FILE objects are.
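
A sketch of what such a registration might look like, under several
assumptions: register_static_region is a hypothetical stand-in for the
runtime's own registration entry point, the __GLIBC__ test stands in for a
proper autoconf check, and the table layout (384 entries, indexable from
-128 to 255) is what glibc's ctype.h describes for the array returned by
__ctype_b_loc.  Other C libraries would need their own cases.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical record of a statically allocated region known to the
   checker; a real runtime would keep this in its object database. */
struct static_region {
    const void *base;
    size_t size;
    const char *name;
};

static struct static_region regions[8];
static size_t n_regions;

/* Hypothetical stand-in for the runtime's registration call. */
static void register_static_region(const void *base, size_t size,
                                   const char *name)
{
    assert(n_regions < sizeof regions / sizeof regions[0]);
    regions[n_regions].base = base;
    regions[n_regions].size = size;
    regions[n_regions].name = name;
    n_regions++;
}

/* Register the classification table behind the ctype macros, so that
   accesses made through isdigit and friends fall inside a known object. */
static void register_ctype_tables(void)
{
#if defined(__GLIBC__)
    /* Assumed layout: the returned pointer addresses entry 0 of a
       384-entry table whose first entry corresponds to index -128. */
    const unsigned short *table = *__ctype_b_loc();
    register_static_region(table - 128, 384 * sizeof *table,
                           "ctype classification table");
#endif
}
```

The tables used by tolower and toupper would need the same treatment; they
are left out here to keep the sketch short.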

- FChE

Follow-Ups:
- Re: Comments on mudflap
  - From: Eyal Lebedinsky
