This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[wwwdocs] Cleanup of "Open projects" 1/n


Hi,

This is a first step towards cleaning up our "Open projects" web
pages in projects/, starting with the "Work in progress" section.

Let's start with removing some obviously dead stuff:
- Bounded pointer checking "gcc.gnu.org/projects/bp/".  The
  line "Project Status (updated 2000-08-11)" should tell you
  enough.  My understanding is that this work has been subsumed
  by mudflap, but even if it has not been, this is Really Old
  Stuff that needs to go away.
- Value range propagation pass "which isn't yet in GCC" and
  never will be.  There is a newer vrp.c on the hammer branch
  and Diego wants a VRP pass for tree-ssa.
- Automaton based pipeline hazard recognizer.  This is not
  work in progress, it's finished.

OK?

Gr.
Steven



Index: index.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/index.html,v
retrieving revision 1.45
diff -c -3 -p -r1.45 index.html
*** index.html	4 Jul 2004 20:13:53 -0000	1.45
--- index.html	6 Nov 2004 12:18:20 -0000
***************
*** 11,24 ****
  <li><a href="#projects_for_beginner_gcc_hackers">Projects for beginner GCC hackers</a></li>
  <li><a href="#projects_for_the_c_preprocessor">Projects for the C preprocessor</a></li>
  <li><a href="#projects_for_the_gcc_web_pages">Projects for the GCC web pages</a></li>
- <li><a href="#work_in_progress">Work in progress</a>
- <ul>
  <li><a href="#development_branches">Development Branches</a></li>
- <li><a href="#bounds_checking_with_bounded_pointers">Bounds Checking with Bounded Pointers</a></li>
- <li><a href="#value_range_propagation_pass">Value range propagation pass</a></li>
- <li><a href="#automaton_based_pipeline_hazard_recognizer">Automaton based pipeline hazard recognizer</a></li>
  <li><a href="#data_prefetch">Data prefetch support</a></li>
- </ul></li>
  <li><a href="#optimizer_inadequacies">Optimizer inadequacies</a></li>
  <li><a href="#ia64_projects">Projects to improve performance on IA-64</a></li>
  <li><a href="#changes_to_support_c99_standard">Changes to support C99 standard</a></li>
--- 11,18 ----
*************** href="cpplib.html">C preprocessor</a>.</
*** 95,124 ****
  <p>There is a separate projects list for the <a href="web.html">web
  pages</a>.</p>
  
! <h2><a name="work_in_progress">Work in progress</a></h2>
! <p>Different projects that are in progress.</p>
! 
! <h3><a name="development_branches">Development Branches</a></h3>
  <p>There are several <a href="../cvs.html#devbranches">development branches</a>
  pursuing various goals.</p>
  
! <h3><a name="bounds_checking_with_bounded_pointers">Bounds Checking with Bounded Pointers</a></h3>
! <p>There is a separate page for <a
! href="bp/main.html">Bounds Checking with Bounded Pointers</a>.</p>
! 
! <h3><a name="value_range_propagation_pass">Value range propagation pass</a></h3>
! <p>John Wehle (john@feith.com) implemented a <a
! href="http://gcc.gnu.org/ml/gcc-patches/2000-07/msg00968.html">value
! range propagation pass</a> which isn't yet in GCC.</p>
! 
! <h3><a name="automaton_based_pipeline_hazard_recognizer">Automaton based pipeline hazard recognizer</a></h3> 
! <p>Vladimir Makarov has implemented an 
! <a href="http://gcc.gnu.org/ml/gcc-patches/2001-06/msg00951.html">
! automaton based pipeline hazard recognizer</a>. It enables better
! instruction scheduling and it provides the base for a future software
! pipelining implementation.</p>
! 
! <h3><a name="data_prefetch">Data prefetch support</a></h3>
  <p>A separate page describes <a href="prefetch.html">
  data prefetch support and optimizations</a> that are in development
  in the main branch.</p>
--- 89,99 ----
  <p>There is a separate projects list for the <a href="web.html">web
  pages</a>.</p>
  
! <h2><a name="development_branches">Development Branches</a></h2>
  <p>There are several <a href="../cvs.html#devbranches">development branches</a>
  pursuing various goals.</p>
  
! <h2><a name="data_prefetch">Data prefetch support</a></h2>
  <p>A separate page describes <a href="prefetch.html">
  data prefetch support and optimizations</a> that are in development
  in the main branch.</p>
Index: bp/main.html
===================================================================
RCS file: bp/main.html
diff -N bp/main.html
*** bp/main.html	4 Feb 2004 08:06:37 -0000	1.19
--- /dev/null	1 Jan 1970 00:00:00 -0000
***************
*** 1,834 ****
- <html>
- 
- <head>
- <title>Bounds Checking in C &amp; C++ using Bounded Pointers</title>
- <link rev="made" href="mailto:greg@mcgary.org" />
- </head>
- 
- <body>
- <h1>Bounds Checking Projects</h1>
- 
- <p>This page describes work in progress to add fine-grained bounds
- checking to GCC's C and C++ front-ends.  Interested parties are
- invited to port to Objective C as well.  Please contact <a
- href="mailto:greg@mcgary.org">Greg McGary, greg@mcgary.org</a> if you
- wish to assist with development or testing.</p>
- 
- <h2>Contents</h2>
- 
- <ul>
- <li><a href="#overview">Overview of Bounded Pointers</a></li>
- <li><a href="#status">Project Status</a></li>
- <li><a href="#goals">Goals</a></li>
- <li><a href="#nongoals">Non-Goals</a></li>
- <li><a href="#maybegoals">Maybe Goals</a></li>
- <li><a href="#toolchain">Other Links in the Toolchain</a></li>
- <li><a href="#building">Building GCC and glibc for Bounded Pointers</a></li>
- <li><a href="#testing">Testing with Bounded Pointers</a></li>
- <li><a href="#testbpuse">Packages Tested Using Bounded Pointers</a></li>
- <li><a href="#knownbugs">Known Bugs</a></li>
- <li><a href="#porting">How to Port to a new CPU</a></li>
- <li><a href="#gccdetails">GCC Implementation Details</a></li>
- </ul>
- 
- <hr />
- 
- <h2><a name="overview">Overview</a></h2>
- 
- <p>Bounded Pointers are easy to understand.  GCC augments every
- pointer datum with two additional pointers that hold the low bound and
- high bound of the object to which the pointer is seated.  Prior to
- dereference, GCC generates code to test whether the pointer's value
- lies within the bounds, and if bounds are violated, to generate a
- machine exception.</p>
- 
- <p>Many find the notion of changing the size of a fundamental data
- type alarming, but for well-formed higher-level C code that uses
- accurate function prototypes and avoids abusing pointer/integer casts,
- this is seldom a problem in practice.  Even low-level code can use
- bounded pointers with some extra care.</p>
- 
- <h2><a name="status">Project Status (updated 2000-08-11)</a></h2>
- 
- <ul>
- 
- <li><h3>Working for Intel x86</h3>
- 
- <p>Basic functionality is present
- for Intel x86 using GCC code on the CVS tag
- ``<code>bounded-pointers-ss-20000730</code>''.  Basic functionality
- includes ...</p>
- 
-     <ul>
-     <li>synthesis of a datum's bounds upon application of
-     <code>addressof</code> operator.
-     (Bounded pointers are also returned by memory allocators such as
-     <code>malloc</code>, but that's implemented by the allocator library.)</li>
-     <li>propagation of bounds via pointer assignment,
-     passing of function argument and return of function value.</li>
-     <li>programmer control over boundedness of pointer types via new qualifiers
-     ``<code>__bounded</code>'' and ``<code>__unbounded</code>,'' and
-     access to the components of a bounded pointer via new prefix operators
-     ``<code>__ptrvalue</code>'', ``<code>__ptrlow</code>'' and
-     ``<code>__ptrhigh</code>''.</li>
-     </ul>
- 
- <p>I have tested BPs on a number of packages (see <a
- href="#testbpuse">Packages Tested Using Bounded Pointers</a> for
- details).  I have completed a full (mostly successful) bootstrap of
- GCC for <code>LANGUAGES=c</code> passing
- ``<code>-fbounded-pointers</code>'' at all stages (see <a
- href="#bootbpgcc">Bootstrap GCC with Full Bounds Checks</a> for
- details).</p>
- </li>
- 
- <li><h3>GNU C Library</h3>
- 
- <p>I have recently committed support for
- bounded pointers to the trunk of the GNU C library CVS tree.  Intel
- x86 is functional.  PowerPC is in progress but still incomplete.  90%
- of the <code>glibc</code> testsuite passes in BP mode for Intel x86.  C library
- support includes thunks for system calls that accept bounded pointer
- arguments, check their bounds and pass simple pointers on to the
- OS kernel.</p>
- </li>
- 
- <li><h3>Committing GCC Changes</h3>
- 
- <p>My primary focus is on getting GCC
- changes out of the branch and committed to the CVS trunk.</p>
- </li>
- 
- <li><h3>Documentation</h3>
- 
- <p>My secondary focus is on writing
- documentation which is necessary in order to guide developers and
- testers.</p>
- </li>
- 
- <li><h3>Unfinished Business</h3>
- <p>The most important unfinished bits are:</p>
- 
-     <ul>
-     <li>C++ front end.</li>
-     <li>Optimize to eliminate redundant bounds checks.</li>
-     <li>Relax GCC's requirement that structs reside in memory
-     so that elements of a bounded pointer may be independently
-     assigned to machine registers.</li>
-     <li>Port to more CPU architectures.  (PowerPC port is in progress.)
-     See <a href="#porting">How to Port to a new CPU</a>.</li>
-     </ul>
- </li>
- 
- </ul>
- 
- <h2><a name="goals">Goals</a></h2>
- 
- <ul>
- 
- <li><h3>Finest Granularity</h3>
- 
- <p>Bounded pointers enforce
- data-integrity at the finest possible granularity.  Once a pointer is
- seated to a datum, be it a scalar, array, array element, structure, or
- structure member, references through that pointer may not exceed the
- bounds of the datum.  Purify won't do this for you.  As long as a
- pointer references valid memory, purify won't protest that your
- program blew the bounds of an array and started overwriting an
- adjacent data structure.</p>
- </li>
- 
- <li><h3>Prevent Unwanted Mixing of Checked and Unchecked Code</h3>
- 
- <p>Functions having pointers in their return-type/arg-types signature are
- incompatible between the BP and non-BP modes.  In order to prevent
- unwanted mixing (i.e., calling a function in BP mode when it is
- defined in non-BP mode, and vice-versa), GCC ``mangles'' the symbols
- of all BP mode functions that have pointers in their signatures.
- The presence of BP-mangled symbols causes unwanted mixing to be
- detected at link time, rather than at runtime where the debug cost
- is very much higher.</p>
- 
- <p>As of this writing, the C function names are mangled by prepending
- ``<code>__BP_</code>''.  This is subject to change, since using a
- suffix might work better with gdb (see <a href="#toolchain">Other
- Links in the Toolchain</a>).  At this time, only function names are so
- mangled.  It would be better to also mangle the names of global data
- structures that contain pointers.</p>
- 
- <p>For C++, whose functions are already mangled, I intend to add a
- boundedness qualifier to the mangling scheme, perhaps adding the
- letter `X' after the `P' that indicates pointer.  C++ does not mangle
- the type of a function's return value, but in BP mode, this
- information is essential.  The calling convention for returning a
- bounded-pointer is incompatible with that for returning a single word.
- A bounded pointer is represented as a three-word struct, so returning
- one means returning a struct by value, which requires that the caller
- designate space for the return value and pass a hidden first argument
- that points to it.  The presence of the hidden pointer argument shifts
- the argument list by one slot, making it incompatible with the
- non-BP-return case.</p>
- </li>
- 
- <li><h3>Low Overhead</h3>
- 
- <p>Space and time overheads for
- bounded-pointer programs are both approx 150%..200% (i.e., 2.5x..3x
- slowdown and 2.5x..3x code size increase).  A couple years back I
- implemented bounded pointers in <code>gcc-2.7.2</code> with much hackage at
- the RTL layer, and using a special BP machine mode (akin to the complex-number
- machine modes) that allowed GCC to assign BP components individually
- to registers, and to pass/return BPs components in registers.  This
- version had space and time overhead of only 75%, and that was without
- any optimizations to eliminate redundant checks.</p>
- 
- <p>This experience leads me to believe that with optimizations to
- eliminate redundant bounds checks, and with the ability to assign BP
- components individually to registers, space and time overhead can be
- brought under 50% (i.e., 1.5x slowdown and 1.5x code size increase).</p>
- </li>
- 
- </ul>
- 
- <h2><a name="nongoals">Non-Goals</a></h2>
- 
- <p>Bounded pointers do not detect the following errors in memory-usage:</p>
- 
- <ul>
- <li>Memory Leaks</li>
- <li>References through Dangling Pointers</li>
- <li>References to Uninitialized Memory</li>
- </ul>
- 
- <p>Memory checks are done by Purify or <code>Checker</code> (Refer to
- GCC's <code>-fcheck-memory-usage</code> option).  The checks provided
- by bounded pointers and the memory-usage checkers complement each
- other nicely without overlap.</p>
- 
- <h2><a name="maybegoals">Maybe Goals</a></h2>
- 
- <ul>
- 
- <li><h3>Support Controlled Mixing of Checked and Unchecked Code</h3>
- 
- <p>Mixing checked and unchecked code is something that's theoretically
- possible using two mechanisms: (1) explicit qualification of the
- boundedness of declarations and (2) thunks that translate between
- bounded-pointer and unbounded-pointer function interfaces.</p>
- 
- <p>In practice, the amount of work to properly control mixing is
- unpredictable.  For instance, it's bloody difficult to build
- bounded-pointer applications of reasonable complexity with an
- unbounded-pointer C library.  On the other hand, it's considerably
- easier to mix bounded-pointer application code with unbounded-pointer
- X11 libraries.</p>
- 
- <p>I have implemented the beginnings of automatic thunk-generation in
- GCC, but so far it has only proven useful for building the C torture
- testsuite in the days before I had a BP-capable C library.</p>
- 
- <p>I consider this to be a back-burner project, since I believe that with
- proper optimization, a 100% bounded-pointer program can be built and
- run with acceptable space &amp; time overhead.  In the absence of a
- performance justification for mixing unchecked code, the other reason
- to mix unchecked code is because one has only binaries.  As a
- free-software project, bounded pointers in GCC exist primarily to
- benefit the free-software community, so I don't intend to go out of my
- way to accommodate programs that can't be built entirely from source
- code.</p>
- 
- <p>A third reason to mix unchecked code might be to work in stages on
- converting a large system to become bounded-pointer capable.  It would
- be nice to provide this option, but other things are more important
- for now, particularly optimizations and broadening the list of
- supported CPUs.</p>
- </li>
- 
- </ul>
- 
- <h2><a name="toolchain">Other Links in the Toolchain</a></h2>
- 
- <ul>
- 
- <li><h3>ld</h3>
- 
- <p>GCC synthesizes bounds with the
- <code>addressof</code> operation.  A data object declared as
- ``<code>extern</code>'' with an incomplete type (or with a structure type
- containing a flexible array member) has unknown size, but might have
- its address taken.  Since GCC can't compute the high bound based on an
- unknown size, it generates datum <code>foo</code>'s high bound as a
- reference to the synthetic symbol ``<code>foo.high_bound</code>''.  If
- <code>foo</code> is defined as initialized data, GCC generates the
- label definition of <code>foo.high_bound</code> immediately following
- <code>foo</code>'s initializers.  However, if <code>foo</code> resides
- in uninitialized data (BSS or common), GCC cannot do this, and it's
- left to the linker to synthesize <code>foo.high_bound</code>.
- I have a small patch to GNU ld that does this for ELF targets.
- (<a href="patch-ld.txt">Get the ld patch from here</a>)</p>
- </li>
- 
- <li><h3>gdb</h3>
- 
- <p>Bounded pointers introduce two nuisances for debugging:</p>
- 
- <p>First, bounded pointers are represented internally as three-member
- structures containing simple pointer members for the value, low bound
- and high bound.  Gdb currently knows nothing about bounded pointers
- and treats them according to the information in the symbol table.
- Print a pointer variable and you'll see a three member struct.
- Attempt to dereference a pointer variable via the expression
- ``<code>*foo</code>'', and you'll get an error because gdb thinks foo is
- a struct--you must dereference with ``<code>*foo.value</code>''.</p>
- 
- <p>Second, if a function has a pointers as any of its return type or
- argument types, its assembler-name is prefixed with
- ``<code>__BP_</code>''.  Therefore, you need to prefix such function
- names when setting breakpoints or printing function addresses.</p>
- 
- <p>It would be useful to teach gdb about these two idiosyncrasies of
- bounded pointers.</p>
- 
- <p>You will need a small patch to gdb so that it won't crash starting up
- on a BP-mode program.  (<a href="patch-gdb.txt">Get the gdb
- patch from here</a>)</p>
- </li>
- 
- <li><h3><a name="autoconf">autoconf</a></h3>
- 
- <p>The ``<code>__BP_</code>'' prefix that is applied to functions
- having pointers in their return-type/arg-types signature presents
- problems for autoconf.  Autoconf tests for the presence of library
- functions by creating a tiny test program that compiles and links with
- a library.  If the test program fails to link, then the function is
- considered to be absent from the library and the package supplies a
- substitute.  The declaration coded into the test program is a phony
- one of this form: ``<code>char foo ();</code>''.  If one wishes to
- configure with the GCC option ``<code>-fbounded-pointers</code>'', and
- <code>foo</code> has pointers in its signature, its library definition
- will be as ``<code>__BP_foo</code>'', but the phony declaration will
- compile as a reference to the simple ``<code>foo</code>'' and thus
- yield a false negative.  A work-around is to always configure with the
- non-BP version of a library.  I hope that a long-term solution will
- come with extensions to autoconf that arrange to get a prototype for
- the function under test.</p>
- </li>
- 
- </ul>
- 
- <h2><a name="building">Building GCC and glibc for Bounded Pointers</a></h2>
- 
- <p>If you wish to help with development and/or testing, you must first
- build a baseline.  In the examples below, the shell variables
- ``<code>$..._dir</code>'' represent the directory names of your
- toplevel <code>gcc</code>, <code>glibc</code>, <code>ld</code> and
- <code>gdb</code> trees.  The shell variables
- ``<code>$..._repo</code>'' hold the names of the GCC and
- <code>glibc</code> CVS repositories.  The values of these repository
- variables will depend on whether you have write access or have
- readonly access through <code>pserver/anoncvs</code> mode.  I'll
- assume you know enough about CVS and about configuring and building
- GNU packages to adapt the procedure below to fit your environment.</p>
- 
- <ol>
- 
- <li><h3>Checkout, build and install GCC</h3>
- 
- <pre>
- $ mkdir -p $gcc_dir/BUILD
- $ cd $gcc_dir
- $ cvs -d $gcc_repo co -rbounded-pointers-ss-20000730 -d src gcc
- $ cd BUILD
- $ ../src/configure --prefix=$gcc_dir --enable-languages=c
- $ make bootstrap
- $ make install
- </pre>
- 
- <p>For convenience, you might wish to install a symlink called
- ``<code>gcc-bp</code>'' in one of your bin directories that refers to
- <code>$gcc_dir/bin/gcc</code>.</p>
- </li>
- 
- <li><h3>Checkout, build and install glibc</h3>
- 
- <pre>
- $ mkdir -p $glibc_dir/BUILD
- $ cd $glibc_dir
- $ cvs -d $glibc_repo co -d src libc
- $ cd BUILD
- $ env CC=$gcc_dir/bin/gcc ../src/configure --prefix=$glibc_dir \
- 	--enable-bounded --disable-profile --disable-shared
- $ make
- $ make install
- </pre>
- 
- <p>I recommend ``<code>--disable-profile</code>'' and
- ``<code>--disable-shared</code>'' in order to shorten build time since
- you won't need these targets.</p>
- </li>
- 
- <li><h3>Obtain, patch, build and install GNU ld</h3>
- 
- <p>I won't give detailed instructions here, because there's nothing out
- of the ordinary.  Download a modern binutils release, or get the code
- from CVS.</p>
- 
- <p>You will need a small patch to GNU ld so that it will synthesize
- ``<code>foo.high_bound</code>'' symbols for common &amp; bss symbols.  (<a
- href="patch-ld.txt">Get the ld patch from here</a>) The patch
- is relative to <code>binutils-2.10</code>, but will work on
- <code>binutils-2.9</code> as well.</p>
- </li>
- 
- <li><h3>Obtain, patch, build and install gdb</h3>
- 
- <p>I won't give detailed instructions here, because there's nothing out
- of the ordinary.  Download a modern gdb release, or get the code from
- CVS.</p>
- 
- <p>You will need a small patch to gdb so that it won't crash starting up
- on a BP-mode program.  <a href="patch-gdb.txt">Get the gdb
- patch from here.</a> The patch is relative to <code>gdb-5.0</code>,
- but will work on <code>gdb-4.18</code> as well, if you supply the
- `<code>-l</code>' option to <code>patch</code> to make it more lenient
- about whitespace differences.</p>
- </li>
- 
- </ol>
- 
- <h2><a name="testing">Testing with Bounded Pointers</a></h2>
- 
- <p>Now that you have the essentials for working with bounded pointers,
- here are some suggestions for testing.  I present them in order of
- increasing difficulty.  You will be testing three things: (1)
- correctness of BP-mode code generated by GCC, the correctness of the C
- library's handling of BPs, and (3) correctness of the code under test.
- If you wish to focus on debugging the BP implementation in GCC and the
- GNU C library, then you should test using mature infrastructure
- packages that have been around for many years.  If you test on new
- code, you're more likely to find bugs in the application, which is of
- course what the BP feature is designed for, so you are most surely
- welcome to do that!</p>
- 
- <ul>
- 
- <li><h3>Run glibc's test suite</h3>
- 
- <p>This is easy.  Just run ``<code>make check</code>'' after building.
- Most tests pass.  As for the rest, pick one and debug it.</p>
- </li>
- 
- <li><h3>Run GCC's test suite for C</h3>
- 
- <p>Here's how to run the GCC C torture tests in BP mode:</p>
- 
- <pre>
- $ make check-gcc RUNTESTFLAGS="--tool_opts=\"-g -fbounded-pointers -static \
-                                -B$glibc_build_dir/csu/ -L$glibc_build_dir\""
- </pre>
- 
- <p>Remember that ``<code>$..._dir</code>'' variables represent directory
- names from your system.  Note the use of
- ``<code>-B$glibc_build_dir/csu/</code>'' to get <code>bcrt1.o</code>,
- and ``<code>-L$glibc_build_dir</code>'' to get libraries.  Both of
- these options refer to your C library build directories, not to the
- directories in which you installed the C library.  This is
- intentional.  The only thing you really need from the install tree is
- the header tree in ``<code>$glibc_dir/include</code>''.  For the rest,
- it is more convenient to get the files directly from the build tree,
- so that when you rebuild <code>glibc</code> after fixing a bug, you
- can avoid the install step.  Naturally, if you change a public header
- file, you'll need to do the install, but this happens much less
- frequently.</p>
- </li>
- 
- <li><h3>Build some other package and run its test suite</h3>
- 
- <p>Pick a favorite package and have at it.  Don't forget to build
- a BP version of any extra libraries the package requires.</p>
- 
- <p>Because of the <a href="#autoconf">problems with <code>autoconf</code></a>
- mentioned above, the best workaround is to configure with the static
- non-BP version of the C library you built alongside the BP version.
- Your installed C library will invariably be an older version of
- <code>glibc</code>, and will yield different configuration results, so
- you don't want to use it.</p>
- 
- <p>I use a couple of ``wrapper'' script to prefix the
- <code>configure</code> command that gives me a suitable environment
- for using the newly-built C library.</p>
- 
- <p>This one is called ``<code>ubpenv</code>'':</p>
- 
- <pre>
- #!/bin/bash
- export CC=$gcc_dir/bin/gcc
- export LDFLAGS="-static -B$glibc_build_dir/csu/ -L$glibc_build_dir"
- export CFLAGS="-isystem $glibc_dir/include -O2"
- "$@"
- </pre>
- 
- <p>This one is called ``<code>bpenv</code>'', and differs only in the
- value of <code>CFLAGS</code>:</p>
- 
- <pre>
- #!/bin/bash
- export CC=$gcc_dir/bin/gcc
- export LDFLAGS="-static -B$glibc_build_dir/csu/ -L$glibc_build_dir"
- export CFLAGS="-isystem $glibc_dir/include -fbounded-pointers \
-                -fno-optimize-sibling-calls -O2 -g"
- "$@"
- </pre>
- 
- <p>I recommend that you turn off sibling-call optimizations in order to
- preserve complete call traces and avoid surprises while debugging.</p>
- 
- <p>In order to override the configured value of CFLAGS, you need to build
- like so:</p>
- 
- <pre>
- $ bpenv eval make 'CFLAGS="$CFLAGS"'
- </pre>
- 
- <p>To save some typing, I have a third script called ``<code>bpmake</code>'':</p>
- 
- <pre>
- #!/bin/bash
- bpenv eval make 'CC="$CC"' 'CFLAGS="$CFLAGS"' 'LDFLAGS="$LDFLAGS"' "$@"
- </pre>
- 
- <p>With these scripts, the sequence for building and testing a GNU
- package in BP mode is this:</p>
- 
- <pre>
- $ ubpenv ./configure
- $ bpmake
- $ bpmake check
- </pre>
- </li>
- 
- <li><h3><a name="bootbpgcc">Bootstrap GCC with Full Bounds Checks</a></h3>
- 
- <p>The bootstrap procedure outlined below depends on already having a
- BP-capable compiler installed, and is performed on the GCC source tree
- at CVS tag ``<code>bounded-pointers-ss-20000730</code>''.  This procedure
- doesn't produce a GCC that's particularly useful, since it's so much
- slower.  This is purely a testing exercise in order to expose bounds
- violations in GCC, and to validate the correctness of bounds-checked
- code.</p>
- 
- <p>The host compiler is <code>gcc-bp</code>, an ordinary unchecked
- program that produces a checked stage1.  The stage1 compiler is fully
- bounds checked, and so runs like a pig on quaaludes while producing
- the stage2 compiler.  The stage2 compiler is a companion pig on
- quaaludes that produces a third drugged pig.  We do the final binary
- compare on the second- and third-stage pigs, and use the third-stage
- pig to run the test suite.</p>
- 
- <p>There are some potholes along the road that you'll need to steer around:</p>
- 
-     <ul>
-     <li> <code>makeinfo</code>, <code>install-info</code>, and
-     <code>texindex</code> don't link for lack of a BP version of
-     <code>libz.a</code>.  We don't need <code>texinfo</code>, so
-     we can just ignore it.</li>
-     <li> GCC's <code>gettext</code> implementation in
-     <code>gcc/intl/libintl.a</code> conflicts with
-     <code>glibc</code>'s, so we must configure to ignore GCC's.</li>
-     </ul>
- 
- <p>First, you must supplement the command-line in the
- <code>bpmake</code> script with these extra arguments:</p>
- 
- <pre>
- 'BOOT_CFLAGS="$CFLAGS"' 'BOOT_LDFLAGS="$LDFLAGS"'
- 'SYSTEM_HEADER_DIR="$glibc_dir/include"'
- </pre>
- 
- <p>With that done, this procedure does the trick:</p>
- 
- <pre>
- $ ubpenv ./configure --without-included-gettext --enable-languages=c
- $ bpmake all-libiberty
- $ bpmake -C gcc
- $ bpmake -C gcc stage1 bootstrap2
- </pre>
- 
- <p>The second and third stages compare cleanly.  Unfortunately,
- running the test suite yields these extra failures that did not
- appear for the installed <code>gcc-bp</code>:</p>
- 
- <pre>
- FAIL: gcc.c-torture/compile/981001-2.c,  -O0
- FAIL: gcc.c-torture/compile/981001-2.c,  -O1
- FAIL: gcc.c-torture/compile/981001-2.c,  -O2
- FAIL: gcc.c-torture/compile/981001-2.c,  -O3 -fomit-frame-pointer
- FAIL: gcc.c-torture/compile/981001-2.c,  -O3 -g
- FAIL: gcc.c-torture/compile/981001-2.c,  -O3 -fssa
- FAIL: gcc.c-torture/compile/981001-2.c,  -Os
- FAIL: gcc.c-torture/execute/990117-1.c execution,  -O3 -fomit-frame-pointer
- FAIL: gcc.c-torture/execute/990117-1.c execution,  -O3 -g
- FAIL: gcc.c-torture/execute/990117-1.c execution,  -O3 -fssa
- FAIL: gcc.c-torture/execute/ieee/minuszero.c execution,  -O1
- FAIL: gcc.c-torture/execute/ieee/minuszero.c execution,  -O2
- FAIL: gcc.c-torture/execute/ieee/minuszero.c execution,  -O3 -fomit-frame-pointer
- FAIL: gcc.c-torture/execute/ieee/minuszero.c execution,  -O3 -fomit-frame-pointer -funroll-loops
- FAIL: gcc.c-torture/execute/ieee/minuszero.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
- FAIL: gcc.c-torture/execute/ieee/minuszero.c execution,  -O3 -g
- FAIL: gcc.c-torture/execute/ieee/minuszero.c execution,  -O3 -fssa
- FAIL: gcc.c-torture/execute/ieee/minuszero.c execution,  -Os
- FAIL: gcc.dg/20000419-2.c (test for excess errors)
- FAIL: alias-1.c
- FAIL: wkali-1.c
- FAIL: wkali-2a.o
- FAIL: gcc.misc-tests/gcov-1.c (test for excess errors)
- WARNING: gcc.misc-tests/gcov-1.c compilation failed to produce executable
- FAIL: gcov-1.c:1:is 4:should be 11
- FAIL: gcov-1.c:1:is 5:should be 10
- FAIL: gcov-1.c:1:is 7:should be 1
- FAIL: gcc.misc-tests/gcov-2.c (test for excess errors) (PRMS 8294)
- WARNING: gcc.misc-tests/gcov-2.c compilation failed to produce executable
- </pre>
- 
- <p>Even so, it's not so bad for an intoxicated pig.</p>
- </li>
- 
- </ul>
- 
- <h2><a name="testbpuse">Packages Tested Using Bounded Pointers (updated 2000-08-09)</a></h2>
- 
- <p>Below is a list of results for packages tested with bounds checking.
- Unless otherwise noted, tests were done by me (Greg).</p>
- 
- <ul>
- 
- <li><h3>GNU awk 3.0.61</h3>
- 
- <p>Bounds violations exposed:</p>
-     <ul>
-     <li> <code>regex.c</code>: <a href="patch-regex.txt">This patch
-     was required.</a></li>
-     </ul>
- 
- <p>100% of the test suite passes after fixing the bugs listed above.
- (However, the maintainer admits that the test suite is hardly
- comprehensive.)  Fixes appear in 3.0.6.
- </p>
- </li>
- 
- <li><h3>GNU textutils 2.0g</h3>
- 
- <p>Bounds violations exposed:</p>
-     <ul>
-     <li> <code>regex.c</code>: <a href="patch-regex.txt">This patch
-     was required.</a></li>
-     <li> <code>pr</code>: <code>init_header</code> wrote past the end
-     of a string buffer for pages with column-width less than 22 characters.</li>
-     <li> <code>pr</code>: <code>store_columns</code> read a column descriptor
-     that was one past the end of the array of columns.</li>
-     <li> <code>tail</code>: <code>pipe_lines</code> read past the
-     beginning of a string buffer when given an empty input file.</li>
-     </ul>
- 
- <p>100% of the test suite passes after fixing the bugs listed above.
- Fixes appear in 2.0h</p>
- </li>
- 
- <li><h3>GNU shellutils 2.0k</h3>
- 
- <p>Bounds violations exposed:</p>
-     <ul>
-     <li> <code>regex.c</code>: <a href="patch-regex.txt">This patch
-     was required.</a></li>
-     </ul>
- 
- <p>100% of the test suite passes after fixing the bugs listed above.</p>
- </li>
- 
- <li><h3>GNU make 3.79.1</h3>
- 
- <p>No bounds violations exposed.  100% of the test suite passes.</p>
- </li>
- 
- <li><h3>GNU findutils 4.1.5</h3>
- 
- <p>No bounds violations exposed.  100% of the test suite passes.</p>
- </li>
- 
- <li><h3>web2c 7.3.2 (TeX and Metafont)</h3>
- 
- <p>Bounds violations exposed:</p>
-     <ul>
-     <li> <code>fixwrites</code>: <code>main</code> read past the
-     beginning of a string buffer when presented with an empty line.</li>
-     </ul>
- 
- <p>100% of the test suite passes after fixing the bug listed above
- and compiling with ``<code>gcc ... -fno-strict-aliasing</code>''.
- A strict-aliasing bug for i586 and i686 caused two assertion failures
- in <code>kpathsea</code>.</p>
- </li>
- 
- <li><h3>GNU binutils 2.10</h3>
- 
- <p>Bounds violations exposed:</p>
-     <ul>
-     <li> <code>bfd/archive.c</code>: Many calls to sprintf
-     for filling fields of <code>struct ar_hdr</code> write
-     a NUL-terminator one beyond the end of the field.</li>
-     <li> <code>ld/ldlang.c</code>: Missing prototype for
-     <code>walk_wild.c</code> caused callback pointer to be erroneously
-     treated as bounded.  GCC should be fixed to handle this, since
-     the non-prototype definition of <code>walk_wild</code> appeared
-     before its use.</li>
-     </ul>
- 
- <p>100% of the test suite passes after fixing the bug listed above.</p>
- </li>
- 
- <li><h3>GCC at CVS tag <code>bounded-pointers-ss-20000730</code></h3>
- 
- <p>Preliminary results are described at <a href="#bootbpgcc">Bootstrap
- GCC with Full Bounds Checks</a>.  Later, I'll turn the bounds-checked
- gcc loose on a recent release, such as <code>gcc-2.95.2</code> and
- see if any bounds violations occur.</p>
- </li>
- 
- <li><h3>GNU fileutils 4.0y</h3>
- 
- <p>Bounds violations exposed:</p>
-     <ul>
-     <li> <code>regex.c</code>: <a href="patch-regex.txt">This patch
-     was required.</a></li>
-     </ul>
- 
- <p>Testing and fixing is in progress...</p>
- </li>
- 
- <li><h3>GNU id-utils 3.2d</h3>
- 
- <p>Bounds violations exposed:</p>
- 
-     <ul>
-     <li> <code>regex.c</code>: <a href="patch-regex.txt">This patch
-     was required.</a></li>
-     <li> <code>mkid</code>: <code>assert_hits</code> read past the
-     beginning of an array.</li>
-     </ul>
- 
- <p>Testing and fixing is in progress...</p>
- </li>
- 
- </ul>
- 
- <h2><a name="knownbugs">Known Bugs</a></h2>
- 
- <p>Here is a list of bugs known to exist for bounded pointer mode in
- GCC and in the GNU C library, as well as some commonly found problems
- in applications:</p>
- 
- <ul>
- 
- <li><p>GCC generates bad bounds-checking code causing spurious bounds
- violations in <code>nss_parse_service_list</code>, which is used
- internally by the GNU C library's name-service switch.  The cause is
- unknown.</p></li>
- 
- <li><p>GCC generates bad bounds-checking code causing spurious bounds
- violations in <code>canonicalize</code>, which is used internally by
- the GNU C library's character-set conversion code.  The cause is
- unknown.</p></li>
- 
- <li><p>Programs that use their own version of GNU <code>regex.c</code>
- are missing some special handling for BPs in
- <code>EXTEND_BUFFER</code>.  <a href="patch-regex.txt">Get the
- patch for <code>regex.c</code> from here.</a></p></li>
- 
- <li><p>Threaded applications with <code>linuxthreads</code> are
- unusable in BP mode.  So far, <code>gdb</code> has been useless for
- debugging these, so this will take some time and head-scratching to
- resolve.</p></li>
- 
- </ul>
- 
- <h2><a name="porting">How to Port to a new CPU</a></h2>
- 
- <p>Most of the bounded pointers implementation is machine independent,
- both in GCC and in the C library.  These are the machine-dependent
- parts:</p>
- 
- <ul>
- 
- <li><h3>Conditional Traps in GCC Machine Description</h3>
- 
- <p>Bounded pointers depend on conditional trap patterns being defined
- in the machine description.  Some machine descriptions already have
- them, namely SPARC, rs6000 (PowerPC), and m68k.  All of
- these have machine instructions that implement conditional traps with
- one instruction.  Beginning with ISA-II, MIPS has conditional trap
- instructions as well, but its GCC machine description so far lacks
- them.  Intel x86 has no conditional trap instructions, but I defined
- conditional trap patterns that expand to primitive instructions to
- test and conditionally jump around an ``<code>int 5</code>''
- instruction.  If the CPU you wish to support has no conditional trap
- instructions, you should define pseudo conditional traps as I have
- done for x86.</p>
- 
- <p>Conditional traps are important for the sake of optimization.
- Without them, GCC would need to emit conditional branches as RTL,
- whose presence would artificially partition basic blocks and inhibit
- other optimizations.  Also conditional trap RTL expressions are
- readily identifiable and thus more conveniently checked for
- redundancies that can be eliminated.</p>
- </li>
- 
- <li><h3>Assembler Language Functions in GNU C Library</h3>
- 
- <p>Most CPUs on which the GNU C library runs define some functions in
- assembler which have pointers in their signatures.  Some are coded in
- assembler because they are performance critical, such as the memory
- and string functions (<code>bcopy</code>, <code>bzero</code>,
- <code>memcpy</code>, <code>memcmp</code>, <code>memset</code>,
- <code>strchr</code>, <code>strcpy</code>, <code>strcmp</code>,
- <code>strlen</code>, <code>strtok</code>, etc), primitives for
- multi-precision arithmetic (<code>add_n</code>, <code>addmul_1</code>,
- <code>mul_1</code>, <code>sub_n</code>, <code>submul_1</code>,
- <code>lshift</code>, <code>rshift</code>), and primitives for
- floating-point math (<code>frexp</code>, <code>frexpf</code>,
- <code>freexpl</code>, <code>remquo</code>, <code>remquof</code>,
- <code>remquol</code>, <code>sincos</code>, <code>sincosf</code>,
- <code>sincosl</code>).  Some are coded in assembler because they have
- special semantics that can't be achieved with plain C, namely
- <code>setjmp</code> and <code>longjmp</code>.  Finally, some have
- special interfaces to the kernel or C runtime, namely
- <code>brk</code>, <code>clone</code> and the startup code.</p>
- 
- <p>The assembler functions need to conditionally compile in BP and
- non-BP modes.  In BP mode, they must accommodate the calling
- convention where pointer arguments and return value are structs
- passed by value, and they must check the bounds of their arguments.
- The best way to proceed is to study what's already been done for Intel
- x86, a CISC target, and for PowerPC, a RISC target.</p>
- </li>
- 
- <li><h3>Startup Functions in GNU C Library</h3>
- 
- <p>Again, the best way to proceed is to study what's already been
- done for Intel x86 and (soon) for PowerPC.</p>
- </li>
- 
- </ul>
- 
- <h2><a name="gccdetails">GCC Implementation Details</a></h2>
- 
- <p>Sorry, nothing yet...  This stuff properly belongs in either the
- GCC manual or the GCC ``Internal Representation'' document.</p>
- 
- <hr />
- 
- <address>Greg McGary,
- <a href="mailto:greg@mcgary.org">greg@mcgary.org</a>
- </address>
- 
- </body>
- </html>
--- 0 ----
Index: bp/patch-gdb.txt
===================================================================
RCS file: bp/patch-gdb.txt
diff -N bp/patch-gdb.txt
*** bp/patch-gdb.txt	4 Aug 2000 00:51:29 -0000	1.1
--- /dev/null	1 Jan 1970 00:00:00 -0000
***************
*** 1,21 ****
- diff -p -u partial-stab.h.~1~ partial-stab.h
- --- partial-stab.h.~1~	Tue Mar 28 10:44:53 2000
- +++ partial-stab.h	Mon Jul 31 16:59:25 2000
- @@ -401,7 +401,7 @@ switch (CUR_SYMBOL_TYPE)
-  	   function relative stabs, or the address of the function's
-  	   end for old style stabs.  */
-  	valu = CUR_SYMBOL_VALUE + last_function_start;
- -	if (pst->texthigh == 0 || valu > pst->texthigh)
- +	if (pst && (pst->texthigh == 0 || valu > pst->texthigh))
-  	  pst->texthigh = valu;
-  	break;
-        }
- @@ -647,7 +647,7 @@ switch (CUR_SYMBOL_TYPE)
-  	   use the address of this function as the low bound for
-  	   the partial symbol table.  */
-  	if (textlow_not_set
- -	    || (CUR_SYMBOL_VALUE < pst->textlow
- +	    || (pst && CUR_SYMBOL_VALUE < pst->textlow
-  		&& CUR_SYMBOL_VALUE
-  		!= ANOFFSET (objfile->section_offsets, SECT_OFF_TEXT)))
-  	  {
--- 0 ----
Index: bp/patch-ld.txt
===================================================================
RCS file: bp/patch-ld.txt
diff -N bp/patch-ld.txt
*** bp/patch-ld.txt	4 Aug 2000 00:51:29 -0000	1.1
--- /dev/null	1 Jan 1970 00:00:00 -0000
***************
*** 1,51 ****
- diff -p -u ldlang.c.~1~ ldlang.c
- --- ldlang.c.~1~	Mon Feb 21 05:01:27 2000
- +++ ldlang.c	Mon Jul 31 17:03:36 2000
- @@ -135,6 +135,7 @@ static void ignore_bfd_errors PARAMS ((c
-  static void lang_check PARAMS ((void));
-  static void lang_common PARAMS ((void));
-  static boolean lang_one_common PARAMS ((struct bfd_link_hash_entry *, PTR));
- +static void lang_one_common_high_bound PARAMS ((struct bfd_link_hash_entry *, bfd_vma));
-  static void lang_place_orphans PARAMS ((void));
-  static int topower PARAMS ((int));
-  static void lang_set_startof PARAMS ((void));
- @@ -3520,6 +3521,9 @@ lang_one_common (h, info)
-    h->u.def.section = section;
-    h->u.def.value = section->_cooked_size;
-  
- +  /* Synthesize a definition for <symname>.high_bound.  */
- +  lang_one_common_high_bound (h, size);
- +
-    /* Increase the size of the section.  */
-    section->_cooked_size += size;
-  
- @@ -3576,6 +3580,29 @@ lang_one_common (h, info)
-      }
-  
-    return true;
- +}
- +
- +/*
- +Synthesize a definition for the symbol <symname>.high_bound,
- +which might be needed if the user has enabled bounds checking.
- +*/
- +
- +static void
- +lang_one_common_high_bound (h, size)
- +      struct bfd_link_hash_entry *h;
- +      bfd_vma size;
- +{
- +  static char extsuff[] = ".high_bound";
- +  char *extname = xmalloc (sizeof (extsuff) + strlen (h->root.string));
- +  struct bfd_link_hash_entry *exth;
- +  sprintf (extname, "%s%s", h->root.string, extsuff);
- +  exth = bfd_link_hash_lookup (link_info.hash, extname, false, false, true);
- +  if (exth)
- +    {
- +      exth->type = bfd_link_hash_defined;
- +      exth->u.def.section = h->u.def.section;
- +      exth->u.def.value = h->u.def.value + size;
- +    }
-  }
-  
-  /*
--- 0 ----
Index: bp/patch-regex.txt
===================================================================
RCS file: bp/patch-regex.txt
diff -N bp/patch-regex.txt
*** bp/patch-regex.txt	4 Aug 2000 00:51:29 -0000	1.1
--- /dev/null	1 Jan 1970 00:00:00 -0000
***************
*** 1,53 ****
- diff -p -Bw -u regex.c.~1~ regex.c
- --- regex.c.~1~	Wed Jun  7 01:46:52 2000
- +++ regex.c	Thu Aug  3 14:30:00 2000
- @@ -1576,6 +1576,26 @@ static reg_errcode_t compile_range _RE_A
-     reset the pointers that pointed into the old block to point to the
-     correct places in the new one.  If extending the buffer results in it
-     being larger than MAX_BUF_SIZE, then flag memory exhausted.  */
- +#if __BOUNDED_POINTERS__
- +# define SET_HIGH_BOUND(P) (__ptrhigh (P) = __ptrlow (P) + bufp->allocated)
- +# define MOVE_BUFFER_POINTER(P) \
- +  (__ptrlow (P) += incr, SET_HIGH_BOUND (P), __ptrvalue (P) += incr)
- +# define ELSE_EXTEND_BUFFER_HIGH_BOUND		\
- +  else						\
- +    {						\
- +      SET_HIGH_BOUND (b);			\
- +      SET_HIGH_BOUND (begalt);			\
- +      if (fixup_alt_jump)			\
- +	SET_HIGH_BOUND (fixup_alt_jump);	\
- +      if (laststart)				\
- +	SET_HIGH_BOUND (laststart);		\
- +      if (pending_exact)			\
- +	SET_HIGH_BOUND (pending_exact);		\
- +    }
- +#else
- +# define MOVE_BUFFER_POINTER(P) (P) += incr
- +# define ELSE_EXTEND_BUFFER_HIGH_BOUND
- +#endif
-  #define EXTEND_BUFFER()							\
-    do { 									\
-      unsigned char *old_buffer = bufp->buffer;				\
- @@ -1590,15 +1610,17 @@ static reg_errcode_t compile_range _RE_A
-      /* If the buffer moved, move all the pointers into it.  */		\
-      if (old_buffer != bufp->buffer)					\
-        {									\
- -        b = (b - old_buffer) + bufp->buffer;				\
- -        begalt = (begalt - old_buffer) + bufp->buffer;			\
- +	int incr = bufp->buffer - old_buffer;				\
- +	MOVE_BUFFER_POINTER (b);					\
- +	MOVE_BUFFER_POINTER (begalt);					\
-          if (fixup_alt_jump)						\
- -          fixup_alt_jump = (fixup_alt_jump - old_buffer) + bufp->buffer;\
- +	  MOVE_BUFFER_POINTER (fixup_alt_jump);				\
-          if (laststart)							\
- -          laststart = (laststart - old_buffer) + bufp->buffer;		\
- +	  MOVE_BUFFER_POINTER (laststart);				\
-          if (pending_exact)						\
- -          pending_exact = (pending_exact - old_buffer) + bufp->buffer;	\
- +	  MOVE_BUFFER_POINTER (pending_exact);				\
-        }									\
- +    ELSE_EXTEND_BUFFER_HIGH_BOUND					\
-    } while (0)
-  
-  
--- 0 ----

