This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: preliminary html doc about locating regressions
- From: Janis Johnson <janis187 at us dot ibm dot com>
- To: Janis Johnson <janis187 at us dot ibm dot com>
- Cc: gcc-patches at gcc dot gnu dot org, phil at jaj dot com, rodrigc at attbi dot com, bangerth at ticam dot utexas dot edu, reichelt at igpm dot rwth-aachen dot de
- Date: Wed, 18 Dec 2002 11:48:40 -0800
- Subject: Re: preliminary html doc about locating regressions
- References: <20021213171631.A15524@us.ibm.com>
This is an update of my document about how to locate GCC regressions.
It's based on notes from Craig Rodrigues, feedback in mailing lists, and
my own recent experiences. I haven't yet figured out where to link it
from, but unless there are objections I'll check it in to wwwdocs/htdocs
as reghunt-howto.html. It passes validation at http://validator.w3.org.
Janis Johnson
IBM Linux Technology Center, OzLabs North
--- empty Fri Dec 13 17:01:24 2002
+++ reghunt-howto.html Wed Dec 18 11:25:19 2002
@@ -0,0 +1,216 @@
+<html>
+
+<head>
+<title>How to Locate GCC Regressions</title>
+</head>
+
+<body>
+
+<h1>How to Locate GCC Regressions</h1>
+
+<p>A regression is a bug that did not exist in a previous release.
+Problem reports for GCC regressions have a very high priority, and we
+make every effort to fix them before the next release. Knowing which
+change caused a regression is valuable information to the developer
+who is fixing the problem, even if that change merely exposed an existing
+bug.</p>
+
+<p>People who are familiar with building GCC but who don't have the
+knowledge of GCC internals to fix bugs can help a lot by identifying
+patches that caused regressions to occur. The same techniques can be
+used to identify the patch that unknowingly fixed a particular bug on
+the mainline when that bug also exists as a regression on a release
+branch, allowing someone to port the fix to the branch.</p>
+
+<p>These instructions assume that you are already familiar with building
+GCC on your platform.</p>
+
+<h2>Search strategies</h2>
+
+<p>If you've got sufficient disk space available, keep old install
+trees around for use in finding small windows in which regressions
+occur. Some people who do this regularly add information to GNATS
+about particular problem reports for regressions.</p>
+
+<p>Before you start your search, verify that you can reproduce the
+problem with GCC built from the current sources. If not, the bug might
+have been fixed, or it might not be relevant for your platform, or the
+failure might only happen with particular options. Next, verify that you
+get the expected behavior for the start and end dates of the range.</p>
+
+<p>The basic search strategy is to iterate through the following steps
+while the range is too large to investigate by hand:</p>
+
+<ul>
+<li><a href="#get_sources">Get GCC sources</a> for the date at the midpoint of the current range.</li>
+<li><a href="#build_gcc">Build GCC</a>, or specific components that are
+ needed for testing.</li>
+<li><a href="#run_test">Run the test</a>.</li>
+<li>Based on the outcome of the test, find the midpoint of the new
+ search range.</li>
+</ul>
+
+<p>The first three steps are described below. They can be automated,
+as can the framework for the binary search (see the script in
+<a href="http://gcc.gnu.org/ml/gcc/2002-12/msg01148.html">mail from
+Janis Johnson</a>, which might be added to <code>contrib/</code>).</p>
+
+<p>If you've narrowed down the dates sufficiently, it might be faster to
+give up on the binary search and start doing forward updates by small
+increments and then incremental builds rather than full builds. Whether
+this is worthwhile depends on the relative time it takes to update the
+sources, to do a build from scratch, and to do an incremental build.</p>
+
+<p>Eventually you'll need to <a href="#identify_patch">identify the patch</a>
+and verify that it causes the behavior of the test to change.</p>
+
+<h2><a name="get_sources">Get GCC sources</a></h2>
+
+<p>Get a local CVS tree using the <a href="cvs.html">cvs instructions</a>.
+Use a read-only tree that is separate from what you use for development or
+other testing, since it's easy to end up with files in strange states.</p>
+
+<p>You'll be checking out the local tree used for the regression search
+over and over again. If you've got enough disk space, either on the test
+system or on a machine to which it has fast access, it's much quicker to
+get a local copy of the GCC CVS repository using rsync by following the
+<a href="rsync.html">rsync instructions</a>. Besides being quicker, it
+doesn't affect other GCC developers who are using the real repository.</p>
+
+<h3>CVS mainline</h3>
+
+<p>Check out the GCC CVS tree, specifying the date you want to test. You
+can keep copies of the various ChangeLog files to compare later when you're
+ready to identify the patch that caused the regression. For example:</p>
+
+<pre>
+ cat &lt;&lt;'EOF' &gt; cplog
+ #! /bin/sh
+ mkdir -p logs/`dirname ${1}`
+ cp ${1} logs/${1}.${2}
+ EOF
+ chmod +x cplog
+
+ DATE="2002-05-01 10:15"
+ LOG_DATE="`echo ${DATE} | sed 's/[-: ]/_/g'`"
+ mkdir -p logs; cvs co -D "${DATE}" gcc &gt; logs/${LOG_DATE}.log
+ find gcc -name ChangeLog -exec ./cplog {} ${LOG_DATE} \;
+</pre>
+
+<p>Don't keep copies of the ChangeLogs in your CVS tree itself; that
+will slow down new checkouts. Rather than keeping copies of the files,
+you can also get differences between ChangeLog files using</p>
+
+<pre>
+ cvs diff -D <i>date1</i> -D <i>date2</i> ChangeLog
+</pre>
+
+<p>When moving forward and doing incremental builds, use
+<code>contrib/gcc_update</code> rather than <code>cvs co</code> or
+<code>cvs update</code>.</p>
+
+<h3>CVS branches</h3>
+
+<p>CVS doesn't provide a straightforward way to check out a branch for a
+particular date, but this method seems to work. To get the first date
+to test, do:</p>
+
+<pre>
+ cvs co -r <i>branch</i> gcc
+ cvs up -j <i>branch</i> -j <i>branch</i>:"<i>date</i>"
+</pre>
+
+<p>For additional dates do the following, which works even when the next
+date is earlier than the previous date:</p>
+
+<pre>
+ cvs up -j <i>branch</i>:"<i>prev_date</i>" \
+ -j <i>branch</i>:"<i>next_date</i>"
+</pre>
+
+<h2><a name="build_gcc">Build GCC</a></h2>
+
+<p>The kind of bug you are investigating will determine what kind of
+build is required for testing GCC on a particular date. In almost
+all cases you can do a simple <code>make</code> rather than <code>make
+bootstrap</code>, provided that you start with a recent version of
+<code>gcc</code> as the build compiler. When building a full compiler,
+enable only the language you'll need to test. If you're testing a bug
+in a library, you'll only need to build that library, provided you've
+already got a compatible version of the compiler to test it with. If
+there are dependencies between components, or if you don't know which
+component(s) affect the bug, you'll need to update and rebuild
+everything for the language.</p>
+
+<p>If you're chasing bugs that are known to be in <code>cc1plus</code>
+you can do the following after a normal configure:</p>
+
+<pre>
+ cd <i>objdir</i>
+ make all-libiberty
+ cd gcc
+ make cc1plus
+</pre>
+
+<p>This will build libiberty and <code>cc1plus</code>. When you have
+<code>cc1plus</code>, you can feed your source code snippet to it:</p>
+
+<pre>
+ cc1plus -quiet <i>testcase</i>.ii
+</pre>
+
+<h2><a name="run_test">Run the test</a></h2>
+
+<p>Assuming that there is a self-contained test for the bug, as there
+usually is for bugs reported via GNATS, write a small script to run it
+and to report whether it passed or failed. If you're automating your
+search then the script should tell you whether the next compiler build
+should use earlier or later GCC sources.</p>
+
+<p>Hints for coming up with a self-contained test are beyond the scope
+of this document.</p>
+
+<h2><a name="identify_patch">Identify the patch</a></h2>
+
+<p>Differences in the ChangeLog files will let you identify files that
+have changed. If it's a small enough set you can guess which patch
+might have caused the regression and update only the files changed
+by that patch. Remember to look at all ChangeLogs that might list
+relevant changes, not just the obvious ones.</p>
+
+<p>The following CVS commands can help you identify changes from one
+version of a file to another:</p>
+
+<ul>
+<li><code>cvs diff -D <i>date1</i> -D <i>date2</i> <i>file</i></code></li>
+<li><code>cvs log -N <i>file</i></code></li>
+<li><code>cvs log -N -d"<i>date1</i>&lt;<i>date2</i>" <i>file</i>
+ </code></li>
+<li><code>cvs annotate <i>file</i></code></li>
+</ul>
+
+<p>When you've identified the likely patch out of a set of patches
+between the current low and high dates of the range, test a source tree
+from just before or just after that patch was added and then add or
+remove the patch by updating only the affected files. You can do this by
+identifying the revision of each file when the patch was added and then
+using <code>cvs update -r<i>rev</i> <i>file</i></code> to get the desired
+version of each of those files. Build and test to verify that this
+patch changes the behavior of the test.</p>
+
+<h2><a name="problems">Problems</a></h2>
+
+<p>If one of the test builds fails, try a date or time slightly earlier or
+later and see if that works. Usually all files in a patch are checked in
+at the same time, but if there was a gap you might have hit it.</p>
+
+<p>Sometimes regressions are introduced during a period when bootstraps
+are broken on the platform, particularly if that platform is not tested
+regularly. Your best bet here is to find out whether the regression
+also occurs on a platform where bootstraps were working at that time.</p>
+
+<p>If a regression occurs at the time of a large merge from a branch,
+search the branch.</p>
+
+</body>
+</html>
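
The binary search described under "Search strategies" could be framed
roughly as follows. This is only a minimal sketch, not the script from the
mail referenced in the document. It assumes that times are handled as
seconds since the epoch, and that a hypothetical function test_at does the
checkout, build, and test run for a given time. Here test_at is a stand-in
that pretends the regression appeared at time 1500. Turning a time in
seconds into a date string for "cvs co -D" is left out because the way to do
that is not portable.

```shell
#! /bin/sh
# Sketch of a binary search over checkout times (seconds since the epoch).

# Hypothetical helper: test_at TIME should check out the sources for TIME,
# build what is needed, and run the test, returning 0 if the test passes.
# This stand-in simply "fails" from time 1500 onward.
test_at () {
    [ "$1" -lt 1500 ]
}

# bisect LOW HIGH LIMIT
# LOW is a time at which the test is known to pass, HIGH a time at which it
# is known to fail. Stops once the range is no wider than LIMIT seconds and
# prints the final "LOW HIGH" pair.
bisect () {
    low=$1
    high=$2
    limit=$3
    while [ $((high - low)) -gt "$limit" ]; do
        mid=$(( low + (high - low) / 2 ))
        if test_at "$mid"; then
            low=$mid    # still passes: the regression came later
        else
            high=$mid   # already fails: the regression is at or before mid
        fi
    done
    echo "$low $high"
}

bisect 1000 2000 10
```

In real use the body of test_at would be replaced by the checkout, build,
and test steps from the document, and the final range would then be
searched by hand using the ChangeLog differences.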