This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



suggestions for GCC 3.2 release criteria


While waiting for lots of builds and short benchmarks I've been thinking
about the GCC 3.1 release criteria and how they might be improved upon
for 3.2.  My suggestions are presented in the form of a patch for the
initial 3.2 release criteria document, but the following is meant to be
a set of suggestions that can be evaluated separately; I merely found it
easier to put them together this way.  Mark and Gerald are, of course,
free to do whatever they want with this.

Janis

--- /dev/null	Tue May 23 09:27:54 2000
+++ criteria_3.2.html	Fri May 17 16:03:30 2002
@@ -0,0 +1,416 @@
+<html>
+
+<head>
+<title>GCC 3.2 Release Criteria</title>
+</head>
+
+<body>
+
+<h1>GCC 3.2 Release Criteria</h1>
+
+<p>This page enumerates the GCC 3.2 release criteria and provides a
+rationale for some of the choices that were made in their determination.
+These criteria will be the basis of the release manager's decision of
+whether GCC 3.2 is ready to be released.</p>
+
+<p>These criteria are intended to represent the minimum functionality
+and level of quality that are required in order to make the release.
+Because the development of GCC is largely dependent upon volunteers,
+however, the release manager might eventually need to decide whether to
+make a release even if the criteria listed here are not met.  The
+release manager might also decide to delay a release for unforeseen
+reasons that are not specifically covered by the release criteria.</p>
+
+<h2>Bug Fixes</h2>
+
+<p>In our mainline development branch we strive to fix all open
+problem reports (PRs) in our bug tracking system.  On a release branch
+our focus is on fixing any regressions from the previous release so that
+each release is better than the one before.</p>
+
+<p>PRs that represent regressions with respect to a previous release
+are marked as high-priority items, and every GCC maintainer
+with CVS and GNATS write access may set PRs indicating a regression to
+high-priority.</p>
+
+
+<h2>Platform Support</h2>
+
+<p>GCC is available on a vast number of platforms.  However, it is not
+possible to effectively test GCC in all possible configurations.
+Therefore, a small number of platforms have been selected as primary
+targets for determining whether GCC 3.2 is ready for release.  The
+targets chosen represent both the most popular operating systems and
+the most popular processors for GCC use.</p>
+
+<p><em>(Is the list of primary targets the same as the list mentioned
+in <code>develop.html</code> for targets "which the Steering Committee
+considers to be important"?
+For the secondary targets we talk about how the work must be
+done by volunteers; how is that different for the primary targets?
+What happens if there is a primary target with no designated tester
+and bugfixer?)</em></p>
+
+<table align="center" border="1">
+<caption>Primary Evaluation Platforms</caption>
+<tr><th>Chip</th><th>OS</th><th>Triplet</th>
+    <th>Tester</th><th>Bugfixer</th>
+</tr>
+<tr><td>&nbsp;</td><td>&nbsp;</td><td>&nbsp;</td>
+    <td>&nbsp;</td><td>&nbsp;</td>
+</tr>
+</table>
+
+<p>We have also identified a number of secondary evaluation targets.
+GCC's performance on the secondary targets will not be required to
+meet all the criteria that the primary targets must meet in order
+for GCC 3.2 to ship, but these systems will be of considerable
+interest and it is likely that serious problems on one of these
+targets will delay the release, particularly if there is a volunteer
+to fix bugs on that target.</p>
+
+<p>Among the secondary evaluation platforms, we are especially
+concerned about free systems (i.e., GNU/Linux and the BSDs) where GCC
+also serves as the system compiler.</p>
+
+<table align="center" border="1">
+<caption>Secondary Evaluation Platforms</caption>
+<tr><th>Chip</th><th>OS</th><th>Triplet</th>
+    <th>Tester</th><th>Bugfixer</th>
+</tr>
+<tr><td>&nbsp;</td><td>&nbsp;</td><td>&nbsp;</td>
+    <td>&nbsp;</td><td>&nbsp;</td>
+</tr>
+</table>
+
+<p>Volunteers will be required, both to test and to fix bugs, for all
+primary and secondary platforms.  These volunteers may be the same person,
+but volunteers should be careful not to sign up for more work than they
+can actually do.  If volunteers cannot be found for a secondary platform
+then it will be dropped from this list.  <em>(Can people add a platform
+to this list by volunteering to be a tester for it?  If not, how is
+this list determined?)</em></p>
+
+<p>The bug-fixing volunteer for a primary target will commit to
+<em>(what?)</em></p>
+
+<p>The bug-fixing volunteer for a secondary target will commit to
+ensuring that GCC 3.2 will at least bootstrap itself on that target.
+That commitment doesn't necessarily mean fixing bugs personally; for
+example, if you are a manager for a company with GCC expertise you
+could be the volunteer if you'll commit to donating your employees'
+efforts as necessary.  The release manager, and the GCC development
+team, will make reasonable efforts to assist these volunteers by
+answering questions and reviewing patches as time permits.</p>
+
+<p>The lead tester for a target will commit to the following:</p>
+<ul>
+<li>Perform or oversee regular testing on the target, including
+bootstrap, regression testing, required integration testing, and
+required performance testing.  The work can be shared by multiple
+people, but one of them will be designated the lead tester and will
+coordinate the work.
+Ideally, &quot;regular&quot; means at least weekly starting from when
+the branch is made until the release, plus official prereleases.</li>
+<li>Report new regression test failures.</li>
+<li>File problem reports for bugs uncovered by integration testing
+and determine whether they are regressions from previous releases.</li>
+<li>Test patches for fixes to bugs they have reported.</li>
+</ul>
+
+<p>There is currently no process in place to keep track of the testing
+status on each platform.  If a volunteer steps forward to set up and
+maintain such a process then each lead tester will be expected to
+provide current status as requested by the release manager.</p>
+
+<h2>Language Support</h2>
+
+<p>There are GCC front-ends for many different languages.  However, in
+this release, only the behavior of front-ends for the following
+languages will be considered part of the release criteria:</p>
+<ul>
+<li>C</li>
+<li>C++</li>
+<li>Java</li>
+<li>Fortran</li>
+</ul>
+
+<p>The following languages will be supported by the release, but their
+behavior will not be a primary consideration in determining whether or
+not to ship a particular release candidate:</p>
+
+<ul>
+<li>Ada</li>
+<li>Objective-C</li>
+</ul>
+
+<p>In particular, no application testing, code quality, or compile-time
+performance testing will be required for these languages.  However,
+the regression testing criteria documented below will apply to these
+languages.</p>
+
+<h2>Regression Tests</h2>
+
+<p>The GCC testsuite contains extensive C and C++ regression tests, as
+well as some Fortran, Java, and Objective-C tests.  GCC 3.2 will not fail
+any of these tests that previous GCC releases passed on any of the
+supported platforms.  In particular, the current regression testsuite
+will be run using GCC 3.1.x, GCC 3.0.4, and GCC 2.95.3 on each of the
+supported
+platforms; those results can then be compared with the output from a
+release candidate.
+Because there have often been issues with generating PIC code, we will
+test with <code>-fPIC</code> as well.
+<em>(Encourage using Mauve tests for Java.)</em></p>
+
+<p>In addition, on all supported platforms, there will be no
+<code>--enable-checking</code> failures when running any of the
+regression test-suites.</p>
+
+<h2>Additional Tests</h2>
+
+<p>Compliance with the following criteria is required for all primary
+evaluation targets for the GCC 3.2 release.  Testers for secondary
+evaluation targets are also encouraged to perform these tests and
+report problems.</p>
+
+<h3>Applications</h3>
+
+<p>It is important that the compiler is verified on real-world
+applications.  The following applications represent a mix of low-level
+and high-level code, of numerical and logical programs, and of
+different programming languages.</p>
+
+<p>The required integration tests must be run on each of the primary
+and secondary targets.</p>
+
+<table align="center" border="1">
+<caption>Required Integration Tests</caption>
+<tr><th>Name</th>
+    <th>Language</th>
+    <th>Version</th>
+    <th>Source URL</th>
+    <th>Build and test guide</th>
+</tr>
+<tr><td><a href="http://www.gnu.org/software/emacs/">GNU Emacs</a></td>
+    <td>C</td>
+    <td>20.6</td>
+    <td>&nbsp;</td>
+    <td>&nbsp;</td>
+</tr>
+<tr><td><a href="http://www.netlib.org/lapack/index.html">LAPACK</a></td>
+    <td>Fortran</td>
+    <td>3.0</td>
+    <td><a href="http://www.netlib.org/lapack/lapack.tgz">LAPACK (testing programs)</a></td>
+    <td><a href="testing-lapack.html">build and test guide</a></td>
+</tr>
+<tr><td><a href="http://www.kernel.org">Linux kernel</a></td>
+    <td>C</td>
+    <td>2.4.18</td>
+    <td><a
+    href="ftp://ftp.kernel.org/pub/linux/kernel/v2.4/linux-2.4.18.tar.bz2">
+    linux-2.4.18.tar.bz2</a></td>
+    <td>&nbsp;</td>
+</tr>
+<tr><td><a href="http://www.acl.lanl.gov/pooma/">POOMA</a></td>
+    <td>C++</td>
+    <td>2.3.0</td>
+    <td><a href="ftp://gcc.gnu.org/pub/gcc/infrastructure/pooma-gcc.tar.gz">pooma-gcc.tar.gz</a></td>
+    <td><a href="testing-pooma.html">build and test guide</a></td>
+</tr>
+</table>
+
+<p>These selections were made for a variety of reasons.  The Linux kernel is
+one of the most important pieces of free software, and kernel developers pay
+careful attention to GCC performance.  It would be an embarrassment if GCC
+did not compile the kernel correctly, out of the box, or if a newly-built
+kernel did not boot or run successfully.  The Linux kernel
+taxes many of the low-level aspects of GCC, as well as many GCC extensions,
+including the extended assembly syntax, addresses of labels, and so forth.
+(Historically, there have been kernel bugs, found only by more aggressive
+optimization in new releases of GCC.  If such bugs are encountered, then
+appropriate patches should be applied to the kernel before testing.)</p>
+
+<p><em>(Encourage building and booting BSD kernels for those targets.)</em></p>
+
+<p>GNU Emacs is portable to almost every system available, and is a complex
+application-level C program, known to have very few bugs.</p>
+
+<p>POOMA is a complex expression-template library that will tax the ability
+of G++ to deal with templates, an area that has historically been buggy.
+In addition, templates have historically taken an inordinate amount of time
+and memory at compile time.  With the widespread prevalence of templates in
+C++ programs, including the standard library, testing this area heavily is
+vitally important. Pooma-gcc is pooma-2.3.0 plus some scripts which
+simplify testing.</p>
+
+<p>LAPACK is a well known linear algebra package that contains code
+typical for large scale Fortran programs.  The package includes a
+test suite.</p>
+
+<p>The optional integration tests are available to provide additional
+common testing.  We encourage GCC users to provide their applications
+for supplemental integration testing by making available a downloadable
+tarball and submitting build and test instructions that we can link from
+this page.  Those instructions should include information about expected
+test results and how to build the package with optimizations that are
+different from what is normally used for that package.</p>
+
+<table align="center" border="1">
+<caption>Optional Integration Tests</caption>
+<tr><th>Name</th>
+    <th>Language</th>
+    <th>Version</th>
+    <th>Source URL</th>
+    <th>Build and test guide</th>
+</tr>
+<tr><td><a href="http://www.cs.wustl.edu/~schmidt/ACE.html">ACE</a></td>
+    <td>C++</td>
+    <td>5.2</td>
+    <td><a href="http://deuce.doc.wustl.edu/Download.html">
+        ACE (download)</a></td>
+    <td>&nbsp;</td>
+</tr>
+<tr><td><a href="http://www.oonumerics.org/blitz/">Blitz</a></td>
+    <td>C++</td>
+    <td>20001213</td>
+    <td><a href="http://www.oonumerics.org/blitz/download/snapshots/blitz-20001213.tar.gz">blitz-20001213.tar.gz</a></td>
+    <td><a href="testing-blitz.html">build and test guide</a></td>
+</tr>
+<tr><td><a href="http://www.boost.org/">Boost</a></td>
+    <td>C++</td>
+    <td>1.22.0</td> <!-- the download link they give here isn't versioned... -->
+    <td><a href="http://www.boost.org/boost_all.tar.gz">boost_all.tar.gz</a></td>
+    <td><a href="testing-boost.html">build and test guide</a></td>
+</tr>
+<tr><td><a href="http://superbeast.ucsd.edu/~landry/FTensor/">FTensor</a></td>
+    <td>C++</td>
+    <td>1.1 patch 16</td>
+    <td><a href="http://www.oonumerics.org/FTensor/FTensor_gcc_integration_test.tar.gz">
+         FTensor_gcc_integration_test.tar.gz</a></td>
+    <td><a href="testing-ftensor.html">build and test guide</a></td>
+</tr>
+<tr><td><a href="http://www.osl.iu.edu/research/mtl/">MTL</a></td>
+    <td>C++</td>
+    <td>2.12.2.-20</td>
+    <td><a href="http://www.osl.iu.edu/research/mtl/download.php3">
+        MTL (Download)</a>, with <a href="http://www.osl.iu.edu/MailArchives/mtl-devel/msg00311.php">patch</a></td>
+    <td>&nbsp;</td>
+</tr>
+<tr><td><a href="http://www.trolltech.com/products/qt/index.html">Qt</a></td>
+    <td>C++</td>
+    <td>2.3.0</td>
+    <td><a href="ftp://ftp.trolltech.com/qt/source/qt-x11-2.3.0.tar.gz">qt-x11-2.3.0.tar.gz</a></td>
+    <td><a href="testing-qt.html">build and test guide</a></td>
+</tr>
+<tr><td><a href="http://root.cern.ch/">root</a></td>
+    <td>C++</td>
+    <td>3.01.00</td>
+    <td><a href="http://root.cern.ch/root/Version301.html">
+        root-3.01</a></td>
+    <td>&nbsp;</td>
+</tr>
+</table>
+
+<h3>Code Quality</h3>
+
+<p>Historically, there has been no formal release criterion that took into
+account performance of code generated by the compiler.  It is important that
+the generated code performs approximately as well as previous releases.
+Therefore, we will use the following benchmarks for measuring code
+quality:</p>
+
+<table align="center" border="1">
+<caption>Performance Tests</caption>
+<tr><th align="left">Name</th>
+    <th align="left">Language</th>
+    <th align="left">Source URL</th>
+</tr>
+<tr><td>gzip 1.2.4a</td><td>C</td><td>&nbsp;</td>
+</tr>
+<tr><td>Stepanov</td><td>C++</td>
+    <td><a href="ftp://ftp.kai.com/pub/benchmarks/stepanov_v1p2.C">
+         stepanov_v1p2.C</a></td>
+</tr>
+<tr><td>LAPACK</td><td>Fortran</td>
+    <td><a href="http://www.netlib.org/lapack/lapack.tgz">
+        LAPACK 3.0 (timing programs)</a></td>
+</tr>
+<tr><td>ecm4c</td><td>C</td>
+    <td><a href="ftp://ftp.loria.fr/pub/loria/eureca/tmp/GMP-ECM/ecm4c.c">
+        ecm4c.c</a></td>
+</tr>
+</table>
+
+<p><em>(It's not reasonable to require comparison of LAPACK timing programs
+without providing information about how to perform that comparison, which
+isn't easy to find in the LAPACK documentation.)</em></p>
+
+<p><em>(What input should be used for timing gzip performance?  The
+performance depends on the type of file being compressed.)</em></p>
+
+<p>A Java benchmark is not required for this release since there is little
+precedent for the behavior of the Java compiler.  For Java, functional
+completeness and correctness are still more important than optimization.</p>
+
+<p>In addition to the above benchmarks, the behavior of real
+programs should be considered as well.  For that reason, the
+behavior of the elliptic curve integer factorization program ecm4c,
+which uses GNU mp, will be considered part of the release criteria.
+<em>(If we're asking people to run ecm4c then we should also tell them
+how to download and build GMP.)</em></p>
+
+<p>The performance of the generated code of a release candidate should be
+at least as good as that of past releases of GCC since 2.95.3 on the
+benchmarks, and within 5% on the application tests.</p>
+
+<h3>Compile-Time Performance</h3>
+
+<p>There is a perception that current versions of GCC take longer to compile
+programs than their 2.95.3 counterparts, and that they often use more memory
+as well.  Compile-time performance is an important part of compiler quality.
+It is not enough simply to provide additional optimizations; the compiler
+must also continue to compile programs relatively quickly.  However, it
+is to be expected that additional optimizations and additional features
+will have a non-zero cost.</p>
+
+<p>In order to measure compile-time performance, we will use the
+following unit tests:</p>
+<table align="center" border="1">
+<tr><th align="left">Name</th>
+    <th align="left">Language</th>
+    <th align="left">Source</th>
+    <th align="left">Flags</th>
+    <th align="left">Comments</th>
+</tr>
+<tr><td>insn-attrtab.c</td>
+    <td>C</td>
+    <td>&nbsp;</td>
+    <td>-O2</td>
+    <td>This file contains a large machine-generated switch
+        statement; it is a reasonable benchmark for testing flow
+        optimizations and for handling large functions.</td>
+</tr>
+<tr><td>&nbsp;</td>
+    <td>C++</td>
+    <td>&nbsp;</td>
+    <td>&nbsp;</td>
+    <td>&nbsp;</td>
+</tr>
+<tr><td>&nbsp;</td>
+    <td>Fortran</td>
+    <td>&nbsp;</td>
+    <td>&nbsp;</td>
+    <td>&nbsp;</td>
+</tr>
+</table>
+
+<p>In addition to these unit tests, the time and peak memory used
+when building the entire GNU Emacs distribution should be measured.</p>
+
+<p>A release candidate's compile time should not exceed that of GCC 2.95.3
+by more than 15%, and its peak memory usage should not exceed that of
+GCC 2.95.3 by more than 25%.</p>
+
+</body>
+</html>
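As a rough illustration of how two of the criteria in the patch could be checked mechanically, here is a minimal Python sketch. It is not part of the proposed criteria document. The function names, the percent-based interface, and the "PASS"/"FAIL" status strings are all assumptions made for the example; only the 15% compile-time and 25% peak-memory limits come from the text above.

```python
# Hypothetical helpers for illustration only; names and data layout are assumed.

COMPILE_TIME_LIMIT_PERCENT = 15  # compile time may exceed GCC 2.95.3 by at most 15%
PEAK_MEMORY_LIMIT_PERCENT = 25   # peak memory may exceed GCC 2.95.3 by at most 25%


def exceeds_limit(baseline, candidate, limit_percent):
    """Return True if candidate is more than limit_percent above baseline.

    The comparison is written without division so that integer inputs
    (for example seconds or kilobytes) are compared exactly.
    """
    return candidate * 100 > baseline * (100 + limit_percent)


def new_regressions(baseline_results, candidate_results):
    """Return the sorted names of tests that passed before but do not pass now.

    Each argument maps a test name to a status string such as "PASS" or
    "FAIL".  A test missing from the candidate results counts as not passing.
    """
    return sorted(
        name
        for name, status in baseline_results.items()
        if status == "PASS" and candidate_results.get(name) != "PASS"
    )
```

A tester could feed in the measurements for a release candidate and for GCC 2.95.3, for example `exceeds_limit(old_seconds, new_seconds, COMPILE_TIME_LIMIT_PERCENT)`, and treat a non-empty list from `new_regressions` as a release blocker under the regression-test criterion.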

