This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Ping: RFA: speeding up dg-extract-results.sh


Ping for this old patch.  I didn't ping it earlier because I could imagine
it wasn't a good idea at that stage of the 4.8 cycle.  But I've been using
it locally since then, and it really does make a big difference when
testing lots of multilibs.

Thanks,
Richard

Richard Sandiford <rdsandiford@googlemail.com> writes:
> dg-extract-results.sh is used to combine the various .sum.sep and .log.sep
> files produced by parallel testing into single .sum and .log files.
> It's written in a combination of shell scripts and awk, so stays well
> within the minimum system requirements.
>
> However, its running time seems to be quadratic in the number of test
> variations, since the sizes of the .sum and .log files are linear in
> that number and the script parses them all once per variation.  This
> means that when I'm doing the
> mipsisa64-sde-elf testing:
>
>     http://gcc.gnu.org/ml/gcc-testresults/2014-02/msg00025.html
>
> the script takes just over 5 hours to produce the gcc.log file.
>
> This patch tries to reduce that by providing an alternative single-script
> version.  I was torn between Python and Tcl, but given how most people
> tend to react to Tcl, I thought I'd better go for Python.  I wouldn't
> mind rewriting it in Tcl if that seems better though, not least because
> expect is already a prerequisite.
>
> Python isn't yet required and I'm pretty sure this script needs 2.6
> or later.  I'm also worried that the seek/tell stuff might not work on
> Windows.  The patch therefore gets dg-extract-results.sh to check the
> environment first and call into the Python version if possible;
> otherwise it falls back on the current approach.  This also means
> that the patch is contained entirely within contrib/.  If the Python
> code does indeed not work on Windows then we should either fix it
> (obviously preferred) or get dg-extract-results.sh to skip it on
> Windows for now.
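
[Editorial note: the seek/tell pattern at issue can be sketched in isolation.
This is a standalone illustration using io.StringIO, not code from the patch;
the worry above is about offsets in text-mode files on Windows, which
StringIO does not exercise.  The parser records an offset with tell() before
reading a line, and rewinds with seek() when the line belongs to the next
section:]

```python
import io

# Standalone sketch of the tell()/seek() "push back" pattern: remember the
# offset before reading a line, and rewind if the line belongs elsewhere.
f = io.StringIO('# of expected passes\t\t4\nRunning foo.exp ...\n')

summary = []
while True:
    before = f.tell()
    line = f.readline()
    if not line.startswith('# '):
        f.seek(before)      # push the unconsumed line back
        break
    summary.append(line)

nxt = f.readline()
print(summary)  # only the '# of' summary line was consumed
print(nxt)      # the 'Running' line is still available
```

[On a real file opened in text mode the same offsets come from file.tell(),
which is the part that may behave differently on Windows.]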
>
> The new version processes the mipsisa64-sde-elf gcc.log in just over a minute.
> It's also noticeably faster for more normal runs, e.g. for my 4-variant
> mips64-linux-gnu testing the time taken to process gcc.log goes from 114s
> to 11s.  But that's probably in the noise given how long testing takes anyway.
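
[Editorial note: the complexity claim can be made concrete with a
back-of-the-envelope model; the numbers below are illustrative, not
measurements from the patch.  With V variations each contributing L lines of
output, rescanning everything once per variation touches V^2 * L lines,
while a single pass touches only V * L:]

```python
# Toy cost model for the two approaches (illustrative only).
def rescan_visits(num_variations, lines_per_variation):
    # .sh-style: every variation re-parses the whole combined output.
    total_lines = num_variations * lines_per_variation
    return num_variations * total_lines

def single_pass_visits(num_variations, lines_per_variation):
    # .py-style: each input line is read exactly once.
    return num_variations * lines_per_variation

# The ratio between the two grows linearly with the variation count.
print(rescan_visits(64, 10000))       # 40960000
print(single_pass_visits(64, 10000))  # 640000
```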
>
> For completeness: although the basic approach is heavily based on the
> original script, there are some minor differences in output:
>
> - the 'Host is ' line is now copied over.
>
> - not all sorts in the .sh version were protected by LC_ALL=C, so the
>   order of .exp files in the .sum could depend on locale.  The new version
>   always follows the LC_ALL=C ordering (since that's what Python uses
>   unless the script forces it not to).
>
> - when the run for a particular .exp is split over several .log.seps,
>   the separate logs are now reassembled in the same order as the .sum
>   output, based on the first test in each .log fragment.  I've left this
>   under the control of an internal variable for easier comparison though.
>
> - the new version tries to keep the earliest start message and latest
>   end message (based on the time in the message).  I thought this would
>   give a better idea how long the full run took.
>
> - the .log output now contains the tool version information at the end
>   (as both versions do for .sum).
>
> - the .log output only contains one set of 'Using foo.exp as the blah.'
>   messages per run.  The .sh version drops most of the others but not all.
>
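
[Editorial note: the LC_ALL=C point can be seen in isolation; this is a
standalone illustration with made-up .exp names, not code from the patch.
Python compares strings by code point, which for ASCII filenames matches the
byte ordering of `LC_ALL=C sort`, so every uppercase name sorts before every
lowercase one, whereas a locale-aware sort might interleave them:]

```python
# Code-point ordering: all uppercase ASCII letters sort before lowercase,
# matching the behaviour of `LC_ALL=C sort` on these names.
exp_files = ['vect.exp', 'Wall.exp', 'abi.exp', 'Zero.exp']
print(sorted(exp_files))  # ['Wall.exp', 'Zero.exp', 'abi.exp', 'vect.exp']
```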
> I checked that the outputs were otherwise identical for a set of
> mips64-linux-gnu, mipsisa64-sde-elf and x86_64-linux-gnu runs.  I also
> reran the acats tests with some nobbled testcases in order to test the
> failure paths there.
>
> Also bootstrapped & regression-tested on x86_64-linux-gnu.  OK to install?
>
> Thanks,
> Richard

contrib/
	* dg-extract-results.py: New file.
	* dg-extract-results.sh: Use it if the environment seems suitable.

Index: contrib/dg-extract-results.py
===================================================================
--- /dev/null	2014-02-10 23:36:59.384652914 +0000
+++ contrib/dg-extract-results.py	2014-02-13 10:07:24.845979555 +0000
@@ -0,0 +1,577 @@
+#!/usr/bin/python
+#
+# Copyright (C) 2014 Free Software Foundation, Inc.
+#
+# This script is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+
+import sys
+import getopt
+import re
+from datetime import datetime
+
+# True if unrecognised lines should cause a fatal error.  Might want to turn
+# this on by default later.
+strict = False
+
+# True if the order of .log segments should match the .sum file, false if
+# they should keep the original order.
+sort_logs = True
+
+class Named:
+    def __init__ (self, name):
+        self.name = name
+
+    def __cmp__ (self, other):
+        return cmp (self.name, other.name)
+
+class ToolRun (Named):
+    def __init__ (self, name):
+        Named.__init__ (self, name)
+        # The variations run for this tool, mapped by --target_board name.
+        self.variations = dict()
+
+    # Return the VariationRun for variation NAME.
+    def get_variation (self, name):
+        if name not in self.variations:
+            self.variations[name] = VariationRun (name)
+        return self.variations[name]
+
+class VariationRun (Named):
+    def __init__ (self, name):
+        Named.__init__ (self, name)
+        # A segment of text before the harness runs start, describing which
+        # baseboard files were loaded for the target.
+        self.header = None
+        # The harnesses run for this variation, mapped by filename.
+        self.harnesses = dict()
+        # A list giving the number of times each type of result has
+        # been seen.
+        self.counts = []
+
+    # Return the HarnessRun for harness NAME.
+    def get_harness (self, name):
+        if name not in self.harnesses:
+            self.harnesses[name] = HarnessRun (name)
+        return self.harnesses[name]
+
+class HarnessRun (Named):
+    def __init__ (self, name):
+        Named.__init__ (self, name)
+        # Segments of text that make up the harness run, mapped by a test-based
+        # key that can be used to order them.
+        self.segments = dict()
+        # Segments of text that make up the harness run but which have
+        # no recognized test results.  These are typically harnesses that
+        # are completely skipped for the target.
+        self.empty = []
+        # A list of results.  Each entry is a pair in which the first element
+        # is a unique sorting key and in which the second is the full
+        # PASS/FAIL line.
+        self.results = []
+
+    # Add a segment of text to the harness run.  If the segment includes
+    # test results, KEY is an example of one of them, and can be used to
+    # combine the individual segments in order.  If the segment has no
+    # test results (e.g. because the harness doesn't do anything for the
+    # current configuration) then KEY is None instead.  In that case
+    # just collect the segments in the order that we see them.
+    def add_segment (self, key, segment):
+        if key:
+            assert key not in self.segments
+            self.segments[key] = segment
+        else:
+            self.empty.append (segment)
+
+class Segment:
+    def __init__ (self, filename, start):
+        self.filename = filename
+        self.start = start
+        self.lines = 0
+
+class Prog:
+    def __init__ (self):
+        # The variations specified on the command line.
+        self.variations = []
+        # The variations seen in the input files.
+        self.known_variations = set()
+        # The tools specified on the command line.
+        self.tools = []
+        # Whether to create .sum rather than .log output.
+        self.do_sum = True
+        # Regexps used while parsing.
+        self.test_run_re = re.compile (r'^Test Run By (\S+) on (.*)$')
+        self.tool_re = re.compile (r'^\t\t=== (.*) tests ===$')
+        self.result_re = re.compile (r'^(PASS|XPASS|FAIL|XFAIL|UNRESOLVED'
+                                     r'|WARNING|ERROR|UNSUPPORTED|UNTESTED'
+                                     r'|KFAIL):\s*(\S+)')
+        self.completed_re = re.compile (r'.* completed at (.*)')
+        # Pieces of text to write at the head of the output.
+        # start_line is a pair in which the first element is a datetime
+        # and in which the second is the associated 'Test Run By' line.
+        self.start_line = None
+        self.native_line = ''
+        self.target_line = ''
+        self.host_line = ''
+        self.acats_premable = ''
+        # Pieces of text to write at the end of the output.
+        # end_line is like start_line but for the 'runtest completed' line.
+        self.acats_failures = []
+        self.version_output = ''
+        self.end_line = None
+        # Known summary types.
+        self.count_names = [
+            '# of expected passes\t\t',
+            '# of unexpected failures\t',
+            '# of unexpected successes\t',
+            '# of expected failures\t\t',
+            '# of unknown successes\t\t',
+            '# of known failures\t\t',
+            '# of untested testcases\t\t',
+            '# of unresolved testcases\t',
+            '# of unsupported tests\t\t'
+        ]
+        self.runs = dict()
+
+    def usage (self):
+        name = sys.argv[0]
+        sys.stderr.write ('Usage: ' + name
+                          + ''' [-t tool] [-l variant-list] [-L] log-or-sum-file ...
+
+    tool           The tool (e.g. g++, libffi) for which to create a
+                   new test summary file.  If not specified then output
+                   is created for all tools.
+    variant-list   One or more test variant names.  If the list is
+                   not specified then one is constructed from all
+                   variants in the files for <tool>.
+    sum-file       A test summary file with the format of those
+                   created by runtest from DejaGnu.
+    If -L is used, merge *.log files instead of *.sum.  In this
+    mode the exact order of lines may not be preserved; only the
+    different 'Running *.exp' chunks are guaranteed to stay in order.
+''')
+        sys.exit (1)
+
+    def fatal (self, what, string):
+        if not what:
+            what = sys.argv[0]
+        sys.stderr.write (what + ': ' + string + '\n')
+        sys.exit (1)
+
+    # Parse the command-line arguments.
+    def parse_cmdline (self):
+        try:
+            (options, self.files) = getopt.getopt (sys.argv[1:], 'l:t:L')
+            if len (self.files) == 0:
+                self.usage()
+            for (option, value) in options:
+                if option == '-l':
+                    self.variations.append (value)
+                elif option == '-t':
+                    self.tools.append (value)
+                else:
+                    self.do_sum = False
+        except getopt.GetoptError as e:
+            self.fatal (None, e.msg)
+
+    # Try to parse time string TIME, returning an arbitrary time on failure.
+    # Getting this right is just a nice-to-have so failures should be silent.
+    def parse_time (self, time):
+        try:
+            return datetime.strptime (time, '%c')
+        except ValueError:
+            return datetime.now()
+
+    # Parse an integer and abort on failure.
+    def parse_int (self, filename, value):
+        try:
+            return int (value)
+        except ValueError:
+            self.fatal (filename, 'expected an integer, got: ' + value)
+
+    # Return a list that represents no test results.
+    def zero_counts (self):
+        return [0 for x in self.count_names]
+
+    # Return the ToolRun for tool NAME.
+    def get_tool (self, name):
+        if name not in self.runs:
+            self.runs[name] = ToolRun (name)
+        return self.runs[name]
+
+    # Add the result counts in list FROMC to TOC.
+    def accumulate_counts (self, toc, fromc):
+        for i in range (len (self.count_names)):
+            toc[i] += fromc[i]
+
+    # Parse the list of variations after 'Schedule of variations:'.
+    # Return the number seen.
+    def parse_variations (self, filename, file):
+        num_variations = 0
+        while True:
+            line = file.readline()
+            if line == '':
+                self.fatal (filename, 'could not parse variation list')
+            if line == '\n':
+                break
+            self.known_variations.add (line.strip())
+            num_variations += 1
+        return num_variations
+
+    # Parse from the first line after 'Running target ...' to the end
+    # of the run's summary.
+    def parse_run (self, filename, file, tool, variation, num_variations):
+        header = None
+        harness = None
+        segment = None
+        final_using = 0
+
+        # If this is the first run for this variation, add any text before
+        # the first harness to the header.
+        if not variation.header:
+            segment = Segment (filename, file.tell())
+            variation.header = segment
+
+        # Parse up until the first line of the summary.
+        if num_variations == 1:
+            end = '\t\t=== ' + tool.name + ' Summary ===\n'
+        else:
+            end = ('\t\t=== ' + tool.name + ' Summary for '
+                   + variation.name + ' ===\n')
+        while True:
+            line = file.readline()
+            if line == '':
+                self.fatal (filename, 'no recognised summary line')
+            if line == end:
+                break
+
+            # Look for the start of a new harness.
+            if line.startswith ('Running ') and line.endswith (' ...\n'):
+                # Close off the current harness segment, if any.
+                if harness:
+                    segment.lines -= final_using
+                    harness.add_segment (first_key, segment)
+                name = line[len ('Running '):-len(' ...\n')]
+                harness = variation.get_harness (name)
+                segment = Segment (filename, file.tell())
+                first_key = None
+                final_using = 0
+                continue
+
+            # Record test results.  Associate the first test result with
+            # the harness segment, so that if a run for a particular harness
+            # has been split up, we can reassemble the individual segments
+            # in a sensible order.
+            match = self.result_re.match (line)
+            if match:
+                if not harness:
+                    self.fatal (filename, 'saw test result before harness name')
+                name = match.group (2)
+                # Ugly hack to get the right order for gfortran.
+                if name.startswith ('gfortran.dg/g77/'):
+                    name = 'h' + name
+                key = (name, len (harness.results))
+                harness.results.append ((key, line))
+                if not first_key and sort_logs:
+                    first_key = key
+
+            # 'Using ...' lines are only interesting in a header.  Splitting
+            # the test up into parallel runs leads to more 'Using ...' lines
+            # than there would be in a single log.
+            if line.startswith ('Using '):
+                final_using += 1
+            else:
+                final_using = 0
+
+            # Add other text to the current segment, if any.
+            if segment:
+                segment.lines += 1
+
+        # Close off the final harness segment, if any.
+        if harness:
+            segment.lines -= final_using
+            harness.add_segment (first_key, segment)
+
+        # Parse the rest of the summary (the '# of ' lines).
+        if len (variation.counts) == 0:
+            variation.counts = self.zero_counts()
+        while True:
+            before = file.tell()
+            line = file.readline()
+            if line == '':
+                break
+            if line == '\n':
+                continue
+            if not line.startswith ('# '):
+                file.seek (before)
+                break
+            found = False
+            for i in range (len (self.count_names)):
+                if line.startswith (self.count_names[i]):
+                    count = line[len (self.count_names[i]):-1].strip()
+                    variation.counts[i] += self.parse_int (filename, count)
+                    found = True
+                    break
+            if not found:
+                self.fatal (filename, 'unknown test result: ' + line[:-1])
+
+    # Parse an acats run, which uses a different format from dejagnu.
+    # We have just skipped over '=== acats configuration ==='.
+    def parse_acats_run (self, filename, file):
+        # Parse the preamble, which describes the configuration and logs
+        # the creation of support files.
+        record = (self.acats_premable == '')
+        if record:
+            self.acats_premable = '\t\t=== acats configuration ===\n'
+        while True:
+            line = file.readline()
+            if line == '':
+                self.fatal (filename, 'could not parse acats preamble')
+            if line == '\t\t=== acats tests ===\n':
+                break
+            if record:
+                self.acats_premable += line
+
+        # Parse the test results themselves, using a dummy variation name.
+        tool = self.get_tool ('acats')
+        variation = tool.get_variation ('none')
+        self.parse_run (filename, file, tool, variation, 1)
+
+        # Parse the failure list.
+        while True:
+            before = file.tell()
+            line = file.readline()
+            if line.startswith ('*** FAILURES: '):
+                self.acats_failures.append (line[len ('*** FAILURES: '):-1])
+                continue
+            file.seek (before)
+            break
+
+    # Parse the final summary at the end of a log in order to capture
+    # the version output that follows it.
+    def parse_final_summary (self, filename, file):
+        record = (self.version_output == '')
+        while True:
+            line = file.readline()
+            if line == '':
+                break
+            if line.startswith ('# of '):
+                continue
+            if record:
+                self.version_output += line
+            if line == '\n':
+                break
+
+    # Parse a .log or .sum file.
+    def parse_file (self, filename, file):
+        tool = None
+        target = None
+        num_variations = 1
+        while True:
+            line = file.readline()
+            if line == '':
+                return
+
+            # Parse the list of variations, which comes before the test
+            # runs themselves.
+            if line.startswith ('Schedule of variations:'):
+                num_variations = self.parse_variations (filename, file)
+                continue
+
+            # Parse a testsuite run for one tool/variation combination.
+            if line.startswith ('Running target '):
+                name = line[len ('Running target '):-1]
+                if not tool:
+                    self.fatal (filename, 'could not parse tool name')
+                if name not in self.known_variations:
+                    self.fatal (filename, 'unknown target: ' + name)
+                self.parse_run (filename, file, tool,
+                                tool.get_variation (name),
+                                num_variations)
+                # If there is only one variation then there is no separate
+                # summary for it.  Record any following version output.
+                if num_variations == 1:
+                    self.parse_final_summary (filename, file)
+                continue
+
+            # Parse the start line.  In the case where several files are being
+            # parsed, pick the one with the earliest time.
+            match = self.test_run_re.match (line)
+            if match:
+                time = self.parse_time (match.group (2))
+                if not self.start_line or self.start_line[0] > time:
+                    self.start_line = (time, line)
+                continue
+
+            # Parse the form used for native testing.
+            if line.startswith ('Native configuration is '):
+                self.native_line = line
+                continue
+
+            # Parse the target triplet.
+            if line.startswith ('Target is '):
+                self.target_line = line
+                continue
+
+            # Parse the host triplet.
+            if line.startswith ('Host   is '):
+                self.host_line = line
+                continue
+
+            # Parse the acats preamble.
+            if line == '\t\t=== acats configuration ===\n':
+                self.parse_acats_run (filename, file)
+                continue
+
+            # Parse the tool name.
+            match = self.tool_re.match (line)
+            if match:
+                tool = self.get_tool (match.group (1))
+                continue
+
+            # Skip over the final summary (which we instead create from
+            # individual runs) and parse the version output.
+            if tool and line == '\t\t=== ' + tool.name + ' Summary ===\n':
+                if file.readline() != '\n':
+                    self.fatal (filename, 'expected blank line after summary')
+                self.parse_final_summary (filename, file)
+                continue
+
+            # Parse the completion line.  In the case where several files
+            # are being parsed, pick the one with the latest time.
+            match = self.completed_re.match (line)
+            if match:
+                time = self.parse_time (match.group (1))
+                if not self.end_line or self.end_line[0] < time:
+                    self.end_line = (time, line)
+                continue
+
+            # Sanity check to make sure that important text doesn't get
+            # dropped accidentally.
+            if strict and line.strip() != '':
+                self.fatal (filename, 'unrecognised line: ' + line[:-1])
+
+    # Output a segment of text.
+    def output_segment (self, segment):
+        with open (segment.filename, 'r') as file:
+            file.seek (segment.start)
+            for i in range (segment.lines):
+                sys.stdout.write (file.readline())
+
+    # Output a summary giving the number of times each type of result has
+    # been seen.
+    def output_summary (self, tool, counts):
+        for i in range (len (self.count_names)):
+            name = self.count_names[i]
+            # dejagnu only prints result types that were seen at least once,
+            # but acats always prints a number of unexpected failures.
+            if (counts[i] > 0
+                or (tool.name == 'acats'
+                    and name.startswith ('# of unexpected failures'))):
+                sys.stdout.write ('%s%d\n' % (name, counts[i]))
+
+    # Output unified .log or .sum information for a particular variation,
+    # with a summary at the end.
+    def output_variation (self, tool, variation):
+        self.output_segment (variation.header)
+        for harness in sorted (variation.harnesses.values()):
+            sys.stdout.write ('Running ' + harness.name + ' ...\n')
+            if self.do_sum:
+                # Keep the original test result order if there was only
+                # one segment for this harness.  This is needed for
+                # unsorted.exp, which has unusual test names.  Otherwise
+                # sort the tests by test filename.  If there are several
+                # subtests for the same test filename (such as 'compilation',
+                # 'test for excess errors', etc.) then keep the subtests
+                # in the original order.
+                if len (harness.segments) > 1:
+                    harness.results.sort()
+                for (key, line) in harness.results:
+                    sys.stdout.write (line)
+            else:
+                # Rearrange the log segments into test order (but without
+                # rearranging text within those segments).
+                for key in sorted (harness.segments.keys()):
+                    self.output_segment (harness.segments[key])
+                for segment in harness.empty:
+                    self.output_segment (segment)
+        if len (self.variations) > 1:
+            sys.stdout.write ('\t\t=== ' + tool.name + ' Summary for '
+                              + variation.name + ' ===\n\n')
+            self.output_summary (tool, variation.counts)
+
+    # Output unified .log or .sum information for a particular tool,
+    # with a summary at the end.
+    def output_tool (self, tool):
+        counts = self.zero_counts()
+        if tool.name == 'acats':
+            # acats doesn't use variations, so just output everything.
+            # It also has a different approach to whitespace.
+            sys.stdout.write ('\t\t=== ' + tool.name + ' tests ===\n')
+            for variation in tool.variations.values():
+                self.output_variation (tool, variation)
+                self.accumulate_counts (counts, variation.counts)
+            sys.stdout.write ('\t\t=== ' + tool.name + ' Summary ===\n')
+        else:
+            # Output the results in the usual dejagnu runtest format.
+            sys.stdout.write ('\n\t\t=== ' + tool.name + ' tests ===\n\n'
+                              'Schedule of variations:\n')
+            for name in self.variations:
+                if name in tool.variations:
+                    sys.stdout.write ('    ' + name + '\n')
+            sys.stdout.write ('\n')
+            for name in self.variations:
+                if name in tool.variations:
+                    variation = tool.variations[name]
+                    sys.stdout.write ('Running target '
+                                      + variation.name + '\n')
+                    self.output_variation (tool, variation)
+                    self.accumulate_counts (counts, variation.counts)
+            sys.stdout.write ('\n\t\t=== ' + tool.name + ' Summary ===\n\n')
+        self.output_summary (tool, counts)
+
+    def main (self):
+        self.parse_cmdline()
+        try:
+            # Parse the input files.
+            for filename in self.files:
+                with open (filename, 'r') as file:
+                    self.parse_file (filename, file)
+
+            # Decide what to output.
+            if len (self.variations) == 0:
+                self.variations = sorted (self.known_variations)
+            else:
+                for name in self.variations:
+                    if name not in self.known_variations:
+                        self.fatal (None, 'no results for ' + name)
+            if len (self.tools) == 0:
+                self.tools = sorted (self.runs.keys())
+
+            # Output the header.
+            if self.start_line:
+                sys.stdout.write (self.start_line[1])
+            sys.stdout.write (self.native_line)
+            sys.stdout.write (self.target_line)
+            sys.stdout.write (self.host_line)
+            sys.stdout.write (self.acats_premable)
+
+            # Output the main body.
+            for name in self.tools:
+                if name not in self.runs:
+                    self.fatal (None, 'no results for ' + name)
+                self.output_tool (self.runs[name])
+
+            # Output the footer.
+            if len (self.acats_failures) > 0:
+                sys.stdout.write ('*** FAILURES: '
+                                  + ' '.join (self.acats_failures) + '\n')
+            sys.stdout.write (self.version_output)
+            if self.end_line:
+                sys.stdout.write (self.end_line[1])
+        except IOError as e:
+            self.fatal (e.filename, e.strerror)
+
+Prog().main()
Index: contrib/dg-extract-results.sh
===================================================================
--- contrib/dg-extract-results.sh	2014-02-13 09:18:48.270283298 +0000
+++ contrib/dg-extract-results.sh	2014-02-13 10:07:24.846979564 +0000
@@ -28,6 +28,15 @@
 
 PROGNAME=dg-extract-results.sh
 
+# Try to use the python version if possible, since it tends to be faster.
+PYTHON_VER=`echo "$0" | sed 's/sh$/py/'`
+if test "$PYTHON_VER" != "$0" &&
+   test -f "$PYTHON_VER" &&
+   python -c 'import sys; sys.exit (0 if sys.version_info >= (2, 6) else 1)' \
+     > /dev/null 2> /dev/null; then
+  exec python "$PYTHON_VER" "$@"
+fi
+
 usage() {
   cat <<EOF >&2
 Usage: $PROGNAME [-t tool] [-l variant-list] [-L] sum-file ...

