This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
[RFC] arc profiling files
- From: Nathan Sidwell <nathan at codesourcery dot com>
- To: gcc at gcc dot gnu dot org, jh at suse dot cz
- Date: Tue, 13 Aug 2002 13:58:46 +0100
- Subject: [RFC] arc profiling files
- Organization: Codesourcery LLC
Hi,
Jan and I have been discussing the arc profiling files. The existing
implementation has a few shortcomings which we'd like to fix.
1) There is no version information
2) Only one file has a magic ident
3) There are two compiler output files, which only make sense together
4) They are not (very) extensible (see 1).
We think some kind of tagged record structure would be better, and
to that end I attach a draft header file which describes the format.
This is not something that needs to be done before 3.3. I do not think
it necessary to maintain backwards compatibility in these formats
between gcc versions -- but we do need to detect mismatches. You'll
see that I'd like to use the gcc version number to version these
files.
comments?
nathan
--
Dr Nathan Sidwell :: http://www.codesourcery.com :: CodeSourcery LLC
'But that's a lie.' - 'Yes it is. What's your point?'
nathan@codesourcery.com : http://www.cs.bris.ac.uk/~nathan/ : nathan@acm.org
/* File format for coverage information
Copyright (C) 2002 Free Software Foundation, Inc.
Contributed by Nathan Sidwell <nathan@codesourcery.com>.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING. If not, write to the Free
Software Foundation, 59 Temple Place - Suite 330, Boston, MA
02111-1307, USA. */
/* Coverage information is held in two files: a basic block graph
file, which is generated by the compiler, and a counter file, which
is generated by the program under test. Both files use a similar
structure. We do not attempt to make these files backwards
compatible with previous versions, as you only need coverage
information when developing a program. We do hold version
information, so that mismatches can be detected, and we use a
format that allows tools to skip information they do not understand
or are not interested in.
Numbers are recorded in little endian unsigned binary form, in
either 32 or 64 bits. Strings are stored with a length count and NUL
terminator, and padded up to the next 4 byte boundary.
int32: char char char char
int64: char char char char char char char char
string: int32:length char* 0..0 // padded to 32bit boundary
item: int32 | int64 | string
The basic format of the files is
file : int32:magic int32:version record*
The magic ident is different for the bbg and the counter files. The
version is the same for both files and is derived from gcc's
version number. Although the ident and version are formally 32 bit
numbers, they are derived from 4 character ASCII strings. The
version number consists of the single character major version
number, a two character minor version number (leading zero for
versions less than 10), and a single character indicating the
status of the release. That will be 'e' for experimental, 'p' for
prerelease and 'r' for release. For gcc 3.4 experimental, it would
be '304e' (0x65343033).
A record has a tag, length and variable amount of data.
record: header data
header: int32:tag int32:length
data: item*
Records are not nested, and record ordering is significant. Tags are
unique across bbg and da files. A particular record type may have a
varying length. The LENGTH can be used to determine that.
The basic block graph file contains the following records
bbg: {announce_function basic_blocks arcs* lines*}*
announce_function: header string:name int32:checksum
basic_blocks: header int32:flags*
arcs: header {int32:dest_block int32:flags}*
lines: header {int32:line_no | int32:0 string:filename}*
The BASIC_BLOCKS record holds per-bb flags. The number of blocks can
be inferred from its data length. There is one ARCS record per
basic block. These are implicitly numbered. The number of arcs from
a bb is implicit from the data length. Each ARCS record enumerates
the destination bb and per-arc flags. There is one LINES record per
basic block; it enumerates the source lines which belong to that
basic block. Source file names are introduced by a line number of
0; following lines are from the new source file. The initial
source file for the function is NULL, but the current source file
should be remembered from one block to the next.
The data file contains the following records.
da: {{announce_function arc_counts}* summary:object summary:program}*
announce_function: header string:name int32:checksum
arc_counts: header int64*
summary: int32:arc_count int64:sum int64:max
The ANNOUNCE_FUNCTION record is the same as that in the BBG
file. The ARC_COUNTS gives the counter values for those arcs that
are instrumented. The SUMMARY records give information about the
whole object file and about the whole program. Note that the da
file might contain information from several runs concatenated, or
the data might be merged. */
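As a rough illustration of the magic and version encoding described in the comment above, the following sketch packs a 4 character ASCII string into the 32 bit value that would be written least significant byte first. The names pack_le32 and coverage_version are hypothetical and are not part of the draft header; the sketch also assumes a single digit major version and a minor version below 100.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four ASCII characters into a 32-bit value so that, when the
   value is written least significant byte first, the characters
   appear in the file in their original order.  Hypothetical helper.  */
static uint32_t
pack_le32 (const char *s)
{
  assert (s != NULL);
  return (uint32_t) (unsigned char) s[0]
         | (uint32_t) (unsigned char) s[1] << 8
         | (uint32_t) (unsigned char) s[2] << 16
         | (uint32_t) (unsigned char) s[3] << 24;
}

/* Build the version word from major, minor and status character,
   e.g. (3, 4, 'e') for "304e".  Hypothetical helper; assumes
   major < 10 and minor < 100.  */
static uint32_t
coverage_version (unsigned major, unsigned minor, char status)
{
  char text[4];

  assert (major < 10 && minor < 100);
  text[0] = (char) ('0' + major);
  text[1] = (char) ('0' + minor / 10);   /* leading zero for minor < 10 */
  text[2] = (char) ('0' + minor % 10);
  text[3] = status;
  return pack_le32 (text);
}
```

With this packing, "gcov" and "gbbg" give the GCOV_MAGIC and GBBG_MAGIC values defined in the header, and gcc 3.4 experimental gives 0x65343033.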
#ifndef GCC_COVERAGE_H
#define GCC_COVERAGE_H
#if BUILDING_GCC /* XXX need to figure this define */
typedef unsigned HOST_WIDEST_INT gcov_type;
#elif LONG_TYPE_SIZE == GCOV_TYPE_SIZE
typedef unsigned long gcov_type;
#else
typedef unsigned long long gcov_type;
#endif
/* File magic */
#define GCOV_MAGIC 0x766f6367 /* "gcov" in LE ASCII */
#define GBBG_MAGIC 0x67626267 /* "gbbg" in LE ASCII */
#define GCOV_SUFFIX ".da"
#define GBBG_SUFFIX ".bbg"
#include "coverage_version.h"
/* coverage_version.h needs to be generated at configure time from
version.c. It's a little awkward to write the shell script to do
that. A simpler mechanism would be to generate version.c (and
version.h) from some other meta-data file. Something like
--meta-version.h
#define MAJOR "3"
#define MINOR "3"
#define DATE "20020813"
#define STATUS "experimental"
----meta-version.c
#include "meta-version.h"
....
this could generate the f/version.c file too.
*/
/* I don't think there is any need to apply structure to the tag
numbering. */
#define COVERAGE_FUNCTION 0x00000001
#define COVERAGE_BLOCKS 0x00000002
#define COVERAGE_ARCS 0x00000003
#define COVERAGE_LINES 0x00000004
#define COVERAGE_ARC_COUNTS 0x00000005
#define COVERAGE_OBJECT_SUMMARY 0x00000006
#define COVERAGE_PROGRAM_SUMMARY 0x00000007
/* Basic block flags. */
#define COVERAGE_BLOCK_UNEXPECTED (1 << 0)
/* Arc flags. */
#define COVERAGE_ARC_ON_TREE (1 << 0)
#define COVERAGE_ARC_FAKE (1 << 1)
#define COVERAGE_ARC_FALLTHROUGH (1 << 2)
/* Functions for reading and writing coverage files. */
void write_int32 PARAMS((FILE *, unsigned long));
void write_int64 PARAMS((FILE *, gcov_type));
void write_string PARAMS((FILE *, unsigned, const char *));
unsigned long read_int32 PARAMS((FILE *));
gcov_type read_int64 PARAMS((FILE *));
char *read_string PARAMS((FILE *, unsigned *));
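The string layout given in the format comment (length count, characters, NUL terminator, padding to a 4 byte boundary) could be produced along the following lines. This is only a sketch that writes into a memory buffer rather than a FILE, and the names put_le32 and encode_string are hypothetical. It assumes the length field counts the characters only and excludes the terminator and padding, which the draft does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store VALUE at OUT as four bytes, least significant first.  */
static void
put_le32 (unsigned char *out, uint32_t value)
{
  out[0] = value & 0xff;
  out[1] = (value >> 8) & 0xff;
  out[2] = (value >> 16) & 0xff;
  out[3] = (value >> 24) & 0xff;
}

/* Encode S at OUT as: int32 length, the characters, then one to four
   NUL bytes so that the string data ends on a 4 byte boundary.
   Returns the number of bytes written.
   ASSUMPTION: the length field counts the characters only, not the
   terminator or the padding.  */
static size_t
encode_string (unsigned char *out, const char *s)
{
  size_t len = strlen (s);
  size_t padded = (len + 4) & ~(size_t) 3;  /* room for at least one NUL */

  assert (len <= 0xffffffffu);
  put_le32 (out, (uint32_t) len);
  memcpy (out + 4, s, len);
  memset (out + 4 + len, 0, padded - len);
  return 4 + padded;
}
```

Under that assumption a 3 character name occupies 8 bytes in the file and a 4 character name occupies 12, since a name that already ends on a boundary still needs its terminator.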
#ifdef BUILDING_LIBGCC
/* Structures embedded in coveraged program. */
/* Information about a single function. */
struct function_info
{
  const char *name;   /* (mangled) name of function. */
  long checksum;      /* function checksum */
  int n_arc_counts;   /* number of instrumented arcs. */
};
/* Information about a single object file. */
struct coverage_info
{
  unsigned long version;            /* expected version number. */
  struct coverage_info *next;       /* link to next, used by libgcc */
  const char *filename;             /* output file name. */
  struct function_info *functions;  /* table of functions. */
  long n_functions;                 /* number of functions. */
  gcov_type *arc_counts;            /* table of arc counts. */
  long n_arc_counts;                /* number of arc counts. */
  /* Should we include compilation flags? */
};
#endif /* BUILDING_LIBGCC */
#endif /* GCC_COVERAGE_H */
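The point of the tag and length header is that a tool can step over records it does not understand. A minimal sketch of such a reader, working on an in-memory copy of the record area, might look like the following. The names get_le32 and count_records are hypothetical, and the sketch assumes that LENGTH is the number of data bytes following the header; the draft does not say whether the unit is bytes or words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read four bytes at P as a little endian unsigned value.  */
static uint32_t
get_le32 (const unsigned char *p)
{
  return (uint32_t) p[0] | (uint32_t) p[1] << 8
         | (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24;
}

/* Count the records with tag WANTED in the SIZE bytes at BUF, which
   must start just after the magic and version words.  Records with
   any other tag are skipped using their length.  Returns -1 if a
   record runs past the end of the buffer.
   ASSUMPTION: LENGTH is the number of data bytes after the header.  */
static long
count_records (const unsigned char *buf, size_t size, uint32_t wanted)
{
  size_t pos = 0;
  long found = 0;

  assert (buf != NULL);
  while (pos < size)
    {
      uint32_t tag, length;

      if (size - pos < 8)
        return -1;              /* truncated header */
      tag = get_le32 (buf + pos);
      length = get_le32 (buf + pos + 4);
      pos += 8;
      if (length > size - pos)
        return -1;              /* truncated data */
      if (tag == wanted)
        found++;
      pos += length;            /* skip the data, known tag or not */
    }
  return found;
}
```

A real reader would dispatch on the tag instead of counting, but the skipping logic would be the same, and it is what lets an older tool tolerate record types added later.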