This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Identifying source file locations.
- To: Daniel Berlin <dan at cgsoftware dot com>
- Subject: Re: Identifying source file locations.
- From: kalmquist2 at hotmail dot com (Kenneth Almquist)
- Date: Sat, 9 Jun 2001 21:19:34 -0400 (EDT)
- Cc: gcc at gcc dot gnu dot org
Daniel Berlin wrote:
> Errr, DWARF2 does it fine.
I hadn't seen the DWARF specification until I got your letter. I
found a draft of the DWARF version 1.1 specification, as well as
draft 6 of the DWARF 2.0 specification, and the line numbering
section is the same in both. For the benefit of people who haven't
seen it, the line number information (as given in the specification
of the state machine) is:

  address       The program-counter value corresponding to a
                machine instruction generated by the compiler.

  file          An unsigned integer indicating the identity of
                the source file corresponding to a machine
                instruction.

  line          An unsigned integer indicating a source line
                number.  Lines are numbered beginning at 1.
                The compiler may emit the value 0 in cases
                where an instruction cannot be attributed to
                any source line.

  column        An unsigned integer indicating a column number
                within a source line.  Columns are numbered
                beginning at 1.  The value 0 is reserved to
                indicate that a statement begins at the "left
                edge" of the line.

  is_stmt       A boolean indicating that the current
                instruction is the beginning of a statement.

  basic_block   A boolean indicating that the current
                instruction is the beginning of a basic block.

  end_sequence  A boolean indicating that the current address
                is that of the first byte after the end of a
                sequence of target machine instructions.

The document isn't as clear about how these values should be used
as one might like. My interpretation is that "line" and "column"
identify the start of the statement. To save space, "column" should
be set to zero for the first statement on a line. Normally, an entry
is output only for the first instruction generated for a statement.
If the instructions generated for a statement are not contiguous
(this could occur as the result of certain optimizations), then an
entry is created for each sequence of contiguous instructions, with
"is_stmt" set to false for all sequences except the first. I must
confess I'm at a loss to understand the purpose of the "basic_block"
flag, so I assume that it can always be false.
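The interpretation above can be sketched in C as follows. This is an
illustration of that reading of the specification, not of what any
compiler actually emits; the types line_row and stmt_info and the
function emit_rows are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row type holding only the registers used here. */
struct line_row {
    uint64_t address;
    unsigned line;
    unsigned column;
    bool     is_stmt;
};

/* Hypothetical description of one source statement. */
struct stmt_info {
    unsigned        line;          /* line on which the statement starts */
    unsigned        column;        /* 1-based column where it starts */
    bool            first_on_line; /* no earlier statement starts on this line */
    const uint64_t *starts;        /* start address of each contiguous run */
    size_t          n_starts;      /* number of contiguous runs */
};

/* Write one row per contiguous run of instructions into out[].
   Only the first run is marked is_stmt; column is 0 when the
   statement is the first on its line.  Returns the rows written. */
static size_t emit_rows(const struct stmt_info *s,
                        struct line_row *out, size_t cap)
{
    assert(s != NULL && out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < s->n_starts && n < cap; i++) {
        out[n].address = s->starts[i];
        out[n].line    = s->line;
        out[n].column  = s->first_on_line ? 0 : s->column;
        out[n].is_stmt = (i == 0);
        n++;
    }
    return n;
}
```

For a statement starting at line 12 whose code was split into two
runs, this yields two rows for line 12, and only the first has
"is_stmt" set.
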
Currently, the RTL notes specify the file and line number of the
last token of a statement rather than the first. So, if I understand
DWARF correctly, the DWARF file, line, and column values all need to
be added to the RTL.
> When a statement spans multiple lines, we should say, in the debugging
> info, and the RTL, that it spans multiple lines.
>
> This is important for optimized code debugging.
>
> This isn't that tricky to do, either, it just means we have to label
> the INSN's with the line number they are for, rather than add notes
> with the line number in front of them.
>
> That way, when some optimization moves some code, we still know where
> it really originated.
I am under the impression that optimizations which move code around
are supposed to clone the "notes" specifying the line information
when necessary to make sure that the line number information for
the INSN's doesn't change. I suppose that someone will correct me
if I'm mistaken about this.
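To make the note-cloning idea concrete, here is a toy model in C. It
is not GCC's RTL representation: the node type, line_at,
insert_after, unlink_node and move_insn_keeping_lines are all
hypothetical, and the only assumption carried over from the
discussion is that an instruction takes its line from the nearest
preceding line note.

```c
#include <assert.h>
#include <stddef.h>

enum node_kind { NODE_LINE_NOTE, NODE_INSN };

/* Hypothetical, heavily simplified stand-in for an instruction chain. */
struct node {
    enum node_kind kind;
    int            line;   /* meaningful for NODE_LINE_NOTE only */
    struct node   *prev;
    struct node   *next;
};

/* Line in effect at node p: that of the nearest note at or before p. */
static int line_at(const struct node *p)
{
    for (; p != NULL; p = p->prev)
        if (p->kind == NODE_LINE_NOTE)
            return p->line;
    return 0;   /* no note found: line unknown */
}

/* Link n into the chain immediately after pos. */
static void insert_after(struct node *pos, struct node *n)
{
    n->prev = pos;
    n->next = pos->next;
    if (pos->next != NULL)
        pos->next->prev = n;
    pos->next = n;
}

/* Remove n from the chain, leaving its neighbours linked together. */
static void unlink_node(struct node *n)
{
    if (n->prev != NULL) n->prev->next = n->next;
    if (n->next != NULL) n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* Move insn to just after pos.  One cloned note keeps insn's own line,
   a second restores the line previously in effect after pos.
   The caller supplies storage for the two cloned notes. */
static void move_insn_keeping_lines(struct node *insn, struct node *pos,
                                    struct node notes[2])
{
    assert(insn != NULL && pos != NULL && insn != pos);
    int own = line_at(insn);
    unlink_node(insn);
    int there = line_at(pos);

    notes[0].kind = NODE_LINE_NOTE;
    notes[0].line = own;
    insert_after(pos, &notes[0]);
    insert_after(&notes[0], insn);

    notes[1].kind = NODE_LINE_NOTE;
    notes[1].line = there;
    insert_after(insn, &notes[1]);
}
```

If the line were stored on each instruction instead, as the quoted
proposal suggests, the move would reduce to unlink_node followed by
insert_after, with no notes to clone.
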
Kenneth Almquist