This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: Identifying source file locations.

To: kalmquist2 at hotmail dot com (Kenneth Almquist)
Subject: Re: Identifying source file locations.
From: Daniel Berlin <dan at cgsoftware dot com>
Date: 09 Jun 2001 21:48:43 -0400
Cc: Daniel Berlin <dan at cgsoftware dot com>, gcc at gcc dot gnu dot org
References: <200106100119.f5A1JYm24069@mail.monmouth.com>

kalmquist2@hotmail.com (Kenneth Almquist) writes:

> Daniel Berlin wrote:
>> Errr, DWARF2 does it fine.
> 

> The document isn't as clear about how these values should be used
> as one might like. 
Err, how so?

It's pretty clear on the definitions.

For instance:
"Basic block -  A sequence of instructions that is entered only at the
first instruction and exited only at the last instruction. We define a
procedure invocation to be an exit from a basic block.
"

>  My interpretation is that "line" and "column"
> identify the start of the statement.  

Not at all. They identify the line and column of an address.
There is no assumption that a statement produces a single address.
That's what is_stmt is for.
From the draft:
"
address              The program-counter value corresponding to a
machine instruction generated by the compiler.

file                 An unsigned integer indicating the identity of
the source file corresponding to a machine instruction.

line                 An unsigned integer indicating a source line
number. Lines are numbered beginning at 1. The compiler may emit the
value 0 in cases where an instruction cannot be attributed to any
source line. 

column               An unsigned integer indicating a column number
within a source line. Columns are numbered beginning at 1. The value 0
is reserved to indicate that  a statement begins at the "left edge" of
the line. 
"

Why do you need to make up your own definitions of column and line?
What's unclear about the ones above?

> To save space, "column" should be set to zero for first statement on
> a line.  Normally, an entry is output only for the first instruction
> generated for a statement.  If the instructions generated for a
> statement are not contiguous (this could occur as the result of
> certain optimizations), then an entry is created for each sequence
> of contiguous statements, with "is_stmt" set to false for all
> sequences except the first.

From dwarf2out.c:
/* Flag that indicates the initial value of the is_stmt_start flag.
   In the present implementation, we do not mark any lines as
   the beginning of a source statement, because that information
   is not made available by the GCC front-end.  */

> I must confess I'm
> at a loss to understand the purpose of the "basic_block" flag, so I
> assume that it can always be false.

Errr, it *could*, but it shouldn't be. I believe it is always false in
what we emit.

Here's the info on the basic block flag:
"basic_block        A boolean indicating that the current instruction
is the beginning of a basic block."
> 
> Currently, the RTL notes specify the file and line number of the
> last token of a statement rather than the first, so that if I
> understand DWARF correctly, the DWARF file, line, and column
> values all need to be added to the RTL.

Or rather, just associate each rtl with a file, line, and column, and
get rid of the notes.

Saves the trouble of trying to duplicate line notes in the right
places, etc.

You did read the part about how is_stmt is now supposed to represent a
recommended breakpoint location, rather than specifically a statement?

"
is_stmt            A boolean indicating that the current instruction
is a recommended breakpoint location. A recommended breakpoint
location is intended to "represent" a line, a statement and/or a
semantically distinct subpart of a statement. 
"
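
To make the quoted definitions concrete, here is a rough sketch of one
line table row and two lookups over it. The struct, the function names,
and the sample rows are all made up for illustration; none of this is
GCC or debugger code. The point is only that line and column describe
an address, and is_stmt marks where a breakpoint is recommended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One row of a DWARF2-style line table.  Field names follow the
   register names in the draft; the struct itself is hypothetical.  */
struct line_row {
    uint64_t address;    /* program-counter value of an instruction */
    unsigned file;       /* identity of the source file */
    unsigned line;       /* 1-based; 0 = no source line */
    unsigned column;     /* 1-based; 0 = "left edge" of the line */
    bool is_stmt;        /* recommended breakpoint location */
    bool basic_block;    /* instruction begins a basic block */
};

/* Hypothetical sample: the code for line 10 was split in two by an
   optimization, so it appears twice; only the first row is is_stmt.  */
const struct line_row sample_rows[] = {
    { 0x100, 1, 10, 0, true,  true  },
    { 0x104, 1, 11, 5, true,  false },
    { 0x108, 1, 10, 0, false, false },
};

/* Return the row describing pc: the last row whose address is <= pc.
   Rows must be sorted by ascending address.  Returns NULL if pc lies
   before the first row.  */
const struct line_row *
row_for_address(const struct line_row *rows, size_t n, uint64_t pc)
{
    const struct line_row *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (rows[i].address > pc)
            break;
        best = &rows[i];
    }
    return best;
}

/* Store the first recommended breakpoint address for file:line in
   *out and return true, or return false if no row qualifies.  */
bool
breakpoint_for_line(const struct line_row *rows, size_t n,
                    unsigned file, unsigned line, uint64_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (rows[i].file == file && rows[i].line == line
            && rows[i].is_stmt) {
            *out = rows[i].address;
            return true;
        }
    }
    return false;
}
```

With the sample rows, an address of 0x10a maps back to line 10 even
though it sits after the code for line 11, while a breakpoint request
for line 10 resolves to 0x100 only.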

> 
>> When a statement spans multiple lines, we should say, in the debugging
>> info, and the RTL, that it spans multiple lines.
>>
>> This is important for optimized code debugging.
>>
>> This isn't that tricky to do, either, it just means we have to label
>> the INSN's with the line number they are for, rather than add notes
>> with the line number in front of them.
>>
>> That way, when some optimization moves some code, we still know where
>> it really originated.
> 
> I am under the impression that optimizations which move code around
> are supposed to clone the "notes" specifying the line information
> when necessary to make sure that the line number information for
> the INSN's doesn't change.  I suppose that someone will correct me
> if I'm mistaken about this.

Sure, they are *supposed* to.

Some do, some don't.

Most that do, don't do it perfectly.
If we just added 10 or 12 bytes to each INSN (assuming you have < 65535
files per compilation, you can do it with 10, but it's not really a savings,
since it'll probably be realigned to 12 anyway), we could remove the
line number notes altogether, along with the code that copies and
tracks them, and the need for optimizations to keep track of them,
unless they actually split insns.
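
The 10-versus-12 arithmetic can be checked with a small sketch. The
struct below is hypothetical (it is not an actual GCC data structure),
and the field widths are my assumption: 32 bits each for line and
column, 16 bits for the file index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-insn source location.  A 16-bit file index is
   enough as long as a compilation uses fewer than 65535 files.  */
struct insn_locus {
    uint32_t line;
    uint32_t column;
    uint16_t file;
};

/* Bytes of actual payload: 4 + 4 + 2.  */
size_t
insn_locus_payload(void)
{
    return sizeof(uint32_t) + sizeof(uint32_t) + sizeof(uint16_t);
}

/* Bytes the struct really occupies.  On common ABIs, where uint32_t
   is 4-byte aligned, the compiler pads the struct to 12 bytes.  */
size_t
insn_locus_size(void)
{
    return sizeof(struct insn_locus);
}
```

So the payload is 10 bytes, but on typical targets the struct takes 12
once padded, which is why the 16-bit file index buys nothing.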

Right now the simple act of *moving* an insn screws up line number
info.
This is wrong.
It's not something the optimizers should have to worry about, either.
It should just work.
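
A toy model of the problem, under heavy simplification: the stream is
a flat array, a "note" applies to every insn after it until the next
note, and each insn additionally carries its own line so both schemes
can be compared. All names here are invented; real RTL is more
involved than this.

```c
#include <assert.h>
#include <stddef.h>

enum item_kind { NOTE, INSN };

/* For a NOTE, line is the line it announces.  For an INSN, line is
   the label carried by the insn itself (the proposed scheme).  */
struct item {
    enum item_kind kind;
    int line;
};

/* Note scheme: an insn's line is that of the nearest preceding NOTE.
   Returns 0 if no note precedes position idx.  */
int
line_from_notes(const struct item *s, size_t idx)
{
    for (size_t i = idx + 1; i-- > 0; )
        if (s[i].kind == NOTE)
            return s[i].line;
    return 0;
}

/* Build a four-item stream, then hoist the line-11 insn above its
   note, the way a code-moving pass might if it ignores the notes.  */
static void
build_and_hoist(struct item s[4])
{
    s[0] = (struct item){ NOTE, 10 };
    s[1] = (struct item){ INSN, 10 };
    s[2] = (struct item){ NOTE, 11 };
    s[3] = (struct item){ INSN, 11 };

    struct item tmp = s[2];   /* swap the note and the insn */
    s[2] = s[3];
    s[3] = tmp;
}

/* Line reported for the moved insn when derived from notes.  */
int
moved_line_note_scheme(void)
{
    struct item s[4];
    build_and_hoist(s);
    return line_from_notes(s, 2);
}

/* Line reported for the moved insn when read from its own label.  */
int
moved_line_labeled_scheme(void)
{
    struct item s[4];
    build_and_hoist(s);
    return s[2].line;
}
```

In this model the note scheme reports line 10 for an insn that came
from line 11, while the labeled insn still says 11 without the moving
pass doing anything special.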

> 				Kenneth Almquist

-- 
"I'm kinda tired.  I was up all night trying to round off
infinity.  Then I got bored and went out and painted passing
lines on curved roads.
"-Steven Wright
