Designs for better debug info in GCC

Wed Dec 19 08:39:00 GMT 2007

On 12/19/07, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Dec 18, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote:
>
> > Consider PRE alone,
>
> > If your debug statement strategy is "move debug statements when we
> > insert code that is equivalent"
>
> Move?  Debug statements don't move, in general.  I'm not sure what you
> have in mind, but I sense some disconnect here.

OKay, so if you aren't going to move them, you have to erase them when
you move statements around.

>
> > because our equivalence is based on value equivalence, not location
> > equivalence.  We only guarantee it has the same value as the
> > whatever it is a copy of at that point, not that it has the same
> > location.

This  is just a problem with an initial state and some propagation at
each statement.
How were you going to generate the initial set of debug annotations?
This is how you get your initial state for your dataflow problem
How were you going to update it if you saw a statement was updated to
say x_5 = x_4 instead of x_5 = x_3 + x_2.
The same operation you perform to update your annotations when you see
 x_5 = x_4 works whether you started with x_5 = x_3 + x_2 or not (it
better, or else your updating will give different results for the same
IR depending on how you got there, which is *incredibly* bad).

So then how will using your debug annotations and updating them come
out any different than say performing a value numbering pass where you
also associate user variables with the ssa names (IE alongside our
value numbers), and propagate them around as well?

If you want to associate multiple user variables with a single SSA
definition point, you can do that as well (use union instead of copy).
You can do whatever you think is best at phi nodes (empty set if user
var sets are not equal, or union them or intersect them).

At the end, you could emit DEBUG(user var, ssa name) right after each
SSA_NAME_DEF_STMT for all user vars in the user var set for ssa name.

The right DEBUG statements would then appear at the points you can
guarantee the user variable has the same *value* as the gimple
register you've said it does.
>From there, it is up to you to do what you like with the result.

(it's late, so i may have described/ calculated the dataflow problem
backwards, but you get the idea)

This is, after all, more or less what PRE does for it's value
numbering. It computes which things have the same value at what points
in the program, then uses this after computing some more dataflow
problems that say where this implies reuse.

I don't see why you believe user variables/bindings are special and
can't be propagated in this manner, given that you can't depend on the
type of statement change that has occurred, only what the IR looks
like after the statement change.  Otherwise, again, the same IR and
source may have different debug annotations depending on the set of
changes you applied to get that IR from the initial IR, which is not
good the standard reasons [maintainability, determinism,
reproducibility, etc].
>
> > #3 is a dataflow problem, and not something you want to do every time
> > you insert a call.
>
> I'm not sure what you mean by "inserting calls".  We don't do that.

Sure we do.
We will definitely insert new calls when we PRE const/pure calls, or
calls we determine to be movable to the point we want to move them
(using call clobbered results, etc).
This will insert calls in latch blocks, above loops, in branch conditions
This is not just movement.
It is insertion of calls that did not exist in the source code at a
given point, but are allowed to be executed at that point in the
source code anyway.

> Calls are present in the source code (even when implied by stuff like
> TLS, OpenMP or builtins such as memcpy), and they're either kept
> around, eliminated or inlined.
No, we can and will insert new calls.
Not just for PRE, but for profiling, devirtualization, struct reorg, SRA, etc
struct reorg inserts new mallocs and frees
profiling inserts profiling calls
devirt will insert branches and new calls to replace virtual function calls
SRA will insert memcpys to and from structures that were not there in
user source before.
i could go on if you like.
I'm not sure why you believe all the calls that we end up with in the
IR are actually in the source (or even implied by it).

>
> But I'm not computing that in trees.  I'm just collecting and
> maintaining data points for var-tracking, all the way from the tree
> level.
Okay, then for trees,  why bother tracking it when you can compute it
right before translation with the same accuracy you can if you update
it every time you make statement changes?
>
> > All you have done is annotated the IR in some places to make explicit
> > some bits in the dataflow problem that you could inference anyway.
>
> Now, this is not true.  I could infer values, yes, but I couldn't
> infer the variables they relate to, nor the point of binding

See above.

>  And
> debug information is not just about the values, it's about mapping
> variables to values and locations.

You have no locations at the tree level, and i've explicitly said what
i said applies to the tree level :)
> So, we can't infer all the
> information we need.

Again, i believe we can at the tree level.