This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Designs for better debug info in GCC


On Dec 19, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote:

> On 12/19/07, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Dec 18, 2007, "Daniel Berlin" <dberlin@dberlin.org> wrote:
>> 
>> > Consider PRE alone,
>> 
>> > If your debug statement strategy is "move debug statements when we
>> > insert code that is equivalent"
>> 
>> Move?  Debug statements don't move, in general.  I'm not sure what you
>> have in mind, but I sense some disconnect here.

> OKay, so if you aren't going to move them, you have to erase them when
> you move statements around.

Why?  They still represent the point of binding between user variable
and value.

> How were you going to generate the initial set of debug annotations?

It's in the document: after each assignment to user variable, and at
PHI nodes for user variables.  The debug statement means the variable
holds that value from that point on until conflicting information
arises (i.e., another debug statement for the same variable, or a
control flow merge with different values for the same variable)

> How were you going to update it if you saw a statement was updated to
> say x_5 = x_4 instead of x_5 = x_3 + x_2.

No update needed, if x_5 is the value of interest.  I'm not sure
that's what you're asking, though.

> So then how will using your debug annotations and updating them come
> out any different than say performing a value numbering pass where you
> also associate user variables with the ssa names (IE alongside our
> value numbers), and propagate them around as well?

First, debug annotations may be at different points than the
corresponding SSA definitions, because the same SSA definition may be
bound to different variables at different ranges.

Second, debug annotations may contain more complex expressions than a
single SSA name, and there may not be any SSA name that represents the
value of these expressions left.  For example, given:

  x_3 = a_1 + b_2;
  # DEBUG x => x_3
  foo();

if we find that x_3 is unused elsewhere, we can drop it without
discarding debug information about the value of x at that point

  # DEBUG x => a_1 + b_2
  foo();

such that, if we stop at the call and print x, we get the expected
value, even though the actual variable was optimized away.

> At the end, you could emit DEBUG(user var, ssa name) right after each
> SSA_NAME_DEF_STMT for all user vars in the user var set for ssa name.

This doesn't work.  Consider:

  a_2 = whatever1;
  b_4 = whatever2;

  x_1 = a_2;
  probe();

  if (condition) {
    probe();
    x_3 = b_4;
    probe();
  }

  x_5 = PHI <x_1(!condition), x_3(condition)>;
  probe();

Now, if you optimize it and apply the debug stmt generation
technique you suggested, this is what you get:

  T_2 = whatever1;
  # DEBUG a => T_2
  # DEBUG x => T_2
  T_4 = whatever2;
  # DEBUG b => T_4
  # DEBUG x => T_4

  probe();

  if (condition) {
    probe();
    probe();
  }

  T_5 = PHI <T_2(!condition), T_4(condition)>
  # DEBUG x => T_5
  probe();

What do you get if you print x at each of the probe points?

> I don't see why you believe user variables/bindings are special and
> can't be propagated in this manner,

It's not that I don't believe it, it's just that just being able to
propagate them is not enough.  We must also take the binding point
into account.

Now, as I wrote to Ian last night, if we just add a binding point
annotation to this mix, then we have sufficient information:

  T_2 = whatever1;
  # DEBUG a => T_2 here
  # DEBUG x => T_2 at P1
  T_4 = whatever2;
  # DEBUG b => T_4 here
  # DEBUG x => T_4 at P2

  probe();
  # DEBUG point P1

  if (condition) {
    probe();
    # DEBUG point P2
    probe();
  }

  T_5 = PHI <T_2(!condition), T_4(condition)>
  # DEBUG x => T_5
  probe();


I still don't see how, in this notation, we'd represent something like
"at this point, the value of this user variable is unknown".  Any
ideas?

Also, this strategy works for the nice and well-behaved Tree SSA
optimization passes.  For RTL, that is far less abstract, especially
after register allocation, I don't see that we can rely on such a
simple strategy.  But, in a way, I hope I'm wrong ;-)

>> > #3 is a dataflow problem, and not something you want to do every time
>> > you insert a call.

>> I'm not sure what you mean by "inserting calls".  We don't do that.

> Sure we do.
> We will definitely insert new calls when we PRE const/pure calls, or
> calls we determine to be movable to the point we want to move them

I think of that as moving, rather than inserting.  That said, I still
don't quite see what you're getting at.  Calls don't mess with gimple
registers of their callers, ever, so it appears to me that inserting a
call in the tree level is a NOP in terms of debug information
annotations.

> I'm not sure why you believe all the calls that we end up with in the
> IR are actually in the source (or even implied by it).

Conceptually, they are, kind-a sort of :-)  Except perhaps for
profiling calls, that are meant to be fully transparent anyway.
Others are more akin to inlining, or using a call for convenience
rather than expanding a copy or something to that effect.

>> But I'm not computing that in trees.  I'm just collecting and
>> maintaining data points for var-tracking, all the way from the tree
>> level.

> Okay, then for trees,  why bother tracking it when you can compute it
> right before translation with the same accuracy you can if you update
> it every time you make statement changes?

Just because we still haven't found a reliable way to do so that
doesn't drop essential information for correct debug info.  If we do,
I'll be delighted to immediately drop the proposed debug annotations
in the tree level.  And in the RTL level as well.

>> And debug information is not just about the values, it's about
>> mapping variables to values and locations.

> You have no locations at the tree level,

?!?  Locations as in point of execution, rather than DWARF locations,
is waht I mean.

> and i've explicitly said what
> i said applies to the tree level :)

Indeed ;-)

>> So, we can't infer all the
>> information we need.

> Again, i believe we can at the tree level.

Good, let's keep on it.  How about you use something like the example
above to explain how to accomplish it?

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]