Designs for better debug info in GCC

Daniel Berlin dberlin@dberlin.org
Tue Dec 18 23:35:00 GMT 2007


>
> It is desirable to be able to represent constants and other
> optimized-away values, rather than stating variables have values they
> can no longer have:
>
> int
> x1 (int x)
> {
>   int i;
>
>   i = 2;
>   f(i);
>   i = x;
>   h();
>   i = 7;
>   g(i);
> }
>
> Even if variable i is completely optimized away, a debugger can still
> print the correct values for i if we keep annotations such as:
>
>   (debug (var_location i (const_int 2)))
>   (set (reg arg0) (const_int 2))
>   (call (mem (symbol_ref f)))
>   (debug (var_location i unknown))
>   (call (mem (symbol_ref h)))
>   (debug (var_location i (const_int 7)))
>   (set (reg arg0) (const_int 7))
>   (call (mem (symbol_ref g)))
>
> In this case, before the call to h, not only the assignment to i was
> dead, but also the value of the incoming argument x had already been
> clobbered.  If i had been assigned to another constant instead, debug
> information could easily represent this.
>
> Another example that covers PHI nodes and conditionals:
>
> int
> x2 (int x, int y, int z)
> {
>   int c = z;
>   whatever0(c);
>   c = x;
>   whatever1();
>   if (some_condition)
>     {
>       whatever2();
>       c = y;
>       whatever3();
>     }
>   whatever4(c);
> }
>
> With SSA infrastructure, this program can be optimized to:
>
> int
> x2 (int x, int y, int z)
> {
>   int c;
>   # bb 1
>   whatever0(z_0(D));
>   whatever1();
>   if (some_condition)
>     {
>       # bb 2
>       whatever2();
>       whatever3();
>     }
>   # bb 3
>   # c_1 = PHI <x_2(D)(1), y_3(D)(2)>;
>   whatever4(c_1);
> }
>
> Note how, without debug annotations, c is only initialized just before
> the call to whatever4.  At all other points, the value of c would be
> unavailable to the debugger, possibly even wrong.
>
> If we were to annotate the SSA definitions forward-propagated into c
> versions as applying to c, we'd end up with all of x_2, y_3 and z_0

If you forward propagate any annotations, ever,
> applied to c throughout the entire function, in the absence of
> additional markers.
>
> Now, with the annotations proposed in this paper, what is initially:
>
> int
> x2 (int x, int y, int z)
> {
>   int c;
>   # bb 1
>   c_4 = z_0(D);
>   # DEBUG c z_0(D)
>   whatever0(z_0(D));
>   # DEBUG c x_2(D)
>   whatever1();

> and then, at every one of the inspection points, we get the correct
> value for variable c.
Because you have added information you have no way of knowing.
How exactly did you compute that the call *definitely sets c to the
value of z_0*, and definitely sets the value of c to x_2?

This must be "may-information", because we don't know what the call does.

Ignoring this (the solution is to not assume anything at calls,
because you run the risk of getting the wrong answer at meet points
later on!), your scheme is sufficient to get correct values, but not
correct locations.

However, value equivalence does not imply location equivalence, and all
of our debug formats deal with locations of variables, except for
constants.

I.e., if you translate this directly into DWARF3, as written, you will
claim that c and x_4 have the same location (since DWARF does not let
you say "it has the same value as x, but not the same location"), and
thus incorrectly represent that p *x_4=5 modifies c if I were to do it
in the debugger.  Because of the may-problem, you will also claim the
same value/location for c and x_2, which you can't prove is right,
because you don't know what whatever1/2 actually do.
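
To make the value/location distinction concrete, here is a small C
sketch.  It is not GCC code, and both function names are made up for
illustration.  The first function models c merely holding the same
value as x; the second models what the debug info would wrongly claim,
namely that c and x share one location.  The store through the pointer
stands in for a debugger write such as "p x = 5".

```c
#include <assert.h>

/* Hypothetical: c is value-equivalent to x but has its own storage. */
int value_equivalent_only(int x)
{
    int c = x;     /* same value as x, different location */
    int *px = &x;  /* the location a debugger write to x would use */

    *px = 5;       /* changes x only */
    return c;      /* c keeps the value it was given */
}

/* Hypothetical: c is described as living in x's location. */
int location_equivalent(int x)
{
    int *c = &x;   /* "c" and x are the same storage */
    int *px = &x;

    *px = 5;       /* the write is visible through c as well */
    return *c;
}
```

Calling value_equivalent_only(3) yields 3, while location_equivalent(3)
yields 5.  Debug info that only has a way to say "same location" reports
the second behaviour even when the program has the first.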

If all you want is the values you compute above, on SSA, you can
easily use a lattice to compute the same values you are going to
compute as you update the annotations on the fly.

(This is because it is a flow sensitive problem, and you want the flow
answers at each unique definition point, which SSA neatly provides,
except for calls, where you could hang it off the vops).
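
A rough sketch of the kind of lattice meant here, in plain C.  This is
an illustration under my own assumptions, not anything in the GCC tree:
the type and function names are invented, and SSA names are modelled as
small integers.  Each user variable is either not yet seen (TOP), known
to hold the value of exactly one SSA name (KNOWN), or not provable
(BOTTOM).  A PHI is the meet of its arguments, and a call that may
affect the variable conservatively drops it to BOTTOM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-level lattice: TOP above KNOWN above BOTTOM. */
enum uv_kind { UV_TOP, UV_KNOWN, UV_BOTTOM };

struct user_value {
    enum uv_kind kind;
    int ssa_name;          /* meaningful only when kind == UV_KNOWN */
};

/* Meet of two facts: agree, or fall to BOTTOM. */
struct user_value uv_meet(struct user_value a, struct user_value b)
{
    struct user_value bottom = { UV_BOTTOM, -1 };

    if (a.kind == UV_TOP)
        return b;
    if (b.kind == UV_TOP)
        return a;
    if (a.kind == UV_KNOWN && b.kind == UV_KNOWN
        && a.ssa_name == b.ssa_name)
        return a;
    return bottom;
}

/* A PHI node is the meet over all of its incoming arguments. */
struct user_value uv_meet_phi(const struct user_value *args, size_t n)
{
    struct user_value result = { UV_TOP, -1 };
    size_t i;

    for (i = 0; i < n; i++)
        result = uv_meet(result, args[i]);
    return result;
}

/* At a call, keep the fact only if the call cannot affect it. */
struct user_value uv_after_call(struct user_value v, int may_clobber)
{
    struct user_value bottom = { UV_BOTTOM, -1 };

    return may_clobber ? bottom : v;
}
```

For the x2 example, the facts reaching bb 3 would be "c is x_2" from
bb 1 and "c is y_3" from bb 2.  Their meet is BOTTOM, so the only thing
that can be stated there is the new definition c_1 itself.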

Tracking which values *definitely represent user values* is actually
quite easy at the tree level, and doesn't require any IR modification.

It may be worth doing at the RTL level, however, where the solution
requires making up program points at each definition site and
computing the dataflow problem in terms of them.
--Dan


