This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC] New SSA variable mapping infrastructure


On Nov  8, 2007, "Richard Guenther" <richard.guenther@gmail.com> wrote:

> On Nov 8, 2007 6:25 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Nov  8, 2007, "Richard Guenther" <richard.guenther@gmail.com> wrote:
>> 
>> >> And what happened to l?
>> 
>> > l is no longer a value that is computed (we do not compute D.1545 +
>> > j, but only D.1545 + D.1545 and the final sum).  This is what I mean
>> > by "preserving values" - values that are no longer computed do not
>> > have their values retained.
>> 
>> So, you intentionally let it go, even though debug information could
>> encode it.

> No, the optimizers chose to.

And it also chose (on its own, without any influence whatsoever from
the GCC developers ;-) to not keep the information around that would
be needed to add this piece of info to the compiler output, in the
debug info sections.

>> And, worse, you leave no note behind that the value of l
>> is no longer known at that point, so if l is retained elsewhere, debug
>> information will be emitted for it, but it will point at a location
>> that does not hold the value of l at all in the region where it was
>> optimized away.  That's bad.

> I have to think about this - as we don't yet generate debug
> information out of this I cannot verify if this is really true.

You don't mark the point at which l goes away.  The debug info
generator can't count on magic to find out where it should emit the
label for the end of the range where 'l' is available.  The
information must be there somewhere.

>> > Now onto the above case.  What we end up with after tree optimizations in
>> > the moment is
>> 
>> Looks like you threw away the function calls.  That's cheating! :-)

> Oh - I thought it was breakpoints, that is, what you'd debug, not
> function calls.

I wrote it was functions on which you'd set breakpoints in a debugger,
go up one frame and print the value of the variable.

>> I don't care that it's live or that it's computed.  The question is
>> whether it makes any sense whatsoever to indicate, in debug
>> information, that the value of k in bb4 is in k.5, rather than zero.
>> That's an obvious bug to me.

> I don't see what you do different here (but I also don't see what is
> your complaint - I'll have to think about it).

What I do differently is that bb4 succeeds bb2, and in bb2 there's a
note stating that user variable k holds the value 0, and this will
make it into debug information.

>> > I believe you cannot do better here unless you limit optimization.
>> 
>> > With VTA I see (final_cleanup again):
>> 
>> The VTA branch is work in progress.  We're discussing design, not
>> incomplete implementations.  It's just silly to point at raw food and
>> say "hey, your cooking recipe sucks!" ;-)

> Nor is our approach a complete implementation ;)  But of course the easiest
> judgement of a concept is to look at what the (current state of) code does.

Except when the code is not there for you to look at.

The fact that there is a branch with some code doesn't mean the design
is implemented in there.  Only a small portion of the design is
implemented there.  The most important piece is still missing, and
that's the bit that uses the notes, carried and updated throughout
compilation, to determine where each user variable is located at each
point in the program.  I'm talking about the dataflow-globalized cse
analysis in var-tracking that I mentioned in my initial writeup of
what the patch was about: bullet 3 in
http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00160.html

None of this is implemented in the branch, but it's a necessary part
to get correct debug information.

> I see.  As we only track names at the point of the definition (that
> is, the point they get "live")

even if they get live for a completely different set of variables, and
the value might not even be assigned to the variable of interest but
your debug information will make it seem like it does...

> but not the point they die (I don't see you do this

That's what the cse is going to accomplish.  It's going to use the
debug notes as fixed points, and propagate information up and down
from them about alternate locations for a variable.  Such that, for
example, if you have, before var-tracking:


(set (reg 1) (mem (reg 2)))
(set (reg 5) (reg 1))
(set (reg 3) (reg 1))
(var_location "x" (reg 3))
(set (reg 1) (mem (reg 4)))
(set (mem (reg 4)) (reg 3))
(set (mem (reg 2)) (reg 1))
(set (reg 3) (mem/v (reg 6)))
(var_location "x" (reg 3))
(use (reg 3))
(clobber (reg 3))


we know that at the point of the var_location debug_insn, reg 3 holds
the value of x, and that x is assigned to that value at that point.

But since we've done cse, we know at that point that the value of x,
from the point of assignment on, is also available in (reg 1), in (mem
(reg 2)) and in (reg 5).

But when (reg 1) is modified, it no longer holds the value of x.

Then, since (reg 3) is copied to (mem (reg 4)), we know x is available
there as well.

And when (mem (reg 2)) is modified, we know it no longer contains the
value of x.  Now (reg 3), (reg 5) and (mem (reg 4)) hold the value of x.

But then, when (reg 3) is modified, we know it no longer holds the
value of x either, so the old value of x is only available in (reg 5)
and (mem (reg 4)).

But then, the var_location debug_insn right after that says that (reg
3) does indeed contain the value of "x", so we forget all other
locations of x, and now (reg 3) is it (the volatile mem isn't a usable
location, for it's volatile).

The use there is what keeps (reg 3) alive such that it can be
referenced in the var_location expression.  If it wasn't there, x
would have been regarded as optimized away at that point, and a dummy
expression in the var_location would reflect this.

The clobber says that (reg 3) is modified in unpredictable ways, which
removes (reg 3) from the set of available locations for x.  None
remain, so after the clobber we'd mark x as optimized away.
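
To make the walk-through above concrete, here is a minimal sketch in C
that replays the same insn sequence with a toy value-numbering scheme.
It is not the var-tracking code (which is not written yet); the enum
names, the struct layout and locations_of_x() are all made up for
illustration, and the model assumes that (mem (reg 2)) and (mem (reg 4))
never alias and that the address registers are never modified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical location ids.  Only places that may hold x are modelled;
   the address registers (reg 2), (reg 4) and (reg 6) never change.  */
enum loc { R1, R3, R5, MEM_R2, MEM_R4, MEMV_R6, NLOCS };
enum kind { SET, VAR_LOCATION, USE, CLOBBER };

struct insn { enum kind kind; enum loc dst; enum loc src; };

/* The insn sequence from the example, in order.  For insns with a
   single operand, dst and src name the same location.  */
static const struct insn example[] = {
  { SET, R1, MEM_R2 },
  { SET, R5, R1 },
  { SET, R3, R1 },
  { VAR_LOCATION, R3, R3 },
  { SET, R1, MEM_R4 },
  { SET, MEM_R4, R3 },
  { SET, MEM_R2, R1 },
  { SET, R3, MEMV_R6 },
  { VAR_LOCATION, R3, R3 },
  { USE, R3, R3 },
  { CLOBBER, R3, R3 },
};

/* Return a bitmask (bit L set for location L) of the places known to
   hold the value of x after the first N insns have been processed.  */
static unsigned
locations_of_x (const struct insn *insns, size_t n)
{
  int val[NLOCS];   /* value number currently held by each location */
  int next = NLOCS; /* next unused value number */
  int xval = -1;    /* value number bound to x, -1 while unknown */
  unsigned mask = 0;
  size_t i;
  int l;

  for (l = 0; l < NLOCS; l++)
    val[l] = l;     /* each location starts with its own unknown value */

  for (i = 0; i < n; i++)
    {
      assert (insns[i].dst < NLOCS && insns[i].src < NLOCS);
      switch (insns[i].kind)
        {
        case SET:
          /* A volatile read yields a value equal to nothing else.  */
          val[insns[i].dst]
            = insns[i].src == MEMV_R6 ? next++ : val[insns[i].src];
          break;
        case VAR_LOCATION:
          /* x is (re)bound: older locations of x are forgotten.  */
          xval = val[insns[i].dst];
          break;
        case CLOBBER:
          val[insns[i].dst] = next++;
          break;
        case USE:
          break;
        }
    }

  /* The volatile mem is never a usable location.  */
  for (l = 0; l < NLOCS; l++)
    if (l != MEMV_R6 && val[l] == xval)
      mask |= 1u << l;
  return mask;
}
```

Calling locations_of_x (example, 4) reports (reg 1), (reg 3), (reg 5)
and (mem (reg 2)); with all 11 insns processed the mask is empty, which
is the point at which x would be marked as optimized away.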


> - but you obviously have redundant "becoming live" points, like
> DEBUG k = k (?)),

There's nothing redundant in here.  It's a link between two separate
namespaces: k in the user level source representation, and an
arbitrary variable that the compiler chose to also name k.  In my
design, there's no implied correspondence between gimple register
variables and user variables.  Such implementation variable names are
completely arbitrary, and they're useful only for compiler dumps.  All
debug information for variables that qualify as gimple registers is
generated based on debug insns.  Addressable variables get different
treatment, for they do have a fixed location, so we don't have to work
so hard to generate correct debug information for them.

> we hope (or believe...)  that var-tracking will do the lifetime
> analysis that avoids what you call "wrong debug information".

It can't unless you somehow tell it that a certain location, after
modification, no longer reflects the value of a variable.  But your
model doesn't seem to make room for this.

> It would be nice to have a gdb testcase testing for not having this
> wrong debug information - at least that would make it easier to
> understand your point.

Consider this:

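/* Assumed to be defined in another translation unit, so that the
   calls cannot be optimized away.  */
extern int function_that_returns_5 (void);
extern void breakpoint0 (void);
extern void breakpoint1 (void);
extern void breakpoint2 (void);
extern void breakpoint3 (void);

int foo(int i, int j) {
  int l;
  int k = 0;

  l = function_that_returns_5 ();
  breakpoint0 ();
  asm ("" : : "X" (l)); /* ensure it's live */

  l = j + i * 10;

  if (i < j) {
    k = i * 10;
    breakpoint1();
  } else {
    breakpoint2();
  }

  breakpoint3();

  return l + k;
}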

Set a breakpoint in each of the breakpoint functions (breakpoint0
through breakpoint3) and, when you reach them, go up one frame and
print k and l.

With your design, once optimizations completely drop the second
assignment to l and leave nothing behind to mark the death of the
previous value, debug information will likely indicate that variable l
still holds value 5 through to the end of the function, unless the
location holding it is reused for some other purpose, which is the
only case in which current var-tracking would realize l is no longer
available there.
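
A rough sketch in C of what this means for a debugger, assuming a
location list is just a set of [begin, end) address ranges, each naming
where the variable lives.  The addresses, the "reg0" location and all
identifiers below are invented for illustration; no real compiler
output is being quoted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of a simplified location list: the variable lives in
   WHERE for every pc in [begin, end).  */
struct loc_range { unsigned long begin, end; const char *where; };

/* Return where the variable lives at PC, or NULL if no range covers
   PC, which a debugger would report as "optimized out".  */
static const char *
find_location (const struct loc_range *list, size_t n, unsigned long pc)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (pc >= list[i].begin && pc < list[i].end)
      return list[i].where;
  return NULL;
}

/* Made-up addresses inside foo.  */
enum {
  PC_L_ASSIGNED  = 0x08,  /* after l = function_that_returns_5 () */
  PC_BREAKPOINT0 = 0x10,  /* return address of breakpoint0 () */
  PC_L_DIES      = 0x18,  /* where the value 5 stops being l */
  PC_BREAKPOINT3 = 0x40,  /* return address of breakpoint3 () */
  PC_FUNC_END    = 0x50
};

/* Nothing marks the death of l: the range runs to the function end.  */
static const struct loc_range l_without_death_note[] = {
  { PC_L_ASSIGNED, PC_FUNC_END, "reg0" }
};

/* The death of l is marked: the range is closed at that point.  */
static const struct loc_range l_with_death_note[] = {
  { PC_L_ASSIGNED, PC_L_DIES, "reg0" }
};
```

With the first list, a lookup at PC_BREAKPOINT3 still answers "reg0",
so printing l there would show the stale 5.  With the second list the
lookup returns NULL, and the debugger can say the value is unavailable
instead of printing a wrong one.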

> So, the thing you have as RHS of # DEBUG name = is an arbitrary expression?

Yup.  Pretty much anything.  Whether it's representable in dwarf or
not is something for the debug info back end to figure out.  If it's
not, it's perfectly legitimate for it to say "I can't represent the
location of this variable".

>> # DEBUG l = j + T.1;
>> # DEBUG k = j + i;

> Ok, so I see GIMPLE expressions.  Are they required to be gimple?

Nope.  Anything, really.  Requiring them to be gimple would limit the
ability to express complex value computations, necessary after various
optimizations, without any actual benefit.

>> # DEBUG k = k;

> What's this?  Why does it say k = k?

See above.  It means the implementation gimple register k now holds
the value of the user variable k.

> It's just that "k = k" looks like redundant information.

That's just because they appear to be the same k.  Maybe using '=' was
misleading.  Would it help if I changed it to '=>' or 'is at' or some
such?

> Of course k is the same as k(?)

Once you start optimizing the program, any expectations that gimple
registers match user variables with any precision are just feeble
hopes :-)

That's the whole point of the debug stmts: to provide strong
statements about the value a user variable is expected to hold at that
point in the program, such that we can mess however much we want with
implementation variables, without any fear of hampering debuggability.
Unless we mess with the debug notes themselves, that is.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

