
Re: Designs for better debug info in GCC


On Nov  7, 2007, Ian Lance Taylor <iant@google.com> wrote:

> Alexandre Oliva <aoliva@redhat.com> writes:
>> I've pondered both alternatives, and decided that the latter was the
>> only testable path.  If we had a reliable debug information tester, we
>> could proceed incrementally with the first alternative; it might be
>> viable, but I don't really see that it would make things any simpler.

> It seems to me that this is a reason to write a reliable debug
> information tester.

Yep.  It's on the roadmap.  But it's not something that can be done
with GCC alone.  It's more of a "system" test, one that will involve
debuggers or monitoring tools: gdb, frysk, systemtap or some such
come to mind.

> Your approach gives you a point solution--did anything change
> today--but it doesn't give us a maintenance solution--did anything
> change over time?

Actually, no, that assessment is incorrect.  What I'm providing gives
us the means to verify, at any point in time, that enabling debug
information doesn't change the generated code.  So far, code in the
trunk only performs these comparisons within the GCC directory; even
so, patches that correct obvious divergences have been lingering for
months.

I have recently posted patches that extend this testing to other host
and target libraries.  I still haven't written testsuite code that
would let us verify that debug information doesn't affect the
generated code for existing tests, or for additional tests introduced
for this very purpose, but that, too, is on the roadmap.

Of course, none of this guarantees that debug information is accurate
or complete; it just helps ensure that -g won't change code
generation.
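
To make that concrete, here's a rough sketch of the kind of check I
mean.  It's only an illustration -- the file names are made up, and
GCC's own comparison machinery works at a different level -- but the
principle is the same: build twice, strip the debug sections, and
insist that the remaining bytes match.

  # Illustrative only: compile a unit with and without -g, strip the
  # debug sections, and check that the executable code is unchanged.
  # Assumes gcc and objcopy are on $PATH.
  import filecmp
  import subprocess
  import sys

  def build(src, obj, extra_flags):
      subprocess.check_call(["gcc", "-O2", "-c", src, "-o", obj]
                            + extra_flags)

  def strip_debug(obj, stripped):
      # --strip-debug drops the .debug_* sections and debugging
      # symbols, leaving only code, data and the regular symtab.
      subprocess.check_call(["objcopy", "--strip-debug", obj, stripped])

  def code_unchanged_by_g(src):
      build(src, "plain.o", [])
      build(src, "debug.o", ["-g"])
      strip_debug("plain.o", "plain-stripped.o")
      strip_debug("debug.o", "debug-stripped.o")
      return filecmp.cmp("plain-stripped.o", "debug-stripped.o",
                         shallow=False)

  if __name__ == "__main__":
      sys.exit(0 if code_unchanged_by_g(sys.argv[1]) else 1)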

Testing more than this requires a tool that can interpret not only
the debug information but also the generated code, and verify that
the two match.  The plan is to use the actual processors (or
simulators) to understand the generated code, and existing debug info
consumers -- debugging or monitoring tools -- to verify that the
debug info reflects the behavior the processor observes.
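
As a toy illustration of the consumer side, here's the sort of
harness I have in mind, driving gdb in batch mode and comparing what
it reports for a user variable against the value the source program
must hold at that point.  The binary, breakpoint and expected value
below are all made up for the example.

  # Illustrative only: ask gdb what it thinks a user variable holds
  # at a given source line of an optimized binary, and compare that
  # with the value the program is known to compute there.
  import subprocess

  def gdb_print(binary, location, expr):
      # -batch and -ex are standard gdb options: run the given
      # commands, then exit.
      out = subprocess.check_output(
          ["gdb", "-batch",
           "-ex", "break " + location,
           "-ex", "run",
           "-ex", "print " + expr,
           binary])
      # gdb prints values as "$1 = <value>".
      for line in out.decode().splitlines():
          if line.startswith("$1 = "):
              return line[len("$1 = "):]
      return None

  # Hypothetical check: at foo.c:42 the source says x must be 7.
  assert gdb_print("./a.out", "foo.c:42", "x") == "7"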

> While I understand that you were given certain requirements, for the
> purposes of mainline gcc we need to weigh costs and benefits.  How
> many of our users are looking for precise debugging of optimized code,
> and how much are they willing to pay for that?  Will our users overall
> be better served by the 90% solution?

Does it really matter?  Do we compromise standards compliance (and so
violently, at that) in any other aspect of the compiler?

What do we tell the growing number of users who regard debug
information as more than an aid for occasional debugging?  That GCC
cares about standards compliance except for debug information, and
that they should write their own Free Software compiler if they want
a correct, standards-compliant one?


Do we accept shortcuts in optimizations or other code-generation
areas when they cause incorrect code to be produced?  Why should the
mantra "must not sacrifice correctness" not apply to the debug
information standards GCC implements?


At this point, debug information is so bad that it's a shame most
builds are done with -O2 -g: we're just wasting CPU cycles and disk
space, helping accelerate the thermodynamic end of the universe
(never mind the Kyoto protocol ;-), for information that is severely
incomplete at best and terribly broken at worst.


Yes, generating correct code may take some more memory and some more
CPU cycles.  Have we ever made a decision to use less memory or CPU
cycles when the result is incorrect code?  Why should standardized
meta-information about the generated code be any different?

>> 1. every single gimple assignment grows by one word,

I take this back; I'd been misled by richi's description.  The
annotations actually live in a side hashtable (which gets me worried
about the gimple assignments that some locations re-emit rather than
modify in place), so no memory is wasted on gimple assignments that
don't refer to user variables.

Unfortunately, the same is not true of rtx SETs in this alternate
approach.
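
To illustrate why re-emission worries me, here's a toy model of such
a side table.  This is not GCC code, and the names are made up.  A
statement modified in place keeps its annotation, but a statement
that is re-emitted is a new object, so its annotation is silently
left behind:

  # Toy model, not GCC code: annotations live beside the statements,
  # keyed by statement identity, so unannotated statements cost
  # nothing extra.
  class Stmt:
      def __init__(self, lhs, rhs):
          self.lhs = lhs
          self.rhs = rhs

  debug_bind = {}           # id(stmt) -> user variable it describes

  s = Stmt("tmp1", "x + 1")
  debug_bind[id(s)] = "x"   # side-table entry, no field in Stmt itself

  s.rhs = "x << 1"          # modified in place: annotation survives
  assert id(s) in debug_bind

  s2 = Stmt(s.lhs, s.rhs)   # re-emitted: new object, new identity
  assert id(s2) not in debug_bind   # the annotation is silently lost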

> I don't know what the best approach is for improving debug
> information.

Your phrasing seems to indicate you're not concerned about fixing
debug information, but rather only about making it less broken.  With
different goals, we can come to very different solutions.

> But I think we've learned over time that explicit NOTEs
> in the RTL was not, in general, a good idea.  They complicate
> optimizations and they tend to get left behind when moving code.

Being left behind is actually a feature, and it's one of the reasons
I chose this representation.  The debug annotation is not supposed to
move along with the SET: if it did, it would no longer model the
source code; it would instead be mangled, often beyond recognition,
by implementation details.
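
Here's a toy model of what I mean, with made-up names that look
nothing like the actual RTL: when an optimizer moves the real SET,
the annotation stays at the point in the stream that corresponds to
the source-level assignment.

  # Toy model: instructions are (kind, dest, source) triples.  The
  # debug annotation records where, in source terms, 'x' receives
  # the value held in r1.
  insns = [
      ("set",   "r1",  "a + b"),
      ("debug", "x",   "r1"),     # x's value at this source point
      ("use",   "out", "r1"),
  ]

  def hoist_set(insns, reg):
      # A scheduler-like pass moves the SET of 'reg' to the top of
      # the stream but deliberately leaves the annotation behind.
      moved = [i for i in insns if i[0] == "set" and i[1] == reg]
      rest = [i for i in insns if not (i[0] == "set" and i[1] == reg)]
      return moved + rest

  # The ("debug", "x", "r1") entry keeps its original position, so
  # "x" is still reported to change where the source says it does.
  assert hoist_set(insns, "r1")[1] == ("debug", "x", "r1")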

As for complicating optimizations, I have some sympathy for that.
Sure, generating code without preserving the information needed to
map source-level concepts to implementation-level concepts is easier.
But generating broken code is not an option, it's a bug; why should
it become acceptable just because the code in question is
meta-information about the executable code?

> We've fixed many many bugs and misoptimizations over the years due to
> NOTEs.  I'm concerned that adding DEBUG_INSN in RTL repeats a mistake
> we've made in the past.

That's a valid concern.  By this reasoning, though, we might as well
push every operand in our IL out to separate representations, because
there have been so many bugs and misoptimizations over the years,
especially where the representation didn't make transformations
trivially correct.

However, the beauty of the representation I've chosen, which models
the annotations as a weak USE of an expression that evaluates to the
value of the variable at the point of assignment, is that most
compiler passes *will* keep them accurate, whereas any other
representation would have to be handled explicitly.  Sure, some
passes need to compensate to make sure these weak USEs don't affect
codegen or optimizations, and a few need special tweaks to keep the
notes accurate, i.e., to keep from triggering the safeguards that
would otherwise discard information that had become inaccurate.  But
these are few.  I strongly believe this is the right trade-off.
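
Continuing the toy model from above: when a pass deletes the SET that
a weak USE refers to, a safeguard substitutes the known value into
the annotation when it can, and resets it to "unknown" when it can't,
rather than letting the binding go stale.  Again, made-up names; the
real mechanics are in the patches.

  # Toy model, continued: delete dead SETs of 'reg', patching up the
  # weak USEs in debug annotations rather than letting them go stale.
  def delete_set(insns, reg):
      out = []
      replacement = None
      for kind, dst, src in insns:
          if kind == "set" and dst == reg:
              replacement = src   # remember what 'reg' would have held
              continue            # the real instruction is deleted
          if kind == "debug" and src == reg:
              # Safeguard: substitute if we know the value (a real
              # compiler would also check that its operands are still
              # live); otherwise admit the value is unknown.
              out.append((kind, dst, replacement or "<value unknown>"))
          else:
              out.append((kind, dst, src))
      return out

  insns = [("set", "r1", "a + b"), ("debug", "x", "r1")]
  assert delete_set(insns, "r1") == [("debug", "x", "a + b")]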

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

