This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Code Generation


Colin McCormack <colin@field.medicine.adelaide.edu.au> writes:

> From where I stand, as a compiler user, a compiler's a big, complicated,
> sensitive, temperamental program.  If I can arrange for a simple
> post-process to do a job, I see no reason to make the compiler any
> bigger or more complex, to make others pay for the facility even though
> they may never need it, or to saddle others with the task of supporting
> the more complex code.

But in most cases, finding the proper places to annotate in the
assembly code would be the hard part, and that has to be done (and
maintained) in either case; emitting the right extra instructions is
usually relatively easy and often machine-independent.

Doing it through assembler macros means not only hacking the compiler
to indicate what gets instrumented, but also writing assembly code for
each processor configuration used, and hacking the build process to
use the right assembly file add-on.

> Decoupling what can be decoupled seems to make sense to me for this
> reason.  The minimally intrusive interface I can envisage is an
> assembler macro.  An assembler macro can be distributed as an add-on,
> it's small, it can be conditionally turned into an identity
> transformation, it can capture all the information the compiler has, and
> selectively use it.

"All the information"?  That will *not* be small.  Consider some of
the information you *might* want for *some* post-processor: user line
numbers; conditional branch or switch locations and destinations at
assembly level; conditional branch or switch locations and
destinations in HLL, including information necessary to distinguish
multiple paths that have been combined by the optimizer; memory reads,
writes, modifies, and block moves, possibly with operand types; basic
block boundaries; HLL block boundaries; variable types, sizes, and
locations (which may change in the course of the function); function
call sites with argument descriptions; incoming argument descriptions;
HLL types of arithmetic operands (e.g., for "see how many complex adds
we do"); beginning and end of code initializing an object.  Also
seemingly random HLL information like C++ vtbl pointer setting during
object initialization -- I don't think there's an equivalent for that
in any of the other languages in the distribution, unless Java has
something similar.  And most of that list is based on things we
already instrument under control of one option or another; for random
post-processing, I'm sure there'd be more.

I would guess "all the information" we might want is going to be
comparable in size to the actual instruction stream plus debugging
info, if not bigger.  If you're willing to analyze the instruction
stream and the debug info themselves, that gets rid of a lot of it,
but still leaves at least (a) HLL info like C++ constructors and
specifically vptr initialization, and types of non-user objects like
intermediate results, and (b) anything the user might want
instrumented but the optimizer has already discarded, which is still
too vague and probably rather large.

I think it's more pragmatic to have the compiler simply know about the
various sorts of instrumentation we want to do, and support them
directly.  If we find one construct tends to get instrumented in lots
of different ways, then perhaps that one should use some user-supplied
parameter to dictate the behavior.  Maybe by naming the assembly macro
to use or giving an option that translates to a file with additional
asm code to include to define the macros as desired; maybe by
describing the rtl or tree structures to generate (and optimize) for
the instrumentation; maybe something else.
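
To make the "user-supplied parameter" idea concrete, here is a rough
source-level sketch in C.  It is not how GCC does anything; the names
INSTRUMENT, INSTRUMENT_HOOK, NO_INSTRUMENT, count_site and clamp are
all made up for illustration, and the instrumentation points are
placed by hand where a compiler would be expected to place them
itself.  The hook is chosen at build time, and can be turned into a
no-op:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: number of instrumentation sites in this file. */
#define N_SITES 3

/* Per-site execution counters; static storage starts at zero. */
static unsigned long site_counts[N_SITES];

/* One possible behavior: count how often each site is reached. */
static void count_site(size_t site)
{
    assert(site < N_SITES);
    site_counts[site]++;
}

/*
 * The "user-supplied parameter" (an assumption of this sketch):
 *   -DNO_INSTRUMENT          makes every site a no-op;
 *   -DINSTRUMENT_HOOK=name   routes every site to a different function;
 *   neither                  falls back to count_site.
 */
#if defined(NO_INSTRUMENT)
#define INSTRUMENT(site) ((void)0)
#else
#ifndef INSTRUMENT_HOOK
#define INSTRUMENT_HOOK count_site
#endif
#define INSTRUMENT(site) INSTRUMENT_HOOK(site)
#endif

/* A function with three hand-marked sites: entry and two branches. */
static int clamp(int x, int lo, int hi)
{
    INSTRUMENT(0);              /* function entry */
    if (x < lo) {
        INSTRUMENT(1);          /* branch taken: below range */
        return lo;
    }
    if (x > hi) {
        INSTRUMENT(2);          /* branch taken: above range */
        return hi;
    }
    return x;
}
```

The point of the sketch is only that the hard part, deciding where
the INSTRUMENT sites go, is the same whichever hook is selected; the
hook itself stays small and machine-independent.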

> None of the applications I suggested are performance-critical.  Dumping
> the whole register set and reloading it wouldn't (to me) be an
> unacceptable overhead while running a Checker-like check of my code.

Maybe not for you, at the moment, but perhaps for someone else.  Using
Checker with a heavily used network server daemon program could slow
it down enough to cause timeouts, rendering it useless.  I've worked
on some daemons that run full-out and only barely keep up with their
traffic; a factor-of-2 (or 20) slowdown would be completely
unacceptable.

