This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Target FUNCTION_{PRO,EPI}LOGUE question


Joseph S. Myers wrote:-


> > OK.  But why not represent all of type + functions by just a vtable
> > pointer?  It cuts out a lot of if statements, if nothing else.
> 
> Could you explain in more detail what you're proposing, for both the 
> language-independent and the language-dependent parts of GCC?

Not really, it's just in my head :-) It's a general thing, not just
for integers.  Currently our "tree" shares a common part, and then
various data depending upon what it is.  Different tree types contain
a lot of information that they don't use, simply because it's
convenient that everything share a common structure.  This has the
problem that many things are misnamed.  For example, from tree.h:-

   public_flag:

       TREE_OVERFLOW in
           INTEGER_CST, REAL_CST, COMPLEX_CST
       TREE_PUBLIC in
           VAR_DECL or FUNCTION_DECL or IDENTIFIER_NODE
       TREE_VIA_PUBLIC in
           TREE_LIST or TREE_VEC
       EXPR_WFL_EMIT_LINE_NOTE in
           EXPR_WITH_FILE_LOCATION

I think it would be nice if this "public" flag were named "overflow"
for INTEGER_CST, REAL_CST and COMPLEX_CST, and named "public" for
VAR_DECL etc. etc.  In other words, named what it really is.

Why isn't it at present?  Because they all share the same base
structure.  It's also a waste of space - as Zack has pointed out many
times, IDENTIFIER_NODEs care for almost nothing in their tree_common
part, which bloats them and quite possibly pushes them into the next
GC allocation size.

Another consequence is we get enormous switch statements switching on
a tree's TREE_CODE or TREE_CLASS.  I think most of those switch
statements should be replaced by virtual function calls.

So I envisage a different way of doing things.  Whether it's an
improvement or not, or even possible, is debatable, but I think it
might be in some form.  If our "tree" became more like a C++ class
hierarchy, with different types deriving from each other, then each
type could carry around just the baggage it needs.

Each tree type would have a vtable pointer, possibly some flags
(though not necessarily the same ones - i.e. not in the base class),
and maybe a byte-sized enum representing "type" (though this is
somewhat redundant since we have a vtable ptr).

One immediate and large beneficiary of this would be (tree) garbage
collection.  This is currently quite inefficient - we push trees onto
a hand-coded stack, and then pop them off later to garbage collect
their children, recursively.  We also do a lot of redundant NULL_TREE
checking, since we can't in general know whether something's a
NULL_TREE or not: see ggc_test_and_set_mark().  Each child is garbage
collected through a switch statement on the TREE_CODE, with various
exceptional cases.

Wouldn't something like something like this be much nicer and more
efficient:-

   for (each root in all_ggc_roots)
	root->vtable->gc_collect ();

where gc_collect () is a common function across all trees (i.e. in C++
terms a pure virtual function of the abstract base class).  These
functions would then call gc_collect () on all their children, etc.,
recursively.

Since each function is type-specific, it would know whether various
children of that type can potentially be NULL_TREEs or not.  And we
don't need an inefficent hand-coded stack - we're implicitly using the
processor's own.  It's clean and simple.

More importantly, it's extensible - each time we add a new type of
tree, we don't need to update code in ggc-common.c.  At present if we
add any kind of new tree, we have to check code literally everywhere
for in-built assumptions about what values of TREE_CODE are possible,
what the default: case does in the switch statement, etc.  It's a
nightmare.

Anyway, the above is just a rough idea, a lot more would need to be
fleshed out.

> The implicit casts would continue to be provided by the front ends.  
> You'd need to make sure that same-size implied same representation -
> including, say, that a constant 0 long long was represented in the same
> way as a larger constant long long, when adding long longs (even if a
> smaller representation might otherwise be prefered for a constant 0).

The way I see it is that's impossible - if you're adding two things
they're the same type (or representation at least).  I know we don't
necessarily do things this way now.  I haven't thought about it in
great detail, either.

> It would also be embedded in RTL.  It would also be embedded everywhere
> else in the compiler that uses a HOST_WIDE_INT variable or an array of 2
> of them to represent a target integer.  Everywhere that uses target
> integers temporarily, without needing to construct a tree around them,
> might need to do allocation rather than just declaring a variable.

Right.  I'd not thought about what happens below tree level.

Neil.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]