This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: egcs/gcc patch to speed up compiles of some Fortran code


>In this case,
>
>  * I would imagine that normally the lists _are_ quite short,
>    as evidenced by the fact we've not noticed the combinatorial
>    explosion before.

I don't know how short the lists are -- that doesn't necessarily
depend on whether combinatorial explosion is present.  Combinatorial
explosion requires a sufficiently long list of distinct SAVE_EXPR
nodes to happen (say, 100 or more), but it doesn't necessarily
happen for *any* expression that has that number of SAVE_EXPR nodes.

It is probably the case, though, that, for C code (maybe some other
languages as well), the lists are short, if not empty, because
the gcc front end rarely uses SAVE_EXPRs (last time I checked) while
building expression trees.

>  * We do not need to randomly index the list.  In fact, we only
>    only touch each entry twice -- once to create it, once to 
>    destroy it.

Right.  The creation behavior reads each node, and, for each SAVE_EXPR,
overwrites the code of each node and writes an element of a static
array.  The destruction behavior reads each element of that static
array and overwrites (after reading to verify, though) the code of
each node.

The order of the destruction behavior is unimportant.  It is
currently FIFO.  LIFO would be just fine, and maybe using the
TREE_CHAIN field and a "last" pointer instead of the array
approach (naturally leading to LIFO behavior) would be slightly
better for the cache.

If consensus is reached on this, let me know, I'll make some time
to redo the patch and submit it.  It might even turn out to be
"safe" to assume TREE_CHAIN of a SAVE_EXPR is already 0, in which
case overwriting the code can be avoided.  It'll require adding
commentary to tree.def to "reserve" TREE_CHAIN for this use (and
other leaf-like uses), however.

What worries me about using TREE_CHAIN is that someone, somewhere,
uses it for SAVE_EXPR but failed to add the appropriate commentary
to tree.def.

Though, looking at tree.h, it seems that TREE_CHAIN *used* to be used
to chain expression nodes together to form the list of imperative
statements for, e.g., a C excerpt like "a = b + c; foo(); ++i;".  In
that case, while gcc might not generate any SAVE_EXPR nodes wrapping
around an entire expression in such a statement list, other languages
might well do that, and might chain such expressions together via
TREE_CHAIN.

If "we" decide this scenario is unlikely, but we want to protect
against it, we can put an appropriate call to abort() in
expand_expr_stmt.  But that certainly wouldn't catch all possible
cases of "dirty" TREE_CHAIN fields in SAVE_EXPR nodes resulting
from a particular front end's "stealing" of those fields for its
own uses.  (g77 doesn't do this, but is one of the most "vanilla"
front ends to gcc, AFAIK.)


Maybe all the above is possible and reasonable, but perhaps now y'all
can see the advantages of my current approach: it's guaranteed to
work, pretty much no matter what other undocumented stuff might
be going on in the compiler.  The only noticeable disadvantage is
that someone debugging the code and trying to print a tree while
the compiler is in the midst of a call to safe_from_p might see
ERROR_MARK where he should see SAVE_EXPR.  A similar problem happens
if we use the guaranteed-to-be-zero third operand (RTX) instead of
the code field, I would think.  Only maintaining *and searching* a
completely separate array of nodes found so far, thus not having to
rewrite the nodes themselves at all, avoids these problems, at the cost
of additional time to search that new array for each SAVE_EXPR to see
if we've already seen it -- on yesterday's and today's RISC machines,
at least, this would likely be only a bit more expensive compared to
the current approach.  So that's a possibility as well, probably the
"cleanest", though likely not the fastest, one I can think of.

In summary, I don't know the ideal answer.  I don't feel my current
approach is worse, overall, than any other approach, and I believe
it is "guaranteed" to not collide with other "typical" undocumented
goings-on in a compiler project this big.  But I'm willing to
rewrite it using any approach people reasonably agree on, as long
as they've taken into account the relevant risks.

The incentive, of course, is that the sooner *some* version of this
patch goes in, the faster the test suite runs, now that a "slow"
test has been added to gcc/testsuite/g77.f-torture/compile.  :)

        tq vm, (burley)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]