This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Updated patch (was Re: [PATCH 11/11] Make opt_pass and gcc::pipeline be GC-managed)
- From: David Malcolm <dmalcolm at redhat dot com>
- To: Richard Henderson <rth at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Mon, 05 Aug 2013 13:28:32 -0400
- Subject: Re: Updated patch (was Re: [PATCH 11/11] Make opt_pass and gcc::pipeline be GC-managed)
- References: <1374851081-32153-1-git-send-email-dmalcolm at redhat dot com> <1374851081-32153-12-git-send-email-dmalcolm at redhat dot com> <51FAD6FC dot 5060309 at redhat dot com> <1375470484 dot 4994 dot 69 dot camel at surprise> <51FC0FFD dot 8030901 at redhat dot com> <1375490907 dot 4994 dot 87 dot camel at surprise> <51FD4E6B dot 3090309 at redhat dot com> <1375715902 dot 4994 dot 129 dot camel at surprise> <51FFD9E4 dot 5040704 at redhat dot com>
On Mon, 2013-08-05 at 06:59 -1000, Richard Henderson wrote:
> On 08/05/2013 05:18 AM, David Malcolm wrote:
> > So I *think* the most efficient traversal is to do this first (with a
> > suitable comment):
> >
> > for (int i = passes_by_id_size ; i > 0; )
> > ::gt_ggc_mx (passes_by_id[--i]);
> >
> > That ought to visit all of the passes without triggering recursion
> > (unless someone does something weird to the pass structure in a plugin).
>
> Reasonable.
>
> > I'm nervous about omitting the traversal for other pointers the class
> > has (though I *think* the passes by id array captures it all) so for
> > completeness I'd prefer to then to also do it for all of:
> >
> > ::gt_ggc_mx (all_passes);
> > ::gt_ggc_mx (all_small_ipa_passes);
> > ::gt_ggc_mx (all_lowering_passes);
> > ::gt_ggc_mx (all_regular_ipa_passes);
> > ::gt_ggc_mx (all_lto_gen_passes);
> > ::gt_ggc_mx (all_late_ipa_passes);
> >
> > #define INSERT_PASSES_AFTER(PASS)
> > #define PUSH_INSERT_PASSES_WITHIN(PASS)
> > #define POP_INSERT_PASSES()
> > #define NEXT_PASS(PASS, NUM) ::gt_ggc_mx (PASS ## _ ## NUM);
> > #define TERMINATE_PASS_LIST()
> >
> > #include "pass-instances.def"
> >
> > #undef INSERT_PASSES_AFTER
> > #undef PUSH_INSERT_PASSES_WITHIN
> > #undef POP_INSERT_PASSES
> > #undef NEXT_PASS
> > #undef TERMINATE_PASS_LIST
> >
> > Is it standard in handwritten GC code to omit traversals based on this
> > higher-level knowledge? Presumably it warrants a source comment?
> > (having spent too much time lately debugging PCH issues I'm nervous
> > about trying to optimize this).
>
> It's something that we should think about when doing any kind of GC.
> The deep recursion has bitten us before, and lead to the chain_next
> and chain_prev annotations for GTY.
Note to self: the chain_next and chain_prev options are documented at:
http://gcc.gnu.org/onlinedocs/gccint/GTY-Options.html
BTW, this issue seems very reminiscent to me of an implementation detail
within the CPython interpreter called the "trashcan". CPython uses
reference-counting, and if you have a deep chain of references e.g.:
# make a very deep chain of list-of-lists:
foo = []
for i in range(depth):
foo = [foo]
# at this point we have
# foo = [[[[[[[[[[[[[[[[[[[[[[[[[[[[ \
# ]]]]]]]]]]]]]]]]]]]]]]]]]]]] for some depth
# try to overflow the stack during deallocation:
del foo
...you would have a deeply nested chain of decrefs, each triggering
deleting of the object, and hence a potential stack overflow.
The way CPython avoids such deep reference chains from killing the
process with a stack overflow is a pair of macros used in the
deallocator for container types, which track the depth of the traversal,
and when it reaches a certain threshold, start appending objects to a
singly-linked list of deferred deallocations (using a spare pointer in
the objects that's safe to use during finalization). Hence the
recursion becomes an iteration when a certain stack depth is reached,
and diabolical reference graphs can't blow the stack (modulo bugs in
3rd-party container code, of course).
The equivalent for GC would be for mx routines to change from:
if (ggc_test_and_set_mark (p))
gt_ggc_mx (p);
to:
if (ggc_test_and_set_mark (p))
add_to_worklist (p, marking_cb);
I suspect that this technique can't be used in GGC since it would
require (a) having a spare pointer per-object and (b) tracking the type
of the marking callback, which would be a jump through a function ptr
where there wasn't one before, and thus unacceptable in the general
case.
> > AIUI we could do similar optimizations in the PCH object-noting hook,
> > but can't do similar optimizations in the PCH *op* hook, since every
> > field needs to reconstructed when reading the data back from disk.
>
> Correct.
I'll update the patch with the optimizations you suggest (and the
reverse-order marking I mentioned above).
Thanks
Dave