This is the mail archive of the
mailing list for the GCC project.
Re: [patch][RFC] bitmaps as lists *or* trees
On Tue, Mar 5, 2013 at 5:16 PM, Steven Bosscher <firstname.lastname@example.org> wrote:
> On Tue, Mar 5, 2013 at 3:48 PM, Michael Matz wrote:
>> Iteration isn't easy on trees without a pointer to the parent (i.e.
>> enlarging each node), you need to remember variably sized context in the
>> iterator (e.g. the current stack of nodes).
> I was thinking of just making the "tree" a single forward-linked list,
> that's a valid splay tree and one that is compatible with the
> iterators, at least as long as the iterated bitmap are not modified
> (but I think that isn't allowed anyway?). Obviously the tree is not
> very well balanced after that, but splay trees have a way of
> re-balancing themselves quickly. The other possibility is to create
> vecs of bitmap_elements up front, but such iterators (fat iterators,
> as Richi just now called them in another mail :-) have the downside
> that you must always visit all bitmap elements, even if you want to be
> able to break the iteration before reaching the end.
>> I do like the idea of reusing the same internal data structure to
>> implement the tree. And I'm wondering about performance impact, I
>> wouldn't be surprised either way (i.e. that it brings about a large
>> improvement, or none at all), most bitmap membership tests in GCC are
>> surprisingly clustered so that the bitmaps cache of last accessed
>> element can work its magic (not all of them, as the testcase shows of
>> course :) ).
> I've retained the cached last accessed element. In fact it's now
> cached twice, because the root of the splay tree is always the last
> accessed element. I've considered *not* updating bitmap->current if
> bitmap_tree_find_element doesn't find the element it's looking for.
> That way, the last accessed element would be the element that was
> *really* last accessed, i.e. with a valid membership test, instead of
> the element closest to the last tested bit. Not sure if that'd be a
> good idea.
> Anyway, updated patch attached. It compiles all my cc1-i files, and
> it compiles the PR55135 test case, in 410s (I terminated cc1plus
> unpatched after >7200s, more than 2 hours...).
> An interesting question is, how can you identify bitmaps that could
> benefit from viewing it as a tree (at least for some time). I have no
> ideas here. Something with detailed memory stats, of course, but then
> how to derive some measure for a trade-off between list or tree view.
That's indeed a good question. The mem-stats help to print a useful
location for where the bitmap was allocated. But whether to switch it
or not is more fine-grained - any iteration / bitwise op should "reset"
the stats we collect. For the rest I'd simply count # of elements
walked for the find vs. # of splays done (where of course a splay is
more expensive than a linked list element walk). That "vs." of course
doesn't work because we don't have both views at the same time,
but counting the list walks vs. the number of searches would be a start
(doesn't help deciding whether a switch to a tree was bad and should
be undone, of course ... we'd have to compute the "distance" in
bitmap_elts between successive searches).
What about simply providing a bitmap_test_set_only () with no
way to switch back and let the tree balance itself? Does the
splay () overhead matter?