This is the mail archive of the
mailing list for the GCC project.
Re: [patch] use __builtin_ctzl in ggc-page
On Jan 17, 2005 01:38 PM, Richard Earnshaw <firstname.lastname@example.org> wrote:
> On Fri, 2005-01-14 at 18:32, Steven Bosscher wrote:
> > Hi,
> > The page allocator looks for free objects by keeping bitmaps of
> > taken slots in a bag. We look for a 0 bit with a loop right now,
> > but we could use a builtin instead, like Nathan did recently for
> > the revamped bitmap.c.
> > Bootstrapped on x86_64-suse-linux-gnu, OK?
> > Gr.
> > Steven
> > * ggc-page.c (ggc_alloc_stat): Use __builtin_ctzl instead of a
> > loop to look for a free slot in a page entry.
> I would expect this makes things worse for platforms that don't have the
> builtin, since we now end up with a library call overhead on top of the
> calculation cost.
Then we can implement an inline version of the builtin in
builtins.c. We use this particular builtin in bitmap.c too,
so maybe if your favorite target does not have the builtin,
adding it to builtins.c is a good idea anyway. Note that
this builtin can be implemented with a simple binary chop
instead of a loop, so it would give you more efficient code
too, even if you don't have a ctz instruction. (An example
of that binary chop is in bitmap.c too, btw.)
> I'm not convinced these micro-optimizations are such a good thing.
> Apart from anything else, they clutter the code unnecessarily
Well, I think we should use all the builtins we can where it
makes sense. It gives you testing of your own builtins and it
allows you to get the best code for each target. In this case
it gives a small (but measurable) speedup for ppc at -O0 for me.
> -- and a
> really smart compiler ought to be able to work this one out!
The compiler wouldn't have to be really smart, it would only
need to prove that the loop terminates, and then recognize the
idiom. If you look at the loop for this particular case, you
can easily prove that "~entry->in_use_p[word]" is non-null,
and since bit == 0 on entry to the "... >> bit" loop, the loop
must terminate. Then all you have to do is somehow recognize
this idiom as a ctz.
But I am quite sure that it will be hard to justify the cost
of recognizing this kind of uncommon idiom.