Bit twiddling builtins

Geoff Keating geoffk@geoffk.org
Tue Feb 4 00:12:00 GMT 2003


> Date: Mon, 3 Feb 2003 15:48:50 -0800
> From: Richard Henderson <rth@redhat.com>
> Cc: falk.hueffner@student.uni-tuebingen.de, gcc-patches@gcc.gnu.org

> On Mon, Feb 03, 2003 at 02:58:10PM -0800, Geoff Keating wrote:
> > I don't really care about the user program; I'm much more interested
> > in communicating with the middle-end of the compiler so it can apply
> > the above transformation itself and the user program doesn't have to know.
> 
> Ah.  Well in that case, I guess we could come up with something.
> I guess I agree that when we do constant folding we should wind
> up with something that matches what the target instruction would do.
> 
> I have no idea what you have in mind wrt ffs.  Since (x == 0) can
> be computed in parallel with ctz(x), I'm not sure why (ctz >> 5)
> is interesting at all.

This is an optimisation in itself; it has no particular relation to
ffs.  On ppc, the fastest way to implement

int zero_p (int x) {
  return x == 0;
}

is 
	cntlzw %r3,%r3
	srwi %r3,%r3,5
	blr

rather than the next best alternative using add-with-carry, which is
bad because anything touching the carry flag tends to synchronize
newer processors:

	li %r4,0
	subfic %r3,%r3,0
	addze %r3,%r4
	blr

or using branches (well, conditional return in this simple case):

	cmpwi cr0,%r3,0
	li %r3,1
	beqlr
	li %r3,0
	blr

(The last two vary in relative speed depending on the processor, but
the first is uniformly good.)
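
For anyone who wants to check the arithmetic behind the first two sequences, here is a rough C model. It is only a sketch: cntlzw_model, zero_p_clz and zero_p_carry are names made up for illustration, not anything in GCC, and the model assumes a 32-bit int. cntlzw returns 32 for a zero input, which is exactly the case that GCC's __builtin_clz leaves undefined, so the count is spelled out by hand.

```c
#include <stdint.h>

/* Hypothetical model of cntlzw: count the leading zero bits of a
   32-bit word, yielding 32 when the input is zero. */
unsigned cntlzw_model(uint32_t x)
{
    unsigned n = 0;
    while (n < 32 && !(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* cntlzw + srwi 5: the count lies in 0..32 and only reaches 32 for a
   zero input, so bit 5 of the count is the value of (x == 0). */
int zero_p_clz(int x)
{
    return (int)(cntlzw_model((uint32_t)x) >> 5);
}

/* subfic + addze: subfic forms 0 - x as ~x + 0 + 1 and records the
   carry out of bit 31, which is set only when x is zero; addze then
   adds that carry to a register holding 0. */
int zero_p_carry(int x)
{
    uint64_t sum = (uint64_t)(uint32_t)~(uint32_t)x + 1u;
    return (int)(sum >> 32);
}
```

Both functions should agree with (x == 0) for every input; the model says nothing about speed, which is the actual reason to prefer the cntlzw form.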

-- 
- Geoffrey Keating <geoffk@geoffk.org>


