This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFC: Simplify rules for ctz/clz patterns and RTL


I suppose you're using (assuming 32-bit)
    ctz(x) := 31 - clz(x & -x)
now, which gives -1 for 0; and the version you're looking for is
    ctz(x) := 32 - clz(~x & (x-1))
which gives 32 for 0.

Thanks! That's, unfortunately, one more instruction, although I guess a lot of chips have "a & ~b" as one operation.

Yes, it's exactly the same cost on PowerPC, and on most other RISC architectures.

It looks like ~x & (x-1) turns any number into 000...111... where the boundary between zeroes and ones lies at the lowest 1 in the original.

Exactly. "To the right of the lowest 1".


Is popcount really slow on PowerPC? (Compared to clz?) Ideally one would choose between the two expansions based on RTL costs, but the only architectures it matters for are i386 and powerpc, and neither of them define the cost of either clz or popcount.

Andrew answered this already. Adding clz/popcount to the cost tables seems like a good idea, yes.


Segher



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]