This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Number of 1's in 64 bit number...[don't read if not interestedin algos/math...]
- From: Daniel Berlin <dberlin at dberlin dot org>
- To: Wei Qin <wqin at EE dot Princeton dot EDU>
- Cc: kelley dot r dot cook at gm dot com, Preeti Aira <ms_preeti at hotmail dot com>,<gcc at gcc dot gnu dot org>, <gcc-help at gcc dot gnu dot org>
- Date: Thu, 25 Apr 2002 13:51:23 -0400 (EDT)
- Subject: Re: Number of 1's in 64 bit number...[don't read if not interestedin algos/math...]
On Thu, 25 Apr 2002, Wei Qin wrote:
>
> First, I see only 32 bits being counted. The top 32 bits seems missing.
> Second, this is a very slow implementation. A faster one can be
> implemented through table lookup. It takes no masking, no branching.
But unless your compiler does array based optimizations (which gcc
doesn't, some do, some do and do a pretty darn good job, some don't), a
table lookup will be slower than a parallelizable/vectorizable masking
solution like:
#define TWO(c) (0x1u << (c)))
#define MASK(c) (((unsigned long)(-1)) / (TWO(TWO(c)) + 1u))
#define COUNT(x, c) ((x) & MASK(c)) + (((x) >> (TWO(c))) & MASK(c))))
int bitcount (long x)
{
int i;
for (i = 0; i < 5; i++)
x = COUNT (x, i);
return x;
}
The macros are just computing the neat masks you normally see here.
Though gcc doesn't seem to turn them into constants (intel's compiler
does) if you do it in a loop. It does if you just write out
x = COUNT (x, 0);
x = COUNT (x, 1);
....
Add one iteration/adjust data types approriately for 64 bit
integers.
You should be able to calculate how fast table lookup is versus the above
by simply counting cycles.
Assume, best case, that the lookup table is in L1 cache.
See if the number of cycles required to compute the above is < (the
L1 cache latency) * 8 + the cycles required for the moves and adds.
> /*need to generate the table yourself*/
> unsigned char mytable[256] = {0, 1, 1, 2, 1, ...};
> int count (long long x) {
>
> int ret;
> union {
> long long val;
> unsigned char s[8];
> } a;
>
> ret = mytable[a.s[0]];
> ret += mytable[a.s[1]];
> ....
> ret += mytable[a.s[7]];
>
> return ret;
> }
>
>
> On Thu, 25 Apr 2002 kelley.r.cook@gm.com wrote:
>
> > int count (long long x) {
> >
> > int result;
> >
> > result =
> > (x & 0x00000001 ? 1 : 0) +
> > (x & 0x00000002 ? 1 : 0) +
> > (x & 0x00000004 ? 1 : 0) +
> > (x & 0x00000008 ? 1 : 0) +
> > (x & 0x00000010 ? 1 : 0) +
> > (x & 0x00000020 ? 1 : 0) +
> > (x & 0x00000040 ? 1 : 0) +
> > (x & 0x00000080 ? 1 : 0) +
> > (x & 0x00000100 ? 1 : 0) +
> > (x & 0x00000200 ? 1 : 0) +
> > (x & 0x00000400 ? 1 : 0) +
> > (x & 0x00000800 ? 1 : 0) +
> > (x & 0x00001000 ? 1 : 0) +
> > (x & 0x00002000 ? 1 : 0) +
> > (x & 0x00004000 ? 1 : 0) +
> > (x & 0x00008000 ? 1 : 0) +
> > (x & 0x00010000 ? 1 : 0) +
> > (x & 0x00020000 ? 1 : 0) +
> > (x & 0x00040000 ? 1 : 0) +
> > (x & 0x00080000 ? 1 : 0) +
> > (x & 0x00100000 ? 1 : 0) +
> > (x & 0x00200000 ? 1 : 0) +
> > (x & 0x00400000 ? 1 : 0) +
> > (x & 0x00800000 ? 1 : 0) +
> > (x & 0x01000000 ? 1 : 0) +
> > (x & 0x02000000 ? 1 : 0) +
> > (x & 0x04000000 ? 1 : 0) +
> > (x & 0x08000000 ? 1 : 0) +
> > (x & 0x10000000 ? 1 : 0) +
> > (x & 0x20000000 ? 1 : 0) +
> > (x & 0x40000000 ? 1 : 0) +
> > (x & 0x80000000 ? 1 : 0);
> > return result;
> > }
> >
> >
>