This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug c/36041] New: Speed up builtin_popcountll


The current __builtin_popcountll (and likely __builtin_popcount) are fairly
slow as compared to a simple, short C version derived from what can be found in
Knuth's recent publications.  The following short function is about 3x as fast
as the __builtin version, which runs counter to the idea that __builtin_XXX
provides access to implementations that are exemplars for a given platform.

unsigned int popcount64(unsigned long long x)
{
    x = (x & 0x5555555555555555ULL) + ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
    return (x * 0x0101010101010101ULL) >> 56;
}

This version has the additional benefit that it omits the lookup table that the
current "builtin" version uses.

I measured the above function vs. __builtin_popcountll with a loop like the
following:

    t1 = clock();
    for (j = 0; j < 1000000; j++)
        for (i = 0; i < 1024; i++)
            pt = popcount64(data[i]);
    t2 = clock();

    printf("popcount64 = %d clocks\n", t2 - t1);

...where data[] is a u64 that's preinitialized.

I'll attach the exact source I used, which also includes two other possible
implementations of popcountll.


-- 
           Summary: Speed up builtin_popcountll
           Product: gcc
           Version: 4.2.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: intvnut at gmail dot com
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36041


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]