This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug middle-end/36041] Speed up builtin_popcountll
- From: "jsalavert at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 05 Sep 2012 10:39:45 +0000
- Subject: [Bug middle-end/36041] Speed up builtin_popcountll
- Auto-submitted: auto-generated
- References: <bug-36041-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36041
José Salavert Torres <jsalavert at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jsalavert at gmail dot com
--- Comment #8 from José Salavert Torres <jsalavert at gmail dot com> 2012-09-05 10:39:45 UTC ---
Hello, has there been any progress on this issue? The approach from Knuth's
publication would also be great for 8-bit registers.
Also, allowing different behaviour for each architecture would be better.
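To illustrate what per-architecture behaviour could look like from the user side, here is a minimal sketch. It assumes GCC (or a compatible compiler) on x86, where the `__POPCNT__` macro is predefined when the popcnt instruction is enabled (for example with -mpopcnt). The names `popcount64_fallback`, `popcount64` and `popcount64_selfcheck` are hypothetical and are not part of GCC; the fallback is the bit-parallel sum discussed in this comment.

```c
#include <assert.h>
#include <stdint.h>

/* Portable fallback: bit-parallel sum over 2-, 4- and 8-bit fields. */
static inline unsigned int popcount64_fallback(uint64_t i) {
    i = i - ((i >> 1) & 0x5555555555555555ULL);
    i = (i & 0x3333333333333333ULL) + ((i >> 2) & 0x3333333333333333ULL);
    i = (i + (i >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    /* The multiply accumulates all byte counts into the top byte. */
    return (unsigned int)((i * 0x0101010101010101ULL) >> 56);
}

/* Hypothetical wrapper: pick an implementation at compile time. */
static inline unsigned int popcount64(uint64_t i) {
#if defined(__GNUC__) && defined(__POPCNT__)
    /* The target has popcnt enabled, so the builtin maps to one instruction. */
    return (unsigned int)__builtin_popcountll(i);
#else
    /* No hardware support assumed: use the portable version. */
    return popcount64_fallback(i);
#endif
}

/* Hypothetical self-check: both paths must agree on a few inputs. */
static void popcount64_selfcheck(void) {
    assert(popcount64(0) == popcount64_fallback(0));
    assert(popcount64(1) == popcount64_fallback(1));
    assert(popcount64(0x8000000000000001ULL) == 2);
    assert(popcount64(0xFFFFFFFFFFFFFFFFULL) == 64);
}
```

Selecting at compile time keeps the call site identical on every target; whether the compiler itself should make this choice is the question of this bug.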
On the forums, the implementation described here now looks like this, and it
seems to use fewer operations:
inline unsigned int bitcount32(uint32_t i) {
    // Parallel binary bit add
    i = i - ((i >> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
    return (((i + (i >> 4)) & 0xF0F0F0F) * 0x1010101) >> 24;
}

inline unsigned int bitcount64(uint64_t i) {
    // Parallel binary bit add
    i = i - ((i >> 1) & 0x5555555555555555);
    i = (i & 0x3333333333333333) + ((i >> 2) & 0x3333333333333333);
    return (((i + (i >> 4)) & 0xF0F0F0F0F0F0F0F) * 0x101010101010101) >> 56;
}
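
For anyone who wants to check these two routines before relying on them, a minimal self-contained sketch follows. It restates both functions with explicit unsigned suffixes and compares them against a naive one-bit-at-a-time loop. The helpers `bitcount_naive` and `check_bitcounts` and the sample values are my own additions for illustration, not part of the forum code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit parallel binary bit add. */
static inline unsigned int bitcount32(uint32_t i) {
    i = i - ((i >> 1) & 0x55555555u);                 /* 2-bit sums */
    i = (i & 0x33333333u) + ((i >> 2) & 0x33333333u); /* 4-bit sums */
    i = (i + (i >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums */
    return (unsigned int)((i * 0x01010101u) >> 24);   /* total in top byte */
}

/* 64-bit parallel binary bit add. */
static inline unsigned int bitcount64(uint64_t i) {
    i = i - ((i >> 1) & 0x5555555555555555ULL);
    i = (i & 0x3333333333333333ULL) + ((i >> 2) & 0x3333333333333333ULL);
    i = (i + (i >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (unsigned int)((i * 0x0101010101010101ULL) >> 56);
}

/* Hypothetical reference: count set bits one at a time. */
static unsigned int bitcount_naive(uint64_t i) {
    unsigned int n = 0;
    while (i != 0) {
        n += (unsigned int)(i & 1u);
        i >>= 1;
    }
    return n;
}

/* Hypothetical check: both routines must match the reference. */
static void check_bitcounts(void) {
    static const uint64_t samples[] = {
        0ULL, 1ULL, 0xFFULL, 0x80000000ULL, 0xF0F0F0F0ULL,
        0x123456789ABCDEF0ULL, 0x8000000000000001ULL,
        0xFFFFFFFFFFFFFFFFULL
    };
    for (size_t k = 0; k < sizeof samples / sizeof samples[0]; ++k) {
        uint32_t low = (uint32_t)samples[k]; /* low 32 bits only */
        assert(bitcount64(samples[k]) == bitcount_naive(samples[k]));
        assert(bitcount32(low) == bitcount_naive(low));
    }
}
```

Calling `check_bitcounts()` once at start-up is enough; it aborts through `assert` if either routine disagrees with the reference loop.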