This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug c/50168] __builtin_ctz() and intrinsics __bsr(), __bsf() generate suboptimal code on x86_64
- From: "gpiez at web dot de" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 23 Aug 2011 21:54:40 +0000
- Subject: [Bug c/50168] __builtin_ctz() and intrinsics __bsr(), __bsf() generate suboptimal code on x86_64
- Auto-submitted: auto-generated
- References: <bug-50168-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50168
--- Comment #3 from Gunther Piez <gpiez at web dot de> 2011-08-23 21:54:40 UTC ---
On 23.08.2011 19:58, jakub at gcc dot gnu.org wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50168
>
> Jakub Jelinek <jakub at gcc dot gnu.org> changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |uros at gcc dot gnu.org
>
> --- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-08-23 17:58:52 UTC ---
> Those aren't equivalent, unfortunately, because the bsf and bsr insns on x86
> leave the result undefined if the source is zero. While the __builtin_c[lt]z*
> documentation says that the result is undefined in that case, I wonder if it
> would be fine even if long l = (int) __builtin_c[lt]z* (x); gave a value that
> wasn't actually sign-extended to 64 bits.
> The combiner already simplifies zero or sign extension of popcount/parity/ffs
> and, if the ctz or clz value is defined at zero, of those too. But if it is
> undefined, the combiner assumes any bit may hold anything and thus can't
> optimize the sign/zero extension away. With -mbmi it will be optimized just
> fine, because the tzcnt insn (and lzcnt with -mlzcnt) is well defined even for
> a zero source operand.
>
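For illustration, the pattern discussed in the quoted comment can be sketched
roughly as below. This is only a minimal sketch, not code from the bug report:
the function names ctz_widened and ctz_widened_or_width are hypothetical, and
the second helper is just one possible way to give the zero case a defined
value. Whether the sign extension is actually emitted depends on the GCC
version and on flags such as -mbmi.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical name. The int result of __builtin_ctz is widened to long,
   which is the "long l = (int) __builtin_ctz (x);" pattern from the comment.
   On x86_64 this widening may cost a sign extension after bsf, because the
   compiler cannot assume anything about the result bits when x == 0. */
long ctz_widened(unsigned int x)
{
    assert(x != 0);  /* __builtin_ctz(0) is documented as undefined */
    return (int) __builtin_ctz(x);
}

/* Hypothetical helper. Handles zero explicitly, so every input has a defined
   result; the bit width of the type is returned for x == 0 (an assumption
   chosen here to mirror what tzcnt produces). */
long ctz_widened_or_width(unsigned int x)
{
    if (x == 0)
        return (long) (sizeof x * CHAR_BIT);
    return (int) __builtin_ctz(x);
}
```

For example, ctz_widened(8u) yields 3. Comparing the assembly produced by
gcc -O2 -S with and without -mbmi should show whether the extension remains.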