This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug c/50168] __builtin_ctz() and intrinsics __bsr(), __bsf() generate suboptimal code on x86_64


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50168

--- Comment #3 from Gunther Piez <gpiez at web dot de> 2011-08-23 21:54:40 UTC ---
On 23.08.2011 19:58, jakub at gcc dot gnu.org wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50168
>
> Jakub Jelinek <jakub at gcc dot gnu.org> changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |uros at gcc dot gnu.org
>
> --- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-08-23 17:58:52 UTC ---
> Those aren't equivalent unfortunately, because bsf and bsr insns on x86 have
> undefined value if the source is zero.  While __builtin_c[lt]z* documentation
> says that the result is undefined in that case, I wonder if it would be fine
> even if long l = (int) __builtin_c[lt]z* (x); gave a value that wasn't actually
> sign-extended to 64 bits.
> The combiner already simplifies zero or sign extension of popcount/parity/ffs
> and, if the ctz or clz value is defined at zero, also those; but if it is
> undefined, it has to assume that any bit may hold any value and thus can't
> optimize the sign/zero extension away.  With -mbmi it will be optimized just
> fine, because the tzcnt insn (and lzcnt with -mlzcnt) is well defined even for
> a zero source operand.
>
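
For readers following along, here is a minimal sketch of the pattern under
discussion, assuming GCC on x86_64. The function names ctz_widened and
lowest_set_bit are hypothetical, chosen only for illustration. The first
function widens the int result of __builtin_ctzl to long, which is where the
extra extension can appear without -mbmi; the second guards the zero case
explicitly, so the caller never depends on an undefined result.

```c
/* Hypothetical helper: widen the int result of __builtin_ctzl to long.
 * The builtin's result is undefined for x == 0, so the caller must
 * guarantee a nonzero argument. */
long ctz_widened(unsigned long x)
{
    return (long)__builtin_ctzl(x);
}

/* Hypothetical helper: the same operation with the zero case handled
 * explicitly.  Returns -1 when no bit is set. */
long lowest_set_bit(unsigned long x)
{
    if (x == 0)
        return -1;
    return (long)__builtin_ctzl(x);
}
```

Comparing the output of gcc -O2 -S with and without -mbmi on a file like this
is one way to see whether the extension is emitted on a given compiler
version.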

