[Bug c/108140] New: tzcnt gives different result in debug vs release

levo.delellis at gmail dot com gcc-bugzilla@gcc.gnu.org
Fri Dec 16 08:18:00 GMT 2022


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108140

            Bug ID: 108140
           Summary: tzcnt gives different result in debug vs release
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: levo.delellis at gmail dot com
  Target Milestone: ---

This might be more than one bug, and I have also gotten the compiler to crash.
Tested on macOS Ventura with an M2, but it may also happen on ARMv8 Linux. This
slightly different test also fails with -O2 on my Mac:
https://godbolt.org/z/xv883jMb9

The GCC docs say the result for 0 is undefined; I understand that:

> Built-in Function: int __builtin_ctz (unsigned int x) 
>     Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined. 

From my understanding, ARMv8 doesn't have a count-trailing-zeros instruction;
it implements it using rbit (reverse bits) and clz. The clz documentation says
that when you give it a 64-bit register, it returns 64 for an input of 0:
https://developer.arm.com/documentation/ddi0596/2020-12/Base-Instructions/CLZ--Count-Leading-Zeros-

Now here's the problem. I would think __builtin_ctz would be those two
instructions. So I tried the code below, compiled and ran it using `gcc -Wall
-Wextra test.c && ./a.out`, and saw that it worked as expected. However, my
mistake was stopping there. Using `gcc -O2 -Wall -Wextra test.c && ./a.out`
prints 456 instead, with no warnings or anything. Looking at the assembly, it
appears the check has been optimized out and 456 is always used.

The "ARM C Language Extensions Architecture Specification" suggests including
arm_acle.h. So I replaced the __builtin_ctzll line in the test below with the
following line and still got the incorrect result:

    unsigned long long tz = __clz((unsigned long long)__rbit(input));

I'm not sure if this is another bug, but this crashes with -O2:
https://godbolt.org/z/xv883jMb9
It also doesn't give me the result I expected: rbit appears to give me 32 no
matter what I write. The docs say it should give 64:
https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/RBIT--vector---Reverse-Bit-order--vector--?lang=en

Anyway, I would like to be warned about these problems somehow. Homebrew on
macOS doesn't seem to have the undefined behavior sanitizer (although I'm new
to Mac and may have set it up incorrectly). UBSan would be great for warning
about this. Alternatively, a flag such as -Wprefer-intrinsic could help when
the builtins don't match the CPU behavior.

From what I can tell, __builtin_ctzll doesn't return 32; it does seem to return
64. But when you compare the variable, the optimizer seems to think it will
never be greater than 32, which was a problem in my code because I was using
bits >= 60, so I can't simply check for >= 32.


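        #include <stdio.h>
        int main(int argc, char *argv[])
        {
                /* With no arguments argc is 1, so input is 0. */
                unsigned long long input = argc-1;
                /* Result is undefined for input == 0 per the GCC docs. */
                unsigned long long v = __builtin_ctzll(input);
                printf("%d %d\n", argc, v >= 64 ? 123 : 456);
        }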

