Another epic optimiser failure

Nicholas Vinson nvinson234@gmail.com
Sun May 28 06:28:44 GMT 2023


On 5/27/23 17:04, Stefan Kanthak wrote:
> --- .c ---
> int ispowerof2(unsigned long long argument) {
>      return __builtin_popcountll(argument) == 1;
> }
> --- EOF ---
>
> GCC 13.3    gcc -m32 -march=alderlake -O3
>              gcc -m32 -march=sapphirerapids -O3
>              gcc -m32 -mpopcnt -mtune=sapphirerapids -O3
>
> https://gcc.godbolt.org/z/cToYrrYPq
> ispowerof2(unsigned long long):
>          xor     eax, eax        # superfluous
>          xor     edx, edx        # superfluous
>          popcnt  eax, [esp+4]
>          popcnt  edx, [esp+8]
>          add     eax, edx
>          cmp     eax, 1      ->    dec  eax
>          sete    al
>          movzx   eax, al         # superfluous
>          ret
>
> 9 instructions in 28 bytes      # 6 instructions in 20 bytes

I agree this can be done in 6 instructions, but not with the dec
instruction. If you use dec, "movzx eax, al" becomes a required
instruction: for an input of 0, dec leaves 0xFFFFFFFF in eax and sete
only writes al, so the upper 24 bits of the return value stay set. That
results in 7 instructions and 22 bytes.
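
To make the input-0 case concrete, here is a rough C model of the two
instruction tails with the leading xor and trailing movzx removed. It is
only a sketch: the helper names model_cmp and model_dec are made up for
illustration, and eax is modelled as a plain uint32_t rather than taken
from real compiler output.

```c
#include <stdint.h>

/* Hypothetical model of "cmp eax, 1; sete al" with no movzx.
   cmp does not modify eax, so eax still holds the popcount (0..64)
   and its upper 24 bits are zero when sete writes al. */
static uint32_t model_cmp(uint64_t x)
{
    uint32_t eax = (uint32_t)__builtin_popcountll(x);
    uint32_t al = (eax == 1);            /* sete al after cmp eax, 1 */
    return (eax & 0xFFFFFF00u) | al;     /* sete replaces only the low byte */
}

/* Hypothetical model of "dec eax; sete al" with no movzx.
   dec modifies eax, so for popcount 0 it wraps to 0xFFFFFFFF
   and the upper 24 bits survive the write to al. */
static uint32_t model_dec(uint64_t x)
{
    uint32_t eax = (uint32_t)__builtin_popcountll(x);
    eax -= 1;                            /* dec eax */
    uint32_t al = (eax == 0);            /* sete al: ZF set iff result is 0 */
    return (eax & 0xFFFFFF00u) | al;     /* sete replaces only the low byte */
}
```

In this model, model_dec(0) comes out as 0xFFFFFF00, which a caller
reading the int return value sees as nonzero, while model_cmp(0) is 0.
For every nonzero input the two agree.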


