Another epic optimiser failure
Nicholas Vinson
nvinson234@gmail.com
Sun May 28 06:28:44 GMT 2023
On 5/27/23 17:04, Stefan Kanthak wrote:
> --- .c ---
> int ispowerof2(unsigned long long argument) {
> return __builtin_popcountll(argument) == 1;
> }
> --- EOF ---
>
> GCC 13.3 gcc -m32 -march=alderlake -O3
> gcc -m32 -march=sapphirerapids -O3
> gcc -m32 -mpopcnt -mtune=sapphirerapids -O3
>
> https://gcc.godbolt.org/z/cToYrrYPq
> ispowerof2(unsigned long long):
> xor eax, eax # superfluous
> xor edx, edx # superfluous
> popcnt eax, [esp+4]
> popcnt edx, [esp+8]
> add eax, edx
> cmp eax, 1 -> dec eax
> sete al
> movzx eax, al # superfluous
> ret
>
> 9 instructions in 28 bytes # 6 instructions in 20 bytes
I agree this can be done using 6 instructions, but not with the dec
instruction. If you use dec, "movzx eax, al" becomes a required
instruction: for an input of 0, dec wraps eax to 0xFFFFFFFF and sete
only writes al, so the upper 24 bits of eax stay set. That results in
7 instructions and 22 bytes.
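
To make the zero-input case concrete, here is a rough C model of the two
sequences. It is only a sketch: sete_al, movzx_eax_al, with_cmp, with_dec
and self_check are hypothetical helper names, and the model assumes that
sete writes the low byte only and leaves the upper 24 bits of eax as they
were.

```c
#include <assert.h>
#include <stdint.h>

/* Models "sete al": only the low 8 bits of eax are written (assumption
   of this sketch; the upper 24 bits are carried over unchanged). */
static uint32_t sete_al(uint32_t eax, int zero_flag) {
    return (eax & 0xFFFFFF00u) | (zero_flag ? 1u : 0u);
}

/* Models "movzx eax, al": zero-extend the low byte into the full register. */
static uint32_t movzx_eax_al(uint32_t eax) {
    return eax & 0xFFu;
}

/* cmp eax, 1 ; sete al -- eax still holds the popcount (0..64),
   so its upper 24 bits are already zero. */
static uint32_t with_cmp(unsigned long long x) {
    uint32_t eax = (uint32_t)__builtin_popcountll(x);
    return sete_al(eax, eax == 1u);
}

/* dec eax ; sete al -- eax holds popcount - 1, which wraps for x == 0. */
static uint32_t with_dec(unsigned long long x) {
    uint32_t eax = (uint32_t)__builtin_popcountll(x) - 1u;
    return sete_al(eax, eax == 0u);
}

/* Checks the cases discussed above. */
void self_check(void) {
    assert(with_cmp(0ULL) == 0u);                    /* cmp: fine without movzx */
    assert(with_dec(0ULL) == 0xFFFFFF00u);           /* dec: upper bits left set */
    assert(movzx_eax_al(with_dec(0ULL)) == 0u);      /* movzx repairs it */
    assert(with_cmp(8ULL) == 1u && with_dec(8ULL) == 1u);
}
```

In this model the dec variant only goes wrong for an input of 0; every
other input leaves the upper bits of eax clear after the decrement.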