[Bug target/82259] New: missed optimization: use LEA to add 1 to flip the low bit when copying before AND with 1
peter at cordes dot ca
gcc-bugzilla@gcc.gnu.org
Tue Sep 19 16:27:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82259
Bug ID: 82259
Summary: missed optimization: use LEA to add 1 to flip the low
bit when copying before AND with 1
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: peter at cordes dot ca
Target Milestone: ---
Target: x86_64-*-*, i?86-*-*
bool bt_signed(int x, unsigned bit) {
bit = 13;
return !(x & (1<<bit));
}
// https://godbolt.org/g/rzdtzm
movl %edi, %eax
sarl $13, %eax
notl %eax
andl $1, %eax
ret
This is pretty good, but we could do better by using addition instead of a
separate NOT. (XOR is add-without-carry, and the low bit has no carry-in, so
adding 1 always flips the low bit, which is the only bit the AND keeps.)
sarl $13, %edi
lea 1(%edi), %eax
andl $1, %eax
ret
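The identity behind the LEA suggestion can be sanity-checked in C. This is only a sketch: the function names and the pseudo-random sampling are made up for illustration and are not part of gcc or of the original test case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* NOT then mask, as in the current gcc output (notl; andl $1). */
bool low_bit_flipped_not(uint32_t v) { return (~v & 1u) != 0; }

/* Add 1 then mask, as in the proposed lea 1(%reg) + andl $1. */
bool low_bit_flipped_add(uint32_t v) { return ((v + 1u) & 1u) != 0; }

/* Hypothetical helper: counts sampled inputs on which the two forms disagree. */
unsigned count_low_bit_mismatches(void) {
    unsigned mismatches = 0;
    uint32_t v = 0;
    for (unsigned i = 0; i < 100000; ++i) {
        if (low_bit_flipped_not(v) != low_bit_flipped_add(v))
            ++mismatches;
        v = v * 1664525u + 1013904223u;  /* simple LCG to vary all bits */
    }
    /* Wraparound case: UINT32_MAX + 1 == 0, and ~UINT32_MAX == 0. */
    if (low_bit_flipped_not(UINT32_MAX) != low_bit_flipped_add(UINT32_MAX))
        ++mismatches;
    return mismatches;
}

/* Hypothetical self-check: aborts if the identity fails on any sample. */
void check_low_bit_identity(void) {
    assert(count_low_bit_mismatches() == 0);
}
```

Only bit 0 survives the AND, so carries out of bit 0 into the higher bits don't matter.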
If partial registers aren't a problem, this will be even better on most CPUs:
bt $13, %edi
setnc %al    # BT puts the selected bit in CF and leaves ZF unmodified
ret
related: bug 47769 about missed BTR peepholes. That probably covers the missed
BT.
But *this* bug is about the LEA+AND vs. MOV+NOT+AND optimization. This might
be relevant for other 2-operand ISAs with mostly destructive instructions, like
ARM Thumb.
Related:
bool bt_unsigned(unsigned x, unsigned bit) {
//bit = 13;
return !(x & (1<<bit)); // 1U avoids test/set
}
movl %esi, %ecx
movl $1, %eax
sall %cl, %eax
testl %edi, %eax
sete %al
ret
This is weird. The code generated with 1U << bit is like the bt_signed code
above and has identical results, so gcc should emit whatever is optimal for
both cases. There are similar differences on ARM32.
(With a fixed count, it just makes the difference between NOT vs. XOR $1.)
If we're going to use setcc, it's definitely *much* better to use bt instead
of a variable-count shift + test.
bt %esi, %edi
setnc %al    # BT puts the selected bit in CF and leaves ZF unmodified
ret
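As a rough check that a BT-based sequence computes the same thing as the shift + test version, here is a C model. It is a sketch under stated assumptions: the function names are hypothetical, and `bit_clear_bt_model` assumes the register-operand form of BT, which takes the bit index modulo 32 for a 32-bit operand, followed by a setcc that tests for the selected bit being clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shift-a-mask form, as in the C source; only defined for bit < 32. */
bool bit_clear_shift_mask(uint32_t x, uint32_t bit) {
    return !(x & (1u << bit));
}

/* Model of BT with a register bit index: the selected bit is
   x[bit mod 32], and the result is true when that bit is clear. */
bool bit_clear_bt_model(uint32_t x, uint32_t bit) {
    return !((x >> (bit & 31u)) & 1u);
}

/* Hypothetical helper: counts disagreements over all valid bit
   positions and a sample of x values. */
unsigned count_bt_mismatches(void) {
    unsigned mismatches = 0;
    uint32_t x = 0;
    for (unsigned i = 0; i < 10000; ++i) {
        for (uint32_t bit = 0; bit < 32; ++bit) {
            if (bit_clear_shift_mask(x, bit) != bit_clear_bt_model(x, bit))
                ++mismatches;
        }
        x = x * 1664525u + 1013904223u;  /* simple LCG to vary all bits */
    }
    return mismatches;
}

/* Hypothetical self-check: aborts if the two forms ever disagree. */
void check_bt_equivalence(void) {
    assert(count_bt_mismatches() == 0);
}
```

For bit >= 32 the C shift is undefined behavior, so the compiler is free to use BT's modulo-32 behavior there.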