[Bug tree-optimization/105904] New: Predicated mov r0, #1 with opposite conditions could be hoisted, between 1 and 1<<n in opposite sides of a branch
peter at cordes dot ca
gcc-bugzilla@gcc.gnu.org
Thu Jun 9 07:33:26 GMT 2022
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105904
Bug ID: 105904
Summary: Predicated mov r0, #1 with opposite conditions could
be hoisted, between 1 and 1<<n in opposite sides of a
branch
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: peter at cordes dot ca
Target Milestone: ---
Target: arm-*-*
#include <bit>   // using the libstdc++ header
unsigned roundup(unsigned x){
    return std::bit_ceil(x);
}
https://godbolt.org/z/Px1fvWaex
GCC's version is somewhat clunky: it includes a MOV r0, #1 on both "sides" of the if-converted branch:
roundup(unsigned int):
        cmp     r0, #1
        itttt   hi
        addhi   r3, r0, #-1
        movhi   r0, #1          @@ here
        clzhi   r3, r3
        rsbhi   r3, r3, #32
        ite     hi
        lslhi   r0, r0, r3
        movls   r0, #1          @@ here
        bx      lr
Even without spotting the other optimizations that clang finds, we can combine
the two predicated MOVs into a single unconditional MOV r0, #1. But only if we
avoid setting flags, so it requires the 4-byte MOV encoding, not the 2-byte
MOVS. Still, it's one fewer instruction to execute.
This is not totally trivial: it requires seeing that we can move it across the
conditional LSL. So it's really a matter of folding the 1s between 1<<n and 1
in opposite sides of an if-converted branch.
        cmp     r0, #1
        ittt    hi
        addhi   r3, r0, #-1
        clzhi   r3, r3
        rsbhi   r3, r3, #32
        mov     r0, #1          @@ now unconditional
        it      hi
        lslhi   r0, r0, r3
        bx      lr
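For illustration only, here is a rough source-level model of the two shapes above. This is a sketch, not what GCC's passes actually operate on: the function names and the clz32 helper are hypothetical, unsigned is assumed to be 32 bits, and inputs above 1u<<31 are excluded because the shift count would reach 32 (std::bit_ceil is undefined there as well).

```cpp
#include <cassert>

// Hypothetical helper: leading-zero count of a 32-bit value, modelling
// one CLZ instruction (returns 32 for an input of 0, as ARM CLZ does).
unsigned clz32(unsigned v) {
    unsigned n = 0;
    for (unsigned bit = 1u << 31; bit != 0 && (v & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

// Shape of GCC's current output: the constant 1 is set on both paths.
unsigned roundup_both_sides(unsigned x) {
    unsigned r;
    if (x > 1) {                          // cmp r0, #1 ; condition HI
        unsigned n = 32 - clz32(x - 1);   // addhi / clzhi / rsbhi
        r = 1;                            // movhi r0, #1
        r <<= n;                          // lslhi r0, r0, r3
    } else {
        r = 1;                            // movls r0, #1
    }
    return r;
}

// Shape after hoisting: one unconditional 1, only the shift is predicated.
unsigned roundup_hoisted(unsigned x) {
    unsigned n = 0;
    if (x > 1)
        n = 32 - clz32(x - 1);            // still predicated on HI
    unsigned r = 1;                       // mov r0, #1, now unconditional
    if (x > 1)
        r <<= n;                          // lslhi r0, r0, r3
    return r;
}

// Compare the two shapes on a few inputs, including both branch sides.
void check_hoisting() {
    const unsigned inputs[] = {0u, 1u, 2u, 3u, 5u, 8u, 1000u, 1u << 31};
    for (unsigned x : inputs)
        assert(roundup_both_sides(x) == roundup_hoisted(x));
}
```

Both functions return 1 for x <= 1 and 1<<n otherwise, which is why the MOV can move above the conditional LSL: the shift is the only operation that still depends on the condition.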
clang makes rather nice asm for ARMv7 -mcpu=cortex-a53, as discussed in PR104773,
which covers a different missed optimization in the same asm.
roundup(unsigned int):                  @@ clang's version.
        subs    r0, r0, #1
        clz     r0, r0
        rsb     r1, r0, #32             @ 32-clz
        mov     r0, #1
        lslhi   r0, r0, r1              @ using flags set by SUBS
        bx      lr                      @ 1<<(32-clz) or just 1
Folding the mov r0, #1 from either side is only a couple steps away from making
the clz and rsb unconditional, and keeping only the LSL conditional.
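As a hedged sketch of why that last step is safe, the model below makes the clz and rsb unconditional and keeps only the shift conditional, mirroring clang's output. The function name and the clz32 helper are hypothetical, unsigned is assumed to be 32 bits, and inputs above 1u<<31 are again excluded.

```cpp
#include <cassert>

// Hypothetical helper: leading-zero count of a 32-bit value, modelling
// one CLZ instruction (returns 32 for an input of 0, as ARM CLZ does).
unsigned clz32(unsigned v) {
    unsigned n = 0;
    for (unsigned bit = 1u << 31; bit != 0 && (v & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

// Shape of clang's output: everything except the shift is unconditional.
unsigned roundup_clang_shape(unsigned x) {
    unsigned d = x - 1;            // subs r0, r0, #1 (wraps for x == 0)
    bool hi = x > 1;               // HI after SUBS: no borrow, result nonzero
    unsigned n = 32 - clz32(d);    // clz + rsb, computed for every input
    unsigned r = 1;                // mov r0, #1
    if (hi)
        r <<= n;                   // lslhi r0, r0, r1
    return r;
}
```

For x == 0 and x == 1 the value of n is computed but never used, so running clz and rsb unconditionally cannot change the result; it only removes the need to predicate them.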