This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/82259] missed optimization: use LEA to add 1 to flip the low bit when copying before AND with 1


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82259

--- Comment #4 from Peter Cordes <peter at cordes dot ca> ---
(In reply to Uroš Bizjak from comment #2)
> A couple of *scc_bt patterns are missing. These are similar to already
> existing *jcc_bt patterns. Combine wants:

Does gcc also need patterns for bt + cmovcc?

Thinking about this again, with an immediate count <= 31 it might be best to
use test $0x0100, %edi / setz %al (shown here for bit 8).  BT might be shorter,
needing only an imm8 instead of imm32.  But TEST can run on more ports than BT
on Intel.  (On Ryzen, bt has 4 per clock throughput.)
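
As a rough illustration of the constant-bit case, here is a minimal C sketch of the source-level operation, assuming bit 8 as in the test $0x0100 example.  The function name is hypothetical, and the asm in the comment lists candidate sequences, not verified gcc output.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when bit 8 of x is clear.
 *
 * Candidate x86-64 sequences (AT&T syntax), assumptions rather than
 * verified compiler output:
 *   test $0x100, %edi ; setz  %al    needs an imm32
 *   bt   $8, %edi     ; setnc %al    needs only an imm8
 */
bool bit8_is_clear(uint32_t x)
{
    return (x & 0x100u) == 0;
}
```

Both sequences compute the same inverted boolean; the choice between them is only about code size and which execution ports the instruction can use.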

(For some registers, TEST can check the low8 or high8 using an imm8, but high8
can have extra latency on HSW/SKL:
https://stackoverflow.com/questions/45660139/how-exactly-do-partial-registers-on-haswell-skylake-perform-writing-al-seems-to
But test $imm8, %al is only 2 bytes, or 3 bytes for a low8 register other than
AL when a REX isn't needed.  There's no test $imm8_sign_extended, r32/r64, so
you need a REX to test the low byte of edi/esi/ebp.)
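
To make the size comparison concrete, the sketch below writes out the machine-code bytes for a few of these forms as C arrays, assuming 64-bit mode.  The array names are hypothetical and the bytes were assembled by hand from the opcode tables, so treat them as illustrative.

```c
/* Hand-assembled x86-64 encodings (AT&T syntax in the comments).
 * Array names are hypothetical; they exist only to compare lengths. */
const unsigned char test_imm8_al[]   = { 0xA8, 0x01 };             /* test $1, %al            */
const unsigned char test_imm8_cl[]   = { 0xF6, 0xC1, 0x01 };       /* test $1, %cl            */
const unsigned char test_imm8_dil[]  = { 0x40, 0xF6, 0xC7, 0x01 }; /* test $1, %dil (REX)     */
const unsigned char test_imm32_edi[] = { 0xF7, 0xC7,               /* test $0x100, %edi       */
                                         0x00, 0x01, 0x00, 0x00 };
const unsigned char bt_imm8_edi[]    = { 0x0F, 0xBA, 0xE7, 0x08 }; /* bt $8, %edi             */

/* Compile-time checks of the instruction lengths discussed above. */
_Static_assert(sizeof test_imm8_al == 2, "AL short form is 2 bytes");
_Static_assert(sizeof test_imm8_cl == 3, "low8 without REX is 3 bytes");
_Static_assert(sizeof test_imm8_dil == 4, "low8 with REX is 4 bytes");
_Static_assert(sizeof test_imm32_edi == 6, "imm32 form is 6 bytes");
_Static_assert(sizeof bt_imm8_edi == 4, "bt with imm8 is 4 bytes");
```

So for bit 8 of edi, bt $8 is 2 bytes shorter than the imm32 TEST, while for bits 0..7 a byte-register TEST is as short or shorter.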

But for a variable count, it's likely that BT is the best bet, even when
booleanizing with setcc, at least if we avoid `movzx`: bt/setcc/movzx is
significantly worse than xor-zero / bt / setcc, for latency and for a false
dependency on the destination register.
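
A minimal C sketch of the variable-count case follows.  The function name is hypothetical, and the masking of n is my assumption, chosen to match how BT with a register source takes the bit offset modulo the operand size.  The asm in the comment is the sequence argued for above, not verified gcc output.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns bit n of x as a 0/1 boolean.
 * n is reduced modulo 32, like BT with a 32-bit register operand.
 *
 * Preferred sequence from the discussion (AT&T syntax):
 *   xor  %eax, %eax     zero the destination first (breaks the dependency)
 *   bt   %esi, %edi     CF = bit (esi mod 32) of edi
 *   setc %al            no movzx needed afterwards
 */
bool test_bit(uint32_t x, unsigned n)
{
    return ((x >> (n & 31u)) & 1u) != 0;
}
```

Zeroing the full register before BT means setcc only has to write the low byte, so the result is already zero-extended.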

With a constant count, SHR / AND is very good if we don't need to invert the
boolean, and it's ok to destroy the source register.  (Or of course just SHR if
we want the high bit).  If adding new BT/SETCC patterns, I guess we need to
make sure gcc still uses SHR or SHR/AND where appropriate.
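
For comparison, here is a minimal C sketch of the two constant-count cases where a shift is enough.  The function names are hypothetical and bit 8 is just an example position; the asm in the comments is what one would hope for, not verified gcc output.

```c
#include <stdint.h>

/* Hypothetical helper: bit 8 of x as 0 or 1, not inverted.
 * Hoped-for code when the source register may be destroyed:
 *   shr $8, %edi ; and $1, %edi
 */
uint32_t bit8(uint32_t x)
{
    return (x >> 8) & 1u;
}

/* Hypothetical helper: the high bit of x as 0 or 1.
 * A single shift suffices because nothing remains above bit 31:
 *   shr $31, %edi
 */
uint32_t high_bit(uint32_t x)
{
    return x >> 31;
}
```

Neither function inverts the result, which is the condition under which SHR or SHR/AND stays preferable to a BT/SETCC pattern.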
