Take:
int f(int a, int b, int c, int d)
{
  return a > b && c > d;
}
---- CUT ---
Currently on riscv32 at -O2 this produces:
f:
        ble     a0,a1,.L3
        sgt     a0,a2,a3
        ret
.L3:
        li      a0,0
        ret

But adding --param logical-op-non-short-circuit=1 produces:
f:
        sgt     a0,a0,a1
        sgt     a2,a2,a3
        and     a0,a0,a2
        ret

Which is much better, especially on in-order cores.
Note clang/LLVM produces the branch-less version also.
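For reference, a minimal sketch of the two source-level forms involved. The names f_short_circuit, f_branchless and check_forms_agree are made up for illustration and are not from GCC. The rewrite is only valid here because both compares are free of side effects and cannot trap, so evaluating the second one unconditionally does not change the result:

```c
#include <assert.h>

/* Short-circuit form, as written in the testcase: the second compare
   is only evaluated when the first one is true.  */
static int f_short_circuit(int a, int b, int c, int d)
{
  return a > b && c > d;
}

/* Non-short-circuit form: both compares are always evaluated and the
   0/1 results are combined with a bitwise AND, so no branch is needed.  */
static int f_branchless(int a, int b, int c, int d)
{
  return (a > b) & (c > d);
}

/* Compare the two forms exhaustively over a small range of inputs.  */
void check_forms_agree(void)
{
  for (int a = -2; a <= 2; a++)
    for (int b = -2; b <= 2; b++)
      for (int c = -2; c <= 2; c++)
        for (int d = -2; d <= 2; d++)
          assert(f_short_circuit(a, b, c, d) == f_branchless(a, b, c, d));
}
```

The second form corresponds to the sgt/sgt/and sequence, the first to the ble version.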
Well, the branch cost should be tuned better too. Here is an example where BRANCH_COST=4 is needed to get one branch:
```
int g(void);
int f(int a, int b, int c, int d)
{
  if (a > b && c > d)
    return g();
  return 1;
}
```
Note clang/LLVM produces much worse code here (two xori which are not needed).
we usually define logical-op-non-short-circuit based on branch cost
(In reply to Richard Biener from comment #3)
> we usually define logical-op-non-short-circuit based on branch cost

Right. I think this definition was copied from the MIPS backend, where it is wrong too. That was done back in 2005, way before the main uses of LOGICAL_OP_NON_SHORT_CIRCUIT were added.
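A rough, self-contained model of what "based on branch cost" means. This is a sketch, not the actual GCC source: model_branch_cost is a hypothetical stand-in for the target hook, and the ">= 2" threshold is my recollection of the generic fallback definition, so treat it as an assumption:

```c
/* Hypothetical stand-in for the target's branch cost.  In GCC proper,
   BRANCH_COST takes (speed_p, predictable_p) and is defined by the
   backend; both arguments are ignored in this model.  */
static int model_branch_cost = 1;
#define BRANCH_COST(speed_p, predictable_p) (model_branch_cost)

/* Assumed shape of a branch-cost-based definition: prefer the
   non-short-circuit form once branches are expensive enough.
   A backend that hardcodes this to 0 never gets the branch-free form
   by default, whatever its branch cost is.  */
#define LOGICAL_OP_NON_SHORT_CIRCUIT (BRANCH_COST(1, 0) >= 2)
```

With a definition of this shape, raising the branch cost (for example to the BRANCH_COST=4 mentioned earlier) would switch the default on without needing the --param.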
Dup. *** This bug has been marked as a duplicate of bug 116615 ***