The following GIMPLE test shows non-optimal assembly:

long mask;
void bar ();

__GIMPLE ()
void foo (int a, int b)
{
  long _3;
  _3 = a_1(D) < b_2(D) ? _Literal (long) -1l : 0l;
  mask = _3;
  if (a_1(D) < b_2(D))
    goto bb1;
  else
    goto bb2;
bb1:
  bar ();
bb2:
  return;
}

foo:
.LFB0:
        .cfi_startproc
        xorl    %eax, %eax
        cmpl    %esi, %edi
        setge   %al
        subq    $1, %rax
        movq    %rax, mask(%rip)
        cmpl    %esi, %edi
        jl      .L5
        ...

Here subq clobbers the flags, so the cmpl has to be repeated.  I believe we
could instead use an lea of the same size,

        leaq    -0x1(%rax), %rax

which does not clobber the flags, and elide the redundant cmpl.

For my purpose the store to mask is unnecessary; it was only placed to
simplify the testcase.  A GIMPLE testcase was necessary to get the COND_EXPR
and non-jumpy code through optimization.

I'm not sure at which point during RTL we commit to using a CC-clobbering sub
versus a non-CC-clobbering lea, but maybe cmpelim could replace one with the
other here?
Confirmed, we now get:

        xorl    %eax, %eax
        cmpl    %esi, %edi
        setl    %al
        negq    %rax
        movq    %rax, mask(%rip)
        cmpl    %esi, %edi
        jl      .L5

because we produce something similar to the following C testcase:

long mask;
void bar ();

void f (int a, int b)
{
  long _3;
  _3 = a < b;
  _3 = -_3;
  mask = _3;
  if (a < b)
    bar ();
  return;
}