inline int sign(int x) { return (x >> 31) | ((unsigned)-x >> 31); } bool f(int x) { return sign(x) > -1; } With -O3, LLVM produces this : f(int): test edi, edi setns al ret GCC produces this : f(int): sar edi, 31 lea eax, [rdi+1] ret Changing `f` to `(x >> 31) + 1` results in it being optimized optimally
GCC's code gen is actually better than LLVM's here I think.
In that case, then, GCC is generating sub-optimal code for `(x >> 31) + 1` alone since it optimises that to the same thing as LLVM
Changing the test slightly to inline int sign(int x) { return (x >> 31) | ((unsigned)-x >> 31); } void f1(void); void f2(void); void f(int x) { if (sign(x) > -1) f1(); else f2(); } shows at -O3 with LLVM: f: test edi, edi js .LBB0_2 jmp f1 .LBB0_2: jmp f2 whereas GCC produces: f: mov eax, edi sar edi, 31 neg eax shr eax, 31 or edi, eax cmp edi, -1 je .L2 jmp f1 .L2: jmp f2 In that example, LLVM is doing much better.
Harald, for the specific code you wrote, I now see this from GCC : f(int): test edi, edi js .L2 jmp f1() .L2: jmp f2() So for that specific code, GCC has improved, though for the original test case it has not. I must note, though, that this still means that either : - GCC fails to properly optimize `(x >> 31) + 1` - GCC fails to properly optimize my original test case
Confirmed. Testcase: inline int sign(int x) { return (x >> 31) | ((unsigned)-x >> 31); } bool f1(int x) { return sign(x) > -1; } bool f3(int x) { return (x>>31)+1; } void f1(void); void f2(void); void f(int x) { if (sign(x) > -1) f1(); else f2(); } void f0(int x) { if (f3(x)) f1(); else f2(); } void fn1(int x) { if (x>0) f1(); else f2(); } f, f0 and fn1 should all produce the same result on the gimple level but don't. (x >> 31) | ((unsigned)-x >> 31) is the same as x == 0 ? 0 : -(x < 0) really. Which we should simplify into. Though COND_EXPR is not as well handled by the rest of GCC really.
Two things we should simplify: _4 = _3 >> 31; _4 != -1 Into: _3 >= 0 (if _3 is signed, otherwise false) (this will solve f0) And: _4 = x_3(D) >> 31; _7 = -x_3(D); _8 = (unsigned int) _7; _9 = _8 >> 31; _10 = (int) _9; _11 = _4 | _10; Into: _4 = x_3(D) >> 31; // keep around _t = _4 | 1; _11 = x != 0 ? _t : 0;
(In reply to Andrew Pinski from comment #6) > Two things we should simplify: > _4 = _3 >> 31; > _4 != -1 > > Into: > _3 >= 0 (if _3 is signed, otherwise false) > > (this will solve f0) See bug 85234 comment #5 on handle that one (g and g2).