94892 – (x >> 31) + 1 not getting narrowed to compare

Bug 94892 - (x >> 31) + 1 not getting narrowed to compare

Summary: (x >> 31) + 1 not getting narrowed to compare

Status:	NEW

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization (show other bugs)
Version:	10.0

Importance:	P3 enhancement
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	missed-optimization

Depends on:
Blocks:	19987
	Show dependency tree / graph

Reported:	2020-04-30 17:25 UTC by Gabriel Ravier
Modified:	2023-08-24 21:46 UTC (History)
CC List:	2 users (show)

See Also:	85234
Host:
Target:	x86_64--
Build:
Known to work:
Known to fail:
Last reconfirmed:	2021-12-15 00:00:00

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Gabriel Ravier 2020-04-30 17:25:03 UTC

inline int sign(int x)
{
    return (x >> 31) | ((unsigned)-x >> 31);
}

bool f(int x)
{
    return sign(x) > -1;
}

With -O3, LLVM produces this :

f(int):
  test edi, edi
  setns al
  ret

GCC produces this :

f(int):
  sar edi, 31
  lea eax, [rdi+1]
  ret

Changing `f` to `(x >> 31) + 1` results in it being optimized optimally

Comment 1 Andrew Pinski 2020-04-30 18:34:40 UTC

GCC's code gen is actually better than LLVM's here I think.

Comment 2 Gabriel Ravier 2020-04-30 19:32:50 UTC

In that case, then, GCC is generating sub-optimal code for `(x >> 31) + 1` alone since it optimises that to the same thing as LLVM

Comment 3 Harald van Dijk 2020-05-01 19:28:00 UTC

Changing the test slightly to

  inline int sign(int x)
  {
    return (x >> 31) | ((unsigned)-x >> 31);
  }

  void f1(void);
  void f2(void);
  void f(int x)
  {
    if (sign(x) > -1)
      f1();
    else
      f2();
  }

shows at -O3 with LLVM:

  f:
    test    edi, edi
    js      .LBB0_2
    jmp     f1
  .LBB0_2:
    jmp     f2

whereas GCC produces:

  f:
    mov     eax, edi
    sar     edi, 31
    neg     eax
    shr     eax, 31
    or      edi, eax
    cmp     edi, -1
    je      .L2
    jmp     f1
  .L2:
    jmp     f2

In that example, LLVM is doing much better.

Comment 4 Gabriel Ravier 2020-08-10 00:19:27 UTC

Harald, for the specific code you wrote, I now see this from GCC :

f(int):
  test edi, edi
  js .L2
  jmp f1()
.L2:
  jmp f2()

So for that specific code, GCC has improved, though for the original test case it has not. I must note, though, that this still means that either :
- GCC fails to properly optimize `(x >> 31) + 1`
- GCC fails to properly optimize my original test case

Comment 5 Andrew Pinski 2021-12-15 00:23:43 UTC

Confirmed.
Testcase:

  inline int sign(int x)
  {
    return (x >> 31) | ((unsigned)-x >> 31);
  }
bool f1(int x)
{
    return sign(x) > -1;
}
bool f3(int x)
{
    return (x>>31)+1;
}

  void f1(void);
  void f2(void);
  void f(int x)
  {
    if (sign(x) > -1)
      f1();
    else
      f2();
  }
  void f0(int x)
  {
    if (f3(x))
      f1();
    else
      f2();
  }
  void fn1(int x)
  {
    if (x>0)
      f1();
    else
      f2();
  }



f, f0 and fn1 should all produce the same result on the gimple level but don't.

(x >> 31) | ((unsigned)-x >> 31) is the same as x == 0 ? 0 : -(x < 0) really.
Which we should simplify into.
Though COND_EXPR is not as well handled by the rest of GCC really.

Comment 6 Andrew Pinski 2021-12-15 00:31:12 UTC

Two things we should simplify:
  _4 = _3 >> 31;
  _4 != -1

Into:
  _3 >= 0 (if _3 is signed, otherwise false)

(this will solve f0)


And:
  _4 = x_3(D) >> 31;
  _7 = -x_3(D);
  _8 = (unsigned int) _7;
  _9 = _8 >> 31;
  _10 = (int) _9;
  _11 = _4 | _10;

Into:
_4 = x_3(D) >> 31; // keep around
_t = _4 | 1;
_11 = x != 0 ? _t : 0;

Comment 7 Andrew Pinski 2023-05-28 23:15:43 UTC

(In reply to Andrew Pinski from comment #6)
> Two things we should simplify:
>   _4 = _3 >> 31;
>   _4 != -1
> 
> Into:
>   _3 >= 0 (if _3 is signed, otherwise false)
> 
> (this will solve f0)

See bug 85234 comment #5 on handle that one (g and g2).