Bug 117529

Summary: Missed optimization: (y != 0 && x > (unsigned)(-1) / y) (multiplication overflow check)
Product: gcc Reporter: Kang-Che Sung <Explorer09>
Component: tree-optimizationAssignee: Not yet assigned to anyone <unassigned>
Status: NEW ---    
Severity: enhancement CC: a1343922569, BenBE, gabravier
Priority: P3 Keywords: missed-optimization
Version: 14.1.0   
Target Milestone: ---   
See Also: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98703
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30314
https://github.com/llvm/llvm-project/issues/115683
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119917
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2024-11-11 00:00:00

Description Kang-Che Sung 2024-11-11 03:52:47 UTC
```c
#include <limits.h>
#include <stdbool.h>

bool func1(unsigned long long x, unsigned long long y)
{
    return x > (ULLONG_MAX / y);
}

bool func2(unsigned long long x, unsigned long long y)
{
    return y != 0 && x > (ULLONG_MAX / y);
}

bool func3(unsigned long long x, unsigned long long y)
{
    unsigned long long res;
    return __builtin_umulll_overflow(x, y, &res);
}
```

Expected result: All three functions produce the same code.

Actual result: func1() and func3() optimize to same code, but func2() had a redundant (y != 0) check that is not optimized out.

x86-64 gcc with "-Os" option (tested in Compiler Explorer, a.k.a. godbolt.org)

```x86asm
func1:
        movq    %rdi, %rax
        mulq    %rsi
        seto    %al
        ret
func2:
        xorl    %eax, %eax
        testq   %rsi, %rsi
        je      .L5
        movq    %rsi, %rax
        mulq    %rdi
        seto    %al
        movzbl  %al, %eax
.L5:
        andl    $1, %eax
        ret
```
Comment 1 Drea Pinski 2024-11-11 03:57:46 UTC
  <bb 2> [local count: 1073741824]:
  if (y_4(D) != 0)
    goto <bb 3>; [50.00%]
  else
    goto <bb 4>; [50.00%]

  <bb 3> [local count: 536870912]:
  _1 = .MUL_OVERFLOW (y_4(D), x_5(D));
  _2 = IMAGPART_EXPR <_1>;
  _8 = (bool) _2;

  <bb 4> [local count: 1073741824]:
  # iftmp.0_3 = PHI <_8(3), 0(2)>
  return iftmp.0_3;
Comment 2 Drea Pinski 2024-11-11 03:58:52 UTC
Reduced down to just:
```
bool func4(unsigned long long x, unsigned long long y)
{
    if (y == 0) return 0;
    unsigned long long res;
    return __builtin_umulll_overflow(x, y, &res);
}
```
Comment 3 Kang-Che Sung 2024-11-11 08:03:02 UTC
The func4 example that Andrew Pinski provided makes me realize that this func5 should optimize to the same code:

```c
bool func5(unsigned long long x, unsigned long long y)
{
    if ((x | y) == 0) return 0;
    unsigned long long res;
    return __builtin_umulll_overflow(x, y, &res);
}
```
Comment 4 Kang-Che Sung 2024-11-11 08:06:41 UTC
Oops. Correction. I mean this:

```c
bool func5(unsigned long long x, unsigned long long y)
{
    /* (x == 0 || y == 0) */
    if ((x & y) == 0) return 0;
    unsigned long long res;
    return __builtin_umulll_overflow(x, y, &res);
}
```
Comment 5 Kang-Che Sung 2024-11-11 08:12:32 UTC
Sorry again. My mind wasn't cleaned up when I wrote the comment. It's (x == 0 || y == 0) and the logic is not equivalent to ((x & y) == 0).