Bug 114326 - Missed optimization for A || B when !B implies A.
Summary: Missed optimization for A || B when !B implies A.
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 14.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2024-03-13 14:19 UTC by Manolis Tsamis
Modified: 2024-03-13 21:41 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2024-03-13 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Manolis Tsamis 2024-03-13 14:19:19 UTC
The function below doesn't fold to return 0;

int cmp1(uint64_t d1, uint64_t d2) {
  if (((d1 ^ d2) & 0xabcd) == 0 || d1 != d2)
    return 0;
  return foo();
}

while the following function does: 

int cmp2(uint64_t d1, uint64_t d2) {
  if (d1 != d2 || ((d1 ^ d2) & 0xabcd) == 0)
    return 0;
  return foo();
}

The functions are equivalent since the lhs and rhs of || don't have side effects.

In general, there pattern here is a side-effect free expression a || b where !b implies a should be optimized to true. As in the testcase above, a doesn't necessarily imply !b. Something similar could be stated for && expressions.

Complementary godbolt link: https://godbolt.org/z/qK5bYf36T
Comment 1 Andrew Pinski 2024-03-13 19:06:27 UTC
  _1 = d1_5(D) ^ d2_6(D);
  _2 = _1 & 43981;
  _10 = d1_5(D) != d2_6(D);
  _11 = _2 == 0;
  _12 = _10 | _11;

(d1 != d2) | ((d1 ^ d2) & CST) == 0)

Confirmed.

Obvious if the first part is false then d1 ^ d2 will be 0.

This will work though maybe there is another place where this can be handled ...

(simplify
 (bit_ior
  (ne@n4 @0 @1)
  (cmp
   (bit_and
    (bit_xor @0 @1)
    @2)
   @3))
 (bit_ior @n4 
  (cmp { build_zero_cst (TREE_TYPE (@0)); } @3))
)
Comment 2 ptomsich 2024-03-13 21:18:39 UTC
To copy the last piece of info from our internal tracker...

LLVM learned this new trick only in the run-up to LLVM 18.
Up until then, GCC and LLVM performed identically on this snippet.
Comment 3 Andrew Pinski 2024-03-13 21:41:47 UTC
(In reply to ptomsich from comment #2)
> To copy the last piece of info from our internal tracker...
> 
> LLVM learned this new trick only in the run-up to LLVM 18.
> Up until then, GCC and LLVM performed identically on this snippet.

Yes it looks like it is pattern matching what I suggested (well with and without the and).

Note we do need another pattern, one without the bit_and:
(simplify
 (bit_ior
  (ne@n4 @0 @1)
  (cmp
   (bit_xor @0 @1)
   @2))
 (bit_ior @n4 
  (cmp { build_zero_cst (TREE_TYPE (@0)); } @2))
)

And we need one more for bit_ior:
(simplify
 (bit_ior
  (ne@n4 @0 @1)
  (cmp
   (bit_ior
    (bit_xor @0 @1)
    @2)
   @3))
 (bit_ior @n4 
  (cmp @2 @3))
)

Note it looks like clang does not handle non-contants that well, (they handle d == 0 though).

E.g.:
```
int foo(void);
int cmp1(unsigned d1, unsigned d2, unsigned c, unsigned d) {
  int t = ((d1 ^ d2) & c ) == (d);
  int t1 = d1 != d2;
  int tt = t | t1;
  return tt;
}

```

Should be optimized to:
int foo(void);
int cmp1(unsigned d1, unsigned d2, unsigned c, unsigned d) {
  int t = 0 == d;
  int t1 = d1 != d2;
  int tt = t | t1;
  return tt;
}
```