Bug 115636 - Missing optimzation: fold ` (w << (unsigned int)((flag ? a : b) - 32768))` to ` flag ? w << (unsigned int)(a - 32768) : 0 `
Summary: Missing optimzation: fold ` (w << (unsigned int)((flag ? a : b) - 32768))` to...
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 15.0
: P3 enhancement
Target Milestone: ---
Assignee: Andrew Pinski
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2024-06-25 09:58 UTC by Wayne Bruce
Modified: 2024-06-26 23:11 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2024-06-25 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Wayne Bruce 2024-06-25 09:58:53 UTC
godbolt example: https://godbolt.org/z/W5hr6G8oK

code example: 
```cpp
short b, n;
void func(signed char w, signed char flag, unsigned long long int a) {
    n = 0 >= (w << (unsigned int)((flag ? a : b) - 32768));
}
```

In this case, when flag != 0, no matter what the value of b is, the value of (unsigned int)(b - 32768) is much greater than 8, which means that w will be shifted left more than 8 times. According to the behavior of gcc, the value of w will be assigned to 0, that is, as long as flag != 0, then w will be 0, regardless of the value of b.
Comment 1 Andrew Pinski 2024-06-25 22:08:27 UTC
;;   basic block 3, loop depth 0, maybe hot
;;    prev block 2, next block 4, flags: (NEW, VISITED)
;;    pred:       2 (TRUE_VALUE,EXECUTABLE)
  # VUSE <.MEM_13(D)>
  b.1_2 = bD.4438;
  # RANGE [irange] unsigned int [0, 32767][4294934528, +INF]
  _3 = (unsigned intD.15) b.1_2;
  # RANGE [irange] unsigned int [4294901760, +INF]
  iftmp.0_14 = _3 + 4294934528;
  goto <bb 5>; [INV]
;;    succ:       5 (FALLTHRU,EXECUTABLE)

;;   basic block 4, loop depth 0, maybe hot
;;    prev block 3, next block 5, flags: (NEW, VISITED)
;;    pred:       2 (FALSE_VALUE,EXECUTABLE)
  _4 = (unsigned intD.15) a_11(D);
  iftmp.0_12 = _4 + 4294934528;
;;    succ:       5 (FALLTHRU,EXECUTABLE)

;;   basic block 5, loop depth 0, maybe hot
;;    prev block 4, next block 1, flags: (NEW, VISITED)
;;    pred:       3 (FALLTHRU,EXECUTABLE)
;;                4 (FALLTHRU,EXECUTABLE)
  # iftmp.0_8 = PHI <iftmp.0_14(3), iftmp.0_12(4)>
  _5 = _1 << iftmp.0_8;


So we could see that bb3 is undefined due to the shift right afterwards and either change it to being __builtin_unreachable or __builtin_trap and then optimize it further. 

This is a job for gimple-ssa-isolate-paths to handle really.
Comment 2 Andrew Pinski 2024-06-26 03:18:01 UTC
Going to implement something for isolate-paths. It might not be exactly what was expecting here due it might be using a trap rather than 0. 

> According to the behavior of gcc

Note this is statement is not exactly "correct" it is just the behavior for gcc currently for this undefined behavior. 

Note I have a straw poll going on which behavior we should do (maybe even by default):
https://gcc.gnu.org/pipermail/gcc/2024-June/244213.html
Comment 3 Andrew Pinski 2024-06-26 23:11:41 UTC
So looking at what LLVM does (Note it does not handle the original case here).

It only handles the simple:
`1 << (a ? b : 1000)` cases.

That is:
```
int f0(int a, int b)
{
        if (a) b = 1000;
        int t =  1 << b;
        return t;
}
```
is optimized to `1 << b` without the condition.

But once you add something more complex in the condition, it does not do anything:
```
int g();
int f1(int a, int b)
{
        if (a) b = 1000;
        else b = g();
        int t =  1 << b;
        g();
        return t;
}
int f2(int a, int b)
{
        if (a) b = 1000;
        else g();
        int t =  1 << b;
        g();
        return t;
}
```

changing f0 to be:
```
        if (a) t = 0;
        else  t =  1 << b;
```
Generates worse code for that. 

I have to think this one through a little more I think ...
undefined behavior is fun.