Bug 108341 - argument to `__builtin_ctz` should be assumed non-zero when CTZ_DEFINED_VALUE_AT_ZERO says it is undefined
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 12.2.1
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2023-01-09 12:10 UTC by LIU Hao
Modified: 2023-07-05 19:02 UTC

See Also:
Host: x86_64-linux-gnu
Target: x86_64-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description LIU Hao 2023-01-09 12:10:51 UTC
Godbolt: https://gcc.godbolt.org/z/PrPP4v9z1


```
extern int r;

int
bz(int value)
  {
    r = __builtin_ctz(value);
    return value != 0;  // always true
  }
```


According to the GCC manual, the result of `__builtin_ctz()` is undefined if its argument is zero. GCC could therefore assume that `value` is non-zero after the call and that `bz` always returns `1`, but it fails to do so.

I have also read https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94801, but I am not sure whether it is related.
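
For comparison, the assumption can be spelled out by hand. The sketch below is a hypothetical variant of `bz` (the name `bz_assumed` and the local definition of `r` are not part of the original test case) that uses `__builtin_unreachable()` to promise that `value` is never zero. With that hint the comparison is expected to fold to a constant, which is the result this report asks GCC to derive from the `__builtin_ctz` call alone.

```c
/* Hypothetical variant of bz() that states the non-zero assumption
   explicitly.  Needs GCC or Clang for the __builtin_* functions.  */
int r;  /* defined here only to keep the sketch self-contained */

int
bz_assumed(int value)
  {
    if (value == 0)
      __builtin_unreachable();  /* caller promises value != 0 */

    r = __builtin_ctz(value);
    return value != 0;  /* the optimizer may now fold this to 1 */
  }
```

Calling `bz_assumed` with a zero argument is undefined, exactly as calling `__builtin_ctz(0)` is.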
Comment 1 Martin Liška 2023-01-09 12:28:11 UTC
May be an opportunity for Ranger?
Comment 2 Aldy Hernandez 2023-01-09 15:55:07 UTC
(In reply to Martin Liška from comment #1)
> May be an opportunity for Ranger?

Hmmm... I don't think so:

    <bb 2> :
    value.0_1 = (unsigned int) value_4(D);
    _2 = __builtin_ctz (value.0_1);
    r = _2;
    _3 = value_4(D) != 0;
    _7 = (int) _3;
    return _7;

We could add an op1_range operator to class cfn_clz to return nonzero for op1, but that would only work if we knew _2 to be anything...and have no info on _2.
Comment 3 Andrew Pinski 2023-01-09 16:22:15 UTC
Not always.
It depends on the definition of CTZ_DEFINED_VALUE_AT_ZERO.

/* The value at zero is only defined for the BMI instructions
   LZCNT and TZCNT, not the BSR/BSF insns in the original isa.  */
#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
        ((VALUE) = GET_MODE_BITSIZE (MODE), TARGET_BMI ? 2 : 0)
#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
        ((VALUE) = GET_MODE_BITSIZE (MODE), TARGET_LZCNT ? 2 : 0)



So assuming the argument is != 0 with -mbmi would be an invalid assumption.
Comment 4 Jakub Jelinek 2023-01-09 16:37:51 UTC
I think 0 argument for __builtin_c[lt]z{,l,ll,imax} is always undefined, 0 argument
to .C[LT]Z (internal calls) is undefined if C[LT]Z_DEFINED_VALUE_AT_ZERO is not 2 and
0 argument to C[LT]Z RTL is undefined if C[LT]Z_DEFINED_VALUE_AT_ZERO is not non-zero.
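
The three rules above can be written down as a small decision function. This is only a model of the statement in this comment, not GCC source; `ctz_form`, the `FORM_*` names and `zero_argument_is_undefined` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Hypothetical names: this models the rules stated above, it is not GCC code.  */
enum ctz_form
  {
    FORM_BUILTIN,      /* __builtin_c[lt]z{,l,ll,imax} in user code */
    FORM_INTERNAL_FN,  /* .C[LT]Z internal call */
    FORM_RTL           /* C[LT]Z RTL expression */
  };

/* defined_value_at_zero is the result of C[LT]Z_DEFINED_VALUE_AT_ZERO
   for the mode in question: 0, 1 or 2.  */
bool
zero_argument_is_undefined(enum ctz_form form, int defined_value_at_zero)
  {
    switch (form)
      {
      case FORM_BUILTIN:
        return true;                        /* always undefined */
      case FORM_INTERNAL_FN:
        return defined_value_at_zero != 2;  /* defined only for 2 */
      case FORM_RTL:
        return defined_value_at_zero == 0;  /* defined for 1 and 2 */
      }
    return true;  /* not reached for valid enum values */
  }
```

Under this reading, `-mbmi` on x86 (where the macro yields 2) makes a zero argument defined for the internal call and the RTL, while `__builtin_ctz(0)` in user code stays undefined.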
Comment 5 Andrew Macleod 2023-01-09 18:49:00 UTC
(In reply to Aldy Hernandez from comment #2)
> (In reply to Martin Liška from comment #1)
> > May be an opportunity for Ranger?
> 
> Hmmm... I don't think so:
> 
>     <bb 2> :
>     value.0_1 = (unsigned int) value_4(D);
>     _2 = __builtin_ctz (value.0_1);
>     r = _2;
>     _3 = value_4(D) != 0;
>     _7 = (int) _3;
>     return _7;
> 
> We could add an op1_range operator to class cfn_clz to return nonzero for
> op1, but that would only work if we knew _2 to be anything...and have no
> info on _2.

Seems more like a candidate for gimpe_infer::gimple_infer (gimple *s).

The side effect to register would be to check whether 's' is a builtin_ctz call and, if so, call add_nonzero (operand1), provided that whatever other conditions make this true are met.

That should register a non-zero inferred range on value.0_1 after the assignment of _2.


=========== BB 2 ============
Partial equiv (value.0_1 pe32 value_4(D))
    <bb 2> :
    value.0_1 = (unsigned int) value_4(D);
    _2 = __builtin_ctz (value.0_1);
    r = _2;
    _3 = value_4(D) != 0;
    _7 = (int) _3;
    return _7;

I see ranger also registers a 32 bit equivalence between value.0_1 and value_4, so in theory we would then be able to determine that value_4 is also non-zero for the comparison.
Comment 6 Aldy Hernandez 2023-01-09 19:02:26 UTC
Huh.  Didn't know you could do that.  Thanks.

FWIW, the function is actually:

gimple_infer_range::gimple_infer_range (gimple *s)
Comment 7 LIU Hao 2023-01-10 08:05:07 UTC
(In reply to Jakub Jelinek from comment #4)
> I think 0 argument for __builtin_c[lt]z{,l,ll,imax} is always undefined, 0
> argument
> to .C[LT]Z (internal calls) is undefined if C[LT]Z_DEFINED_VALUE_AT_ZERO is
> not 2 and
> 0 argument to C[LT]Z RTL is undefined if C[LT]Z_DEFINED_VALUE_AT_ZERO is not
> non-zero.

I agree with this.


#94801 mentioned the `if(value == 0) __builtin_unreachable();` trick, but that isn't an option if the argument can really be zero:

(https://gcc.godbolt.org/z/dePvcMhTr)
```
#include <stdint.h>

uint32_t
my_tzcnt(uint32_t value)
  {
    return (value == 0) ? 32 : __builtin_ctz(value);
  }
```

This can compile to a single TZCNT instruction if the CPU supports it.
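
A quick way to sanity-check the wrapper is to compare it against a bit-by-bit loop. In the sketch below `my_tzcnt` is repeated from above so that the block is self-contained; `ref_tzcnt` is a hypothetical reference helper that is not part of this report.

```c
#include <stdint.h>

uint32_t
my_tzcnt(uint32_t value)
  {
    return (value == 0) ? 32 : __builtin_ctz(value);
  }

/* Hypothetical reference: count trailing zero bits one at a time.  */
uint32_t
ref_tzcnt(uint32_t value)
  {
    uint32_t count = 0;

    if (value == 0)
      return 32;  /* same convention as TZCNT on a 32-bit operand */

    while ((value & 1) == 0)
      {
        value >>= 1;
        count++;
      }
    return count;
  }
```

Both functions should agree for every input, including zero, which is the case the `__builtin_unreachable()` trick cannot cover.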