The following example:

#include <stdint.h>

int f( int16_t a[16] )
{
  int res = 0;
  for ( int i = 0; i < 16; i++ )
    res |= (a[i]);
  return res;
}

gets vectorized with the ORs performed as int32. Since | can't overflow or underflow, the ORs could have been done as int16, with the result widened once at the end. This saves 2 ORs and 4 widenings.

I would have expected over-widening detection to handle this, but it fails because VRP has no range information for `a`. In this case the narrowing is safe based on the type alone.
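To illustrate the equivalence at the source level, here is a minimal sketch. It is not what the vectorizer emits; the names f_wide, f_narrow and self_check are hypothetical and only serve the comparison. f_wide is the reduction from the report, and f_narrow does the ORs in int16_t and sign-extends once on return.

```c
#include <assert.h>
#include <stdint.h>

/* Reduction as written in the report: each element is promoted to int
   before the OR. */
int f_wide(const int16_t a[16])
{
    int res = 0;
    for (int i = 0; i < 16; i++)
        res |= a[i];
    return res;
}

/* Hypothetical narrowed form: accumulate in int16_t and sign-extend
   once on return. The cast is value-preserving because the OR of two
   sign-extended 16-bit values is itself a sign-extended 16-bit value. */
int f_narrow(const int16_t a[16])
{
    int16_t res = 0;
    for (int i = 0; i < 16; i++)
        res = (int16_t)(res | a[i]);
    return res;
}

/* Compare both forms on inputs that include negative elements. */
void self_check(void)
{
    int16_t a[16] = { 0 };
    assert(f_wide(a) == f_narrow(a));

    a[3] = INT16_MIN;
    a[7] = 0x00ff;
    assert(f_wide(a) == f_narrow(a));

    a[15] = -1;
    assert(f_wide(a) == -1 && f_narrow(a) == -1);
}
```

Both functions agree for any input because sign extension distributes over bitwise OR, which is why no range information for `a` should be needed here.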
Confirmed.