[Bug c++/100973] New: gcc does not optimise based on knowing that `_mm256_movemask_ps` returns less than 255

denis.yaroshevskij at gmail dot com gcc-bugzilla@gcc.gnu.org
Tue Jun 8 17:33:01 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100973

            Bug ID: 100973
           Summary: gcc does not optimise based on knowing that
                    `_mm256_movemask_ps` returns less than 255
           Product: gcc
           Version: og10 (devel/omp/gcc-10)
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: denis.yaroshevskij at gmail dot com
  Target Milestone: ---

Options: -O3 -std=c++20 -DNDEBUG -mavx

Code:

```
#include <immintrin.h>

int masking_should_evaporate(__m256 values) {
  int top_bits = _mm256_movemask_ps(values);
  top_bits &= 255;
  return top_bits;
}
```

Godbolt: https://gcc.godbolt.org/z/a81qPWcon


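In this code `top_bits &= 255` is a no-op: `_mm256_movemask_ps` sets only the
low 8 bits of its result (one per float lane), so the value is always in the
range 0 to 255. GCC does not use this to drop the mask. Clang does: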

```
       vmovmskps       eax, ymm0
       vzeroupper
       ret
```

This pattern comes from real code.
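
To illustrate why the `& 255` cannot change the result, here is a rough scalar
sketch of what the instruction computes. It assumes 32-bit IEEE-754 `float`,
and the names `movemask_ps_model` and `masking_is_a_no_op` are made up for this
example; it is not how either compiler models the intrinsic internally.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Scalar model of vmovmskps on a 256-bit register (hypothetical helper):
// bit i of the result is the sign bit of float lane i.
// Assumes float is a 32-bit IEEE-754 type.
int movemask_ps_model(const std::array<float, 8>& lanes) {
  int mask = 0;
  for (int i = 0; i < 8; ++i) {
    std::uint32_t bits;
    std::memcpy(&bits, &lanes[i], sizeof bits);  // well-defined type pun
    mask |= static_cast<int>(bits >> 31) << i;   // sign bit into position i
  }
  return mask;  // only bits 0..7 can be set, so the value is in [0, 255]
}

// Same shape as the function in the report, written against the model.
int masking_is_a_no_op(const std::array<float, 8>& lanes) {
  int top_bits = movemask_ps_model(lanes);
  return top_bits & 255;  // never clears a bit that the model can set
}
```

With only eight lanes there are only eight result bits, so a value-range aware
optimiser could in principle treat the mask as redundant.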

