[Bug target/94962] Suboptimal AVX2 code for _mm256_zextsi128_si256(_mm_set1_epi8(-1))

nemo@self-evident.org gcc-bugzilla@gcc.gnu.org
Mon May 18 15:43:51 GMT 2020


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94962

--- Comment #5 from Nemo <nemo@self-evident.org> ---
(In reply to Jakub Jelinek from comment #2)

I would be happy if GCC could just emit optimal code (a single vpcmpeqd
instruction on an xmm register) for this useful constant:

    _mm256_set_m128i(_mm_setzero_si128(), _mm_set1_epi8(-1))

aka.

    _mm256_inserti128_si256(_mm256_setzero_si256(), _mm_set1_epi8(-1), 0)


(The latter is just what GCC uses to implement _mm256_zextsi128_si256, if I am
reading the headers correctly.)
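
A minimal sketch for checking that the two spellings above describe the same constant: all-ones in the low 128 bits, zero in the high 128 bits. The helper names (low_ones_set, low_ones_insert, low_ones_forms_agree) are made up for illustration and are not part of any header; the target attribute and __builtin_cpu_supports are GCC/Clang extensions, assumed here so the file builds without -mavx2. The single-instruction form is possible because a VEX-encoded 128-bit instruction zeroes bits 255:128 of its destination register, so producing all-ones in the xmm register already yields the full 256-bit value.

```c
#include <immintrin.h>
#include <string.h>

/* Form 1: _mm256_set_m128i(hi, lo) with a zero high half and an
   all-ones low half. */
__attribute__((target("avx2")))
static __m256i low_ones_set(void)
{
    return _mm256_set_m128i(_mm_setzero_si128(), _mm_set1_epi8(-1));
}

/* Form 2: insert the all-ones 128-bit value into lane 0 of a zeroed
   256-bit vector. */
__attribute__((target("avx2")))
static __m256i low_ones_insert(void)
{
    return _mm256_inserti128_si256(_mm256_setzero_si256(),
                                   _mm_set1_epi8(-1), 0);
}

/* Store both forms to memory and compare them byte for byte against
   the expected pattern: 16 bytes of 0xFF followed by 16 bytes of 0. */
__attribute__((target("avx2")))
static int compare_forms(void)
{
    unsigned char a[32], b[32], expected[32];

    _mm256_storeu_si256((__m256i *)a, low_ones_set());
    _mm256_storeu_si256((__m256i *)b, low_ones_insert());

    memset(expected, 0xFF, 16);
    memset(expected + 16, 0x00, 16);

    return memcmp(a, expected, 32) == 0 && memcmp(b, expected, 32) == 0;
}

/* Hypothetical entry point: returns 1 if both forms match the expected
   bytes, 0 if they do not, and -1 if the running CPU lacks AVX2 (in
   which case nothing is executed). */
int low_ones_forms_agree(void)
{
    if (!__builtin_cpu_supports("avx2"))
        return -1;
    return compare_forms();
}
```

This only confirms the value, not the code generation. To see which instructions a given compiler picks, compile the two helper functions with -O2 -mavx2 -S and read the assembly.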

It's a minor thing, but I was a little surprised to find that none of the
compilers I know of are able to do this. At least, not with any input I tried.

