[Bug tree-optimization/92645] Hand written vector code is 450 times slower when compiled with GCC compared to Clang

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Jan 13 10:45:20 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92645

--- Comment #24 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 49958
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49958&action=edit
unincluded GCC source

The preprocessed GCC source no longer compiles, because the x86 intrinsic
headers expanded into it predate later changes to those headers:

...
/aux/hubicka/trunk-install2/lib/gcc/x86_64-pc-linux-gnu/10.0.0/include/avx512vlbwintrin.h: In function 'void _mm_mask_cvtsepi16_storeu_epi8(void*, __mmask8, __m128i)':
/aux/hubicka/trunk-install2/lib/gcc/x86_64-pc-linux-gnu/10.0.0/include/avx512vlbwintrin.h:258:38: error: cannot convert '__v8qi*' to 'long long unsigned int*'
<built-in>: note:   initializing argument 1 of 'void __builtin_ia32_pmovswb128mem_mask(long long unsigned int*, __vector(8) short int, unsigned char)'
In file included from /aux/hubicka/trunk-install2/lib/gcc/x86_64-pc-linux-gnu/10.0.0/include/immintrin.h:69,
                 from /aux/hubicka/firefox-2019-2/gfx/skia/skia/src/opts/SkOpts_
...
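
For illustration only, here is a minimal sketch of the kind of mismatch the
diagnostic reports: a wrapper written against one pointer type calling a
function whose first parameter is 'unsigned long long*'. The names
store_low64, store_through_void, roundtrip and v8qi_t are hypothetical
stand-ins, not the real builtin or the real header code, and the cast shown
is only one way such a call can be made to type-check.

```cpp
#include <cassert>

// GCC/Clang vector extension: eight 8-bit lanes, analogous to __v8qi.
typedef char v8qi_t __attribute__((vector_size(8)));

// Hypothetical stand-in for a builtin whose first parameter is
// 'unsigned long long*', as in the note line of the diagnostic.
inline void store_low64(unsigned long long* dst, unsigned long long bits) {
    assert(dst != nullptr);
    *dst = bits;
}

// Wrapper in the style of an intrinsic that takes an untyped pointer.
inline void store_through_void(void* p, unsigned long long bits) {
    // A call written for an older signature would be rejected in C++:
    //   store_low64((v8qi_t*)p, bits);
    //   error: cannot convert 'v8qi_t*' to 'unsigned long long*'
    // Casting to the parameter's current type makes the call well-formed.
    store_low64(static_cast<unsigned long long*>(p), bits);
}

// Small helper so the wrapper can be exercised on a correctly typed object.
inline unsigned long long roundtrip(unsigned long long bits) {
    unsigned long long out = 0;
    store_through_void(&out, bits);
    return out;
}
```

Because preprocessed source carries a frozen copy of the headers, a call of
the commented-out form stays in the file even after the installed headers and
the compiler's builtin declarations have moved on.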

I have attached unincluded source that compiles with both trunk and GCC 10
when using -march=haswell.

