[Bug tree-optimization/92645] Hand written vector code is 450 times slower when compiled with GCC compared to Clang
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Mon Nov 25 11:08:00 GMT 2019
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92645
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Kind-of a testcase for SSE2. This one has a matching BIT_FIELD_REF at least,
but it still "fails" at the vector source. Skia seems to pun to __int128
before doing the extracts somehow (maybe that's our intrinsics, who knows).
typedef unsigned short v8hi __attribute__((vector_size(16)));
typedef unsigned int v4si __attribute__((vector_size(16)));
void foo (v4si *dst, v8hi src)
{
  unsigned int tem[8];
  tem[0] = src[0];
  tem[1] = src[1];
  tem[2] = src[2];
  tem[3] = src[3];
  tem[4] = src[4];
  tem[5] = src[5];
  tem[6] = src[6];
  tem[7] = src[7];
  dst[0] = *(v4si *)tem;
  dst[1] = *(v4si *)&tem[4];
}
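
For comparison, the widening that the testcase spells out lane by lane
(eight 16-bit lanes zero-extended into two vectors of four 32-bit lanes)
can also be written with __builtin_convertvector, which GCC 9 and later
and Clang provide. This is only a sketch of the intended semantics, not a
statement about what code GCC generates for either form. The name
foo_convert and the 32-byte v8si typedef are made up for illustration and
are not part of the bug report.

```c
#include <string.h>

typedef unsigned short v8hi __attribute__((vector_size(16)));
typedef unsigned int v4si __attribute__((vector_size(16)));
/* Assumption: a 32-byte generic vector type, not in the original testcase.  */
typedef unsigned int v8si __attribute__((vector_size(32)));

/* Hypothetical counterpart to foo: zero-extend the eight 16-bit lanes of
   src and store them as two consecutive v4si vectors at dst.  */
void foo_convert (v4si *dst, v8hi src)
{
  /* Lane-wise conversion; unsigned short -> unsigned int zero-extends.  */
  v8si wide = __builtin_convertvector (src, v8si);
  /* Copy the 32 bytes out as dst[0] (lanes 0-3) and dst[1] (lanes 4-7).  */
  memcpy (dst, &wide, sizeof wide);
}
```

Calling foo_convert with the same arguments as foo should leave the same
values in dst[0] and dst[1], which makes it usable as a reference when
comparing the code generated for the two forms.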