[Bug target/115069] [14/15 regression] 8 bit integer vector performance regression, x86, between gcc-14 and gcc-13 using avx2 target clones on skylake platform
ubizjak at gmail dot com
gcc-bugzilla@gcc.gnu.org
Fri May 17 08:48:22 GMT 2024
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115069
--- Comment #9 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #8)
> A better patch:
The real issue is that the following permutation (truncation):
+ for (i = 0; i < d.nelt; ++i)
+ d.perm[i] = i * 2;
+
+ ok = ix86_expand_vec_perm_const_1 (&d);
results in slow code involving VPERMQ. Ideally, ix86_expand_vec_perm_const_1
should emit faster code for truncation, because this will benefit other code as
well.
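
To illustrate why the permutation above amounts to a truncation, here is a
minimal scalar sketch. It is not GCC code: NELT, build_even_perm, apply_perm
and even_perm_is_truncation are hypothetical names, and the 16-bit to 8-bit
element widths and little-endian byte layout are assumptions chosen to match
the x86 case discussed in this PR. Selecting every even-indexed byte of a
little-endian vector of 16-bit values keeps exactly the low byte of each
value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NELT 16u  /* hypothetical result width in bytes (assumption) */

/* Same loop as in the quoted patch: result element i takes source
   element 2*i.  */
static void build_even_perm(unsigned char *perm, size_t nelt)
{
    assert(nelt <= 128);  /* keeps i * 2 within unsigned char range */
    for (size_t i = 0; i < nelt; ++i)
        perm[i] = (unsigned char)(i * 2);
}

/* Scalar model of a one-operand constant permutation:
   dst[i] = src[perm[i]].  */
static void apply_perm(uint8_t *dst, const uint8_t *src,
                       const unsigned char *perm, size_t nelt)
{
    for (size_t i = 0; i < nelt; ++i)
        dst[i] = src[perm[i]];
}

/* Returns 1 if the even-byte permutation gives the same result as
   truncating each 16-bit element to 8 bits, 0 otherwise.  */
static int even_perm_is_truncation(void)
{
    uint16_t wide[NELT];
    uint8_t bytes[2 * NELT];  /* wide[] written out in little-endian order */
    unsigned char perm[NELT];
    uint8_t permuted[NELT];

    for (size_t i = 0; i < NELT; ++i) {
        wide[i] = (uint16_t)(0xA500u + 17u * i);
        bytes[2 * i] = (uint8_t)(wide[i] & 0xFF);  /* low byte first */
        bytes[2 * i + 1] = (uint8_t)(wide[i] >> 8);
    }

    build_even_perm(perm, NELT);
    apply_perm(permuted, bytes, perm, NELT);

    for (size_t i = 0; i < NELT; ++i)
        if (permuted[i] != (uint8_t)wide[i])
            return 0;
    return 1;
}
```

The sketch only models the data movement; it says nothing about which
instruction sequence the expander should pick for it.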