This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug rtl-optimization/67553] Saturating SSE/AVX instructions do not get optimized
- From: "tmb99 at gmx dot net" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 11 Sep 2015 19:24:24 +0000
- Subject: [Bug rtl-optimization/67553] Saturating SSE/AVX instructions do not get optimized
- Auto-submitted: auto-generated
- References: <bug-67553-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67553
--- Comment #2 from tmb99 at gmx dot net ---
The same seems to hold for most saturating instructions:
__m128i v0 = _mm_setzero_si128();
__m128i v2 = _mm_setzero_si128();
__m128i sum = _mm_adds_epi16(v0,v2);
__m128i dif = _mm_subs_epi8(v0,v2);
__m128i hsum = _mm_hadds_epi16(v0,v2);
__m128i hdif = _mm_hsubs_epi16(v0,v2);
__m128i pacu = _mm_packus_epi16(v0,v2);
__m128i pacs = _mm_packs_epi32(v0,v2);
compiles to:
vpxor %xmm0, %xmm0, %xmm0
vpxor %xmm2, %xmm2, %xmm2
vphsubsw %xmm0, %xmm0, %xmm4
vpackuswb %xmm0, %xmm0, %xmm3
vphaddsw %xmm0, %xmm0, %xmm5
vpsubsb %xmm2, %xmm2, %xmm2
vpxor %xmm1, %xmm1, %xmm1
vpaddsw %xmm0, %xmm0, %xmm0
vpackssdw %xmm1, %xmm1, %xmm1
Also, three setzero/vpxor instructions are emitted instead of just one.