Bug 59578 - Overuse of v prefix for SSE instructions
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.8.2
Importance: P3 minor
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2013-12-22 16:50 UTC by Mike Sharov
Modified: 2013-12-22 20:23 UTC

Description Mike Sharov 2013-12-22 16:50:38 UTC
typedef float v16sf __attribute__((vector_size(16)));
v16sf f (v16sf x)
{ return (__builtin_ia32_shufps (x, x, 0xff)); }

Compiling this on a Haswell 4770 with -march=native -O emits:

vshufps $255, %xmm0, %xmm0, %xmm0

Even though all registers are the same and

shufps $255, %xmm0, %xmm0

would have worked just as well without the extra byte for the v prefix.
This happens with other __builtin instructions as well. For example:

typedef long long v16so __attribute__((vector_size(16)));
v16so k (v16so x)
{ return (__builtin_ia32_aeskeygenassist128 (x, 1)); }

This emits vaeskeygenassist even though no memory accesses are present.
Comment 1 Jakub Jelinek 2013-12-22 20:23:53 UTC
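That is intentional. With -march=native on an AVX-capable CPU, AVX is enabled, and GCC then uses the VEX-encoded (v-prefixed) form for SSE instructions so that legacy SSE and AVX code are not mixed. Please read about SSE to AVX transition penalties, e.g. http://software.intel.com/sites/default/files/m/d/4/1/d/8/11MC12_Avoiding_2BAVX-SSE_2BTransition_2BPenalties_2Brh_2Bfinal.pdf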
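
To check which mode a given build is in, here is a minimal sketch for x86/x86-64 with GCC. It uses the portable intrinsic _mm_shuffle_ps from <xmmintrin.h> in place of the reporter's __builtin_ia32_shufps, and tests the predefined macro __AVX__, which GCC defines when AVX code generation is enabled. The helper names broadcast_lane3 and avx_enabled are hypothetical, chosen only for illustration.

```c
#include <xmmintrin.h>  /* SSE intrinsics; x86/x86-64 only */

/* Same operation as the report: an immediate of 0xff selects
   element 3 for every lane, so the result is x[3] in all four lanes. */
static __m128 broadcast_lane3(__m128 x)
{
    return _mm_shuffle_ps(x, x, 0xff);
}

/* Hypothetical helper: returns 1 if this translation unit was
   compiled with AVX enabled (e.g. -mavx, or -march=native on an
   AVX-capable CPU), in which case GCC is expected to emit the
   v-prefixed (VEX) form of the shuffle; returns 0 otherwise. */
static int avx_enabled(void)
{
#ifdef __AVX__
    return 1;
#else
    return 0;
#endif
}
```

Building with -mno-avx should leave __AVX__ undefined and make GCC fall back to the legacy shufps encoding. Comparing the output of gcc -S with and without that flag is one way to confirm this on a particular setup.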