[Bug target/56676] New: unnecessary split load when using avx2
neleai at seznam dot cz
gcc-bugzilla@gcc.gnu.org
Thu Mar 21 12:57:00 GMT 2013
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56676
Bug #: 56676
Summary: unnecessary split load when using avx2
Classification: Unclassified
Product: gcc
Version: 4.7.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: neleai@seznam.cz
Compile the following well-known example
int foo(int *a, int *b)
{
  int i;
  int r = 0;
  for (i = 0; i < 32; i++)
    r += a[i] * b[i];
  return r;
}
with -O3 -mavx2. GCC generates code that is suboptimal in several ways.
The part relevant to this bug is that each 32-byte load is split into two
16-byte loads (a vmovdqu into %xmm1 followed by a vinserti128 of the upper half):
.L5:
vmovdqu (%r8,%rdx), %xmm1
addl $1, %ecx
vinserti128 $0x1, 16(%r8,%rdx), %ymm1, %ymm1
vpmulld (%rbx,%rdx), %ymm1, %ymm1
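
For comparison, here is a minimal sketch of the same loop written with AVX2
intrinsics, requesting one 32-byte unaligned load per operand. The name
foo_unsplit is hypothetical and not part of the report. Whether the compiler
keeps each load as a single instruction depends on its version and tuning
defaults. A scalar fallback is included so the sketch still builds without
-mavx2.

```c
#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not from the report): dot product of 32 ints. */
int foo_unsplit(const int *a, const int *b)
{
#if defined(__AVX2__)
    __m256i acc = _mm256_setzero_si256();
    for (int i = 0; i < 32; i += 8) {
        /* Request one 32-byte unaligned load per operand. */
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        /* Lane-wise 32-bit multiply (vpmulld), then accumulate. */
        acc = _mm256_add_epi32(acc, _mm256_mullo_epi32(va, vb));
    }
    /* Horizontal sum of the eight 32-bit lanes. */
    __m128i lo = _mm256_castsi256_si128(acc);
    __m128i hi = _mm256_extracti128_si256(acc, 1);
    __m128i s = _mm_add_epi32(lo, hi);
    s = _mm_add_epi32(s, _mm_shuffle_epi32(s, _MM_SHUFFLE(1, 0, 3, 2)));
    s = _mm_add_epi32(s, _mm_shuffle_epi32(s, _MM_SHUFFLE(2, 3, 0, 1)));
    return _mm_cvtsi128_si32(s);
#else
    /* Scalar fallback, same arithmetic as foo() in the report. */
    int r = 0;
    for (int i = 0; i < 32; i++)
        r += a[i] * b[i];
    return r;
#endif
}
```

Compiling with gcc -O3 -mavx2 -S and inspecting the loop shows whether the
loads stay whole. GCC also documents an -mavx256-split-unaligned-load option.
Passing -mno-avx256-split-unaligned-load may avoid the split, though that has
not been checked against 4.7.1 here.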