[Bug target/56676] New: unnecessary split load when using avx2

neleai at seznam dot cz gcc-bugzilla@gcc.gnu.org
Thu Mar 21 12:57:00 GMT 2013


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56676

             Bug #: 56676
           Summary: unnecessary split load when using avx2
    Classification: Unclassified
           Product: gcc
           Version: 4.7.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: neleai@seznam.cz


Compile the following well-known example
int foo(int *a, int *b) {
  int i;
  int r = 0;
  for (i = 0; i < 32; i++)
    r += a[i] * b[i];
  return r;
}
with -O3 -mavx2. GCC generates code that is suboptimal in several ways.
The part relevant to this bug is that each 32-byte load is split into two
16-byte loads (a vmovdqu into %xmm1 followed by a vinserti128 into %ymm1):

.L5:
  vmovdqu (%r8,%rdx), %xmm1
  addl  $1, %ecx
  vinserti128 $0x1, 16(%r8,%rdx), %ymm1, %ymm1
  vpmulld (%rbx,%rdx), %ymm1, %ymm1
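
For comparison, here is a rough sketch of the load pattern the report seems
to expect, written with AVX2 intrinsics so that each 32-byte chunk is read
with a single unaligned load (_mm256_loadu_si256, which normally maps to one
vmovdqu with a %ymm destination). The names dot32_scalar, dot32_avx2 and
dot32 are hypothetical and not part of the report; the sketch assumes GCC or
a compatible compiler on x86-64 and falls back to the scalar loop elsewhere.
The split in the listing above may come from the x86 tuning option
-mavx256-split-unaligned-load; rebuilding with
-mno-avx256-split-unaligned-load is one way to check, though that is an
assumption about this case and not a confirmed diagnosis.

```c
#if defined(__GNUC__) && defined(__x86_64__)
#include <immintrin.h>
#define DOT32_HAVE_AVX2_PATH 1   /* hypothetical guard macro for this sketch */
#endif

/* Scalar reference: same computation as foo() in the report. */
static int dot32_scalar(const int *a, const int *b)
{
    int r = 0;
    for (int i = 0; i < 32; i++)
        r += a[i] * b[i];
    return r;
}

#ifdef DOT32_HAVE_AVX2_PATH
/* AVX2 version: one unaligned 32-byte load per operand per iteration.
   The target attribute lets this compile without passing -mavx2. */
__attribute__((target("avx2")))
static int dot32_avx2(const int *a, const int *b)
{
    __m256i acc = _mm256_setzero_si256();
    for (int i = 0; i < 32; i += 8) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        /* vpmulld: low 32 bits of each 32x32 product */
        acc = _mm256_add_epi32(acc, _mm256_mullo_epi32(va, vb));
    }
    /* Horizontal sum of the eight 32-bit lanes. */
    __m128i lo = _mm256_castsi256_si128(acc);
    __m128i hi = _mm256_extracti128_si256(acc, 1);
    __m128i s  = _mm_add_epi32(lo, hi);
    s = _mm_hadd_epi32(s, s);
    s = _mm_hadd_epi32(s, s);
    return _mm_cvtsi128_si32(s);
}
#endif

/* Dispatcher: use the AVX2 path only if the running CPU supports it. */
int dot32(const int *a, const int *b)
{
#ifdef DOT32_HAVE_AVX2_PATH
    if (__builtin_cpu_supports("avx2"))
        return dot32_avx2(a, b);
#endif
    return dot32_scalar(a, b);
}
```

Compiling this with -O3 -S and looking at the loop in dot32_avx2 gives a
reference point for what the auto-vectorized foo() could look like without
the vmovdqu/vinserti128 pair.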


