This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug target/22497] New: A register is wasted in simple vectorised loops


Hello!

Consider this simple testcase:

#define N 16

short ia[N];
short ic[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
short ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};


int main ()
{
  int i;

  for (i = 0; i < N; i++)
    ia[i] = ib[i] + ic[i];

  return 0;
}

The loop in this testcase is compiled with 'gcc -O2 -ftree-vectorize -msse2' 
into:

.L2:
	movdqa	ib(%eax), %xmm0
	paddw	ic(%eax), %xmm0
	incl	%edx
	movdqa	%xmm0, ia(%eax)
	addl	$16, %eax
	cmpl	$2, %edx
	jne	.L2

There is no (,%reg,16) SIB addressing mode on i386 (the scale factor is limited
to 1, 2, 4 or 8), and it looks to me as if the loop optimizer falls back to the
simplest addressing mode in this case. Unfortunately, the %edx register is
wasted in the above code: it serves only as a loop counter.
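
To illustrate why %edx is redundant, here is a rough C model of the two
counters in the loop above. The names (trips_two_counters, trips_one_counter,
VEC_BYTES, TOTAL_BYTES) are made up for this sketch and do not come from GCC;
the variables eax and edx merely stand in for the registers. It only shows
that the trip counter is always the byte offset divided by 16, so the exit
test can be expressed in terms of the byte offset alone.

```c
#include <assert.h>

/* 16 shorts = 32 bytes = two 16-byte SSE2 vectors (assumes 2-byte short). */
enum { VEC_BYTES = 16, TOTAL_BYTES = 32 };

/* Mirrors the emitted loop: eax is the byte offset, edx a separate counter. */
int trips_two_counters(void)
{
    unsigned eax = 0, edx = 0;
    do {
        /* the vector load/add/store at offset eax would go here */
        edx++;                                  /* incl %edx      */
        eax += VEC_BYTES;                       /* addl $16, %eax */
        assert(eax == edx * VEC_BYTES);         /* edx adds no information */
    } while (edx != TOTAL_BYTES / VEC_BYTES);   /* cmpl $2, %edx  */
    return (int)edx;
}

/* Same loop with the exit test rewritten on the byte offset. */
int trips_one_counter(void)
{
    unsigned eax = 0;
    int trips = 0;  /* bookkeeping for the comparison only, not part of the loop */
    do {
        trips++;
        eax += VEC_BYTES;                       /* addl $16, %eax */
    } while (eax != TOTAL_BYTES);               /* cmpl $32, %eax */
    return trips;
}
```

Both functions run the loop body the same number of times; the second one
needs only a single live counter.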

Better code would be:

.L2:
	movdqa	ib(,%eax,8), %xmm0
	paddw	ic(,%eax,8), %xmm0
	movdqa	%xmm0, ia(,%eax,8)
	addl	$2, %eax
	cmpl	$4, %eax
	jne	.L2

or with the simplest addressing scheme:

.L2:
	movdqa	ib(%eax), %xmm0
	paddw	ic(%eax), %xmm0
	movdqa	%xmm0, ia(%eax)
	addl	$16, %eax
	cmpl	$32, %eax
	jne	.L2
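
For reference, the two variants above correspond roughly to the following
C-level loops. This is only a sketch: add_chunk is a scalar stand-in for the
movdqa/paddw/movdqa sequence, and the function names are hypothetical, chosen
for this example. It assumes a 2-byte short, as on i386.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 16
enum { VEC_BYTES = 16, LANES = 8 };  /* eight 16-bit lanes per SSE2 vector */

/* Scalar stand-in for one movdqa / paddw / movdqa step on a 16-byte chunk. */
static void add_chunk(char *dst, const char *a, const char *b)
{
    short x[LANES], y[LANES];
    assert(sizeof(short) * LANES == VEC_BYTES);  /* assumption of this sketch */
    memcpy(x, a, VEC_BYTES);
    memcpy(y, b, VEC_BYTES);
    for (int k = 0; k < LANES; k++)
        x[k] = (short)(x[k] + y[k]);             /* paddw wraps modulo 2^16 */
    memcpy(dst, x, VEC_BYTES);
}

/* First variant: counter stepped by 2, scaled by 8 in the address
   (addl $2, %eax / cmpl $4, %eax). */
void add_by_scaled_index(short *ia, const short *ib, const short *ic)
{
    for (size_t i = 0; i != N * sizeof(short) / 8; i += 2)
        add_chunk((char *)ia + i * 8,
                  (const char *)ib + i * 8,
                  (const char *)ic + i * 8);
}

/* Second variant: the byte offset is the only counter
   (addl $16, %eax / cmpl $32, %eax). */
void add_by_offset(short *ia, const short *ib, const short *ic)
{
    for (size_t off = 0; off != N * sizeof(short); off += VEC_BYTES)
        add_chunk((char *)ia + off,
                  (const char *)ib + off,
                  (const char *)ic + off);
}
```

In both loops a single variable drives the addressing and the exit test, which
is the point of the suggested code.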

Uros.

-- 
           Summary: A register is wasted in simple vectorised loops
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P2
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: uros at kss-loka dot si
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22497

