This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



alpha+loop unrolling=craziness


With this double loop:

#include <stdlib.h>

#define FLOAT float

void bench (int nvec)
{
  int *idat;
  int ireps;
  int num;
  int i,j;
  
  ireps = 1024*2048/nvec;
  num = nvec*1024/sizeof(int);
  
  idat = (int *) malloc(num*sizeof(int));
  
  for (j=0; j<ireps; ++j)
    {
      for (i=0; i<num; ++i)
        idat[i] = 18;
    }

  free(idat);
}

when compiled with options -O3 -funroll-all-loops on alphaev6-unknown-linux-gnu
by gcc-2.95.1, the main body of the inner loop is unrolled four times
and compiled to

$L40:
	stl $5,0($2)
	stl $5,4($2)
	addl $4,4,$4
	stl $5,8($2)
	stl $5,12($2)
	cmplt $4,$9,$1
	addq $2,16,$2
	bne $1,$L40

which is scheduled in 3 cycles. But with 

popov-80% /export/u10/egcs-test/bin/gcc -v
Reading specs from /export/u10/egcs-test/lib/gcc-lib/alphaev6-unknown-linux-gnu/2.97/specs
Configured with:  --prefix=/export/u10/egcs-test --enable-checking=no
gcc version 2.97 20001023 (experimental)

this inner loop is compiled to

$L40:
	lda $1,1($4)
	stl $5,0($2)
	stl $5,4($2)
	lda $1,3($1)
	stl $5,8($2)
	stl $5,12($2)
	lda $2,16($2)
$L64:
	addl $1,$31,$4
	cmplt $4,$9,$1
	bne $1,$L40

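which is scheduled in 6 cycles.  Something is choosing a really
slow way to add 4 to register $4 here: instead of the single
addl $4,4,$4 of the 2.95.1 code, the increment is split into
lda $1,1($4), lda $1,3($1) and addl $1,$31,$4, each of which
depends on the result of the one before it.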
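
To make the comparison concrete, here is a rough C model of the two
listings. It is only a sketch, not what either compiler actually does
internally. The names fill_unrolled4, step_direct, step_split, sext32
and check_equivalence are made up for illustration, and the remainder
loop is an assumption, since the listings above show only the main
unrolled body.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of x, as Alpha longword operations do. */
static int64_t sext32(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    return (lo & 0x80000000u) ? (int64_t)lo - INT64_C(4294967296)
                              : (int64_t)lo;
}

/* Model of the 2.95.1 counter update: addl $4,4,$4 (one instruction). */
static int64_t step_direct(int64_t r4)
{
    return sext32((uint64_t)r4 + 4u);
}

/* Model of the 2.97 counter update: three dependent instructions. */
static int64_t step_split(int64_t r4)
{
    uint64_t r1 = (uint64_t)r4 + 1u;  /* lda  $1,1($4)              */
    r1 = r1 + 3u;                     /* lda  $1,3($1)              */
    return sext32(r1 + 0u);           /* addl $1,$31,$4 ($31 is 0)  */
}

/* Hand-written four-way unrolling of the inner loop.  The cleanup
   loop for leftover elements is an assumption, not from the listings. */
static void fill_unrolled4(int *idat, int num, int value)
{
    int i = 0;

    for (; num - i >= 4; i += 4)
      {
        idat[i]     = value;  /* stl $5,0($2)  */
        idat[i + 1] = value;  /* stl $5,4($2)  */
        idat[i + 2] = value;  /* stl $5,8($2)  */
        idat[i + 3] = value;  /* stl $5,12($2) */
      }
    for (; i < num; ++i)
      idat[i] = value;
}

/* Both update sequences should yield the same counter value. */
static void check_equivalence(void)
{
    int32_t c;

    for (c = 0; c < 4096; c += 4)
      assert(step_split(c) == step_direct(c));
}
```

Calling check_equivalence() should pass silently: in this model the
two sequences compute the same value, so the difference between the
listings is only in how many dependent instructions it takes to get
there.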

Perhaps this test loop is simple enough that someone can see what
is going on.

Brad Lucier
