restrict keyword has no effect?

Bingfeng Mei bmei@broadcom.com
Wed Jan 23 16:43:00 GMT 2008


Hello,
We are porting GCC 4.2.1 to our VLIW processor. To improve performance,
support for the restrict keyword is imperative. From what I have learned
from the GCC documentation, "restrict" should be well supported since
GCC 3. However, I found that it doesn't improve the schedule even for a
simple example.


void foo (int * restrict a, int * restrict b, int * restrict c) {
  unsigned i;

  for (i=0; i<256; i++){
    a[i] = b[i] + c[i];
  }
}

This is not only a problem for our own port. I also tried compiling for
the ARM target:
arm-elf-gcc vectorize.c -O3 -std=c99 -S -funroll-all-loops -fdump-tree-all

It just generates sequences of load/load/store, as the code's natural
order suggests. The scheduler never tries to move a load ahead of the
preceding store instruction in order to reduce the cycle count.

foo:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	ldr	ip, [r2, #0]
	stmfd	sp!, {r4, lr}
	mov	r4, r1
	ldr	r1, [r1, #0]
	mov	lr, r2
	add	r2, ip, r1
	str	r2, [r0, #0]
	mov	ip, #4
	ldr	r1, [ip, lr]
	ldr	r3, [ip, r4]
	add	r2, r1, r3
	str	r2, [ip, r0]
	add	r3, ip, #4
	ldr	r1, [r3, r4]
	ldr	r2, [r3, lr]
	add	r2, r2, r1
	str	r2, [r3, r0]
	add	r3, r3, #4
	ldr	r2, [r3, lr]
	ldr	r1, [r3, r4]
	add	r2, r2, r1
	str	r2, [r3, r0]
	add	ip, ip, #12
.L2:
	ldr	r1, [ip, lr]
	ldr	r3, [ip, r4]
	add	r2, r1, r3
	str	r2, [ip, r0]
	add	r3, ip, #4
	ldr	r1, [r3, r4]
	ldr	r2, [r3, lr]
	add	r2, r2, r1
	str	r2, [r3, r0]
	add	r3, r3, #4
	ldr	r1, [r3, r4]
	ldr	r2, [r3, lr]
	add	r2, r2, r1
	str	r2, [r3, r0]
      ...
	ldr	r3, [r1, lr]
	add	r3, r3, r2
	str	r3, [r1, r0]
	add	r2, ip, #32
	ldr	r3, [r2, lr]
	ldr	r1, [r2, r4]
	add	ip, ip, #36
	add	r3, r3, r1
	cmp	ip, #1024
	str	r3, [r2, r0]
	bne	.L2
	ldmfd	sp!, {r4, pc}
	.size	foo, .-foo
	.ident	"GCC: (GNU) 4.2.2"
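
To make the reordering I am hoping for concrete, here is a rough
hand-written sketch in C (not compiler output). The name foo_hoisted and
the unroll factor of two are only assumptions for illustration. Since
a, b and c are restrict-qualified, the loads for the next element should
be free to move ahead of the store for the current one:

```c
/* Hypothetical hand-scheduled variant of foo(), unrolled by two.
   Illustration only: this is written by hand, not generated by GCC. */
void foo_hoisted(int *restrict a, int *restrict b, int *restrict c)
{
  unsigned i;

  for (i = 0; i < 256; i += 2) {
    /* Issue all four loads first. With restrict, none of them can
       alias the stores below, so hoisting them is legal. */
    int b0 = b[i],     c0 = c[i];
    int b1 = b[i + 1], c1 = c[i + 1];

    /* Then perform both stores. */
    a[i]     = b0 + c0;
    a[i + 1] = b1 + c1;
  }
}
```

In the listing above, by contrast, every str is emitted before the next
pair of ldr instructions.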

I examined the tree-SSA dump files that were produced.

In 004t.gimple, the restrict keyword is preserved:
foo (a, b, c)
{
  unsigned int D.1352;
  int * D.1353;
  int * D.1354;
  int * D.1355;
  int D.1356;
  int * D.1357;
  int D.1358;
  int D.1359;
  unsigned int i;

  i = 0;
  goto <D1350>;
  <D1349>:;
  D.1352 = i * 4;
  D.1353 = (int * restrict) D.1352;
  D.1354 = D.1353 + a;
  D.1352 = i * 4;
  D.1353 = (int * restrict) D.1352;
  D.1355 = D.1353 + b;
  D.1356 = *D.1355;
  D.1352 = i * 4;
  D.1353 = (int * restrict) D.1352;
  D.1357 = D.1353 + c;
  D.1358 = *D.1357;
  D.1359 = D.1356 + D.1358;
  *D.1354 = D.1359;
  i = i + 1;
  <D1350>:;
  if (i <= 255)
    {
      goto <D1349>;
    }
  else
    {
      goto <D1351>;
    }
  <D1351>:;
}

But in the .final_cleanup file, the restrict keyword just disappears:
foo (a, b, c)
{
  long unsigned int ivtmp.49;

<bb 2>:
  MEM[base: a] = MEM[base: c] + MEM[base: b];
  ivtmp.49 = 4;

<L0>:;
  MEM[base: a, index: ivtmp.49] = MEM[base: c, index: ivtmp.49] + MEM[base: b, index: ivtmp.49];
  ivtmp.49 = ivtmp.49 + 4;
  if (ivtmp.49 != 1024) goto <L0>; else goto <L2>;

<L2>:;
  return;

}

Any hints on producing efficient code with the "restrict" keyword? Thanks
in advance.

Cheers,
Bingfeng Mei 
Broadcom UK


