restrict keyword has no effect?
Bingfeng Mei
bmei@broadcom.com
Wed Jan 23 16:43:00 GMT 2008
Hello,
We are porting GCC 4.2.1 to our VLIW processor. To get good performance,
support for the restrict keyword is essential. According to the GCC
documentation, "restrict" should have been well supported since GCC 3.
However, I find that it does not improve the schedule even for a simple
example:
void foo (int * restrict a, int * restrict b, int * restrict c) {
  unsigned i;
  for (i=0; i<256; i++){
    a[i] = b[i] + c[i];
  }
}
This is not only a problem with our own port. I also tried compiling for
an ARM target:
arm-elf-gcc vectorize.c -O3 -std=c99 -S -funroll-all-loops -fdump-tree-all
It just generates load/load/store sequences in the code's natural order.
The scheduler never tries to move a load above the preceding store
instruction in order to save cycles.
foo:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
ldr ip, [r2, #0]
stmfd sp!, {r4, lr}
mov r4, r1
ldr r1, [r1, #0]
mov lr, r2
add r2, ip, r1
str r2, [r0, #0]
mov ip, #4
ldr r1, [ip, lr]
ldr r3, [ip, r4]
add r2, r1, r3
str r2, [ip, r0]
add r3, ip, #4
ldr r1, [r3, r4]
ldr r2, [r3, lr]
add r2, r2, r1
str r2, [r3, r0]
add r3, r3, #4
ldr r2, [r3, lr]
ldr r1, [r3, r4]
add r2, r2, r1
str r2, [r3, r0]
add ip, ip, #12
.L2:
ldr r1, [ip, lr]
ldr r3, [ip, r4]
add r2, r1, r3
str r2, [ip, r0]
add r3, ip, #4
ldr r1, [r3, r4]
ldr r2, [r3, lr]
add r2, r2, r1
str r2, [r3, r0]
add r3, r3, #4
ldr r1, [r3, r4]
ldr r2, [r3, lr]
add r2, r2, r1
str r2, [r3, r0]
...
ldr r3, [r1, lr]
add r3, r3, r2
str r3, [r1, r0]
add r2, ip, #32
ldr r3, [r2, lr]
ldr r1, [r2, r4]
add ip, ip, #36
add r3, r3, r1
cmp ip, #1024
str r3, [r2, r0]
bne .L2
ldmfd sp!, {r4, pc}
.size foo, .-foo
.ident "GCC: (GNU) 4.2.2"
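The reordering hoped for can be sketched by hand in C. This is only an illustration of the transformation that "restrict" should permit, not output from GCC: the name foo_hoisted, the const qualifiers, and the length parameter n are hypothetical and not part of the original test case. Because a, b and c are declared restrict, the store to a[i] cannot change b[i + 1] or c[i + 1], so the loads for the next element may legally be issued before the store for the current one.

```c
#include <stddef.h>

/* Hypothetical hand-scheduled variant of foo(): the loads for element
   i + 1 are issued before the store for element i. This is valid only
   because the restrict qualifiers promise that a does not overlap b or c. */
void foo_hoisted(int *restrict a, const int *restrict b,
                 const int *restrict c, size_t n)
{
    if (n == 0)
        return;

    int sum = b[0] + c[0];                  /* loads for the first element */
    for (size_t i = 0; i + 1 < n; i++) {
        int next = b[i + 1] + c[i + 1];     /* next loads, above the store */
        a[i] = sum;                         /* store for the current element */
        sum = next;
    }
    a[n - 1] = sum;                         /* store for the last element */
}
```

Called with three non-overlapping arrays, this computes the same result as the original loop. Without restrict, a compiler could make this change only if it could prove by other means that the store does not alias the later loads.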
I examined the tree-SSA dump files that were produced.
In 004t.gimple, the restrict keyword is preserved:
foo (a, b, c)
{
unsigned int D.1352;
int * D.1353;
int * D.1354;
int * D.1355;
int D.1356;
int * D.1357;
int D.1358;
int D.1359;
unsigned int i;
i = 0;
goto <D1350>;
<D1349>:;
D.1352 = i * 4;
D.1353 = (int * restrict) D.1352;
D.1354 = D.1353 + a;
D.1352 = i * 4;
D.1353 = (int * restrict) D.1352;
D.1355 = D.1353 + b;
D.1356 = *D.1355;
D.1352 = i * 4;
D.1353 = (int * restrict) D.1352;
D.1357 = D.1353 + c;
D.1358 = *D.1357;
D.1359 = D.1356 + D.1358;
*D.1354 = D.1359;
i = i + 1;
<D1350>:;
if (i <= 255)
{
goto <D1349>;
}
else
{
goto <D1351>;
}
<D1351>:;
}
But in the .final_cleanup file, the restrict keyword simply disappears:
foo (a, b, c)
{
long unsigned int ivtmp.49;
<bb 2>:
MEM[base: a] = MEM[base: c] + MEM[base: b];
ivtmp.49 = 4;
<L0>:;
MEM[base: a, index: ivtmp.49] = MEM[base: c, index: ivtmp.49] +
MEM[base: b, index: ivtmp.49];
ivtmp.49 = ivtmp.49 + 4;
if (ivtmp.49 != 1024) goto <L0>; else goto <L2>;
<L2>:;
return;
}
Any hints on how to produce efficient code with the "restrict" keyword?
Thanks in advance.
Cheers,
Bingfeng Mei
Broadcom UK