This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
[tree-ssa] Write to global memory not hoisted out of loop
- From: Richard Guenther <rguenth at tat dot physik dot uni-tuebingen dot de>
- To: gcc at gcc dot gnu dot org
- Date: Fri, 30 Apr 2004 13:14:30 +0200 (CEST)
- Subject: [tree-ssa] Write to global memory not hoisted out of loop
Hi!
Writes to global memory are not hoisted out of a computation loop. For
example:
double tmp;

void foo(double * __restrict__ x, double * __restrict__ y,
         double * __restrict__ z, int n)
{
  int i;
  for (i=0; i<n; ++i) {
    tmp = y[i]*z[i];
    x[i] = tmp;
  }
}
is compiled to (-O2 -fno-trapping-math):
...
.L4:
        leal    0(,%edx,8), %eax        #, tmp66
        incl    %edx                    # i
        fldl    (%eax,%ebx)             #* z
        cmpl    %ecx, %edx              # n, i
        fmull   (%eax,%esi)             #* y
        fstl    tmp                     # tmp
        fstpl   (%eax,%edi)             #* x
        jl      .L4                     #,
...
Note that the load from tmp has been optimized away, but the store
(fstl tmp) is still inside the loop. The store would be necessary with
trapping math, to keep side effects visible if an operation traps, but I
thought -fno-trapping-math would get rid of it. Making tmp static doesn't
help either, even though in that case removing the store would be valid
even with trapping math. Turning on points-to analysis doesn't help
either. The Intel compiler moves the store out of the loop in any case
(which seems to be a bug).
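To make the expected transformation concrete, here is a hand-written
sketch of roughly what moving the store out of the loop would amount to
at the source level. It is not compiler output. The names foo_sunk and t
are made up for illustration, and the n > 0 guard is an assumption added
so that tmp stays untouched when the loop body never runs, as in the
original.

```c
double tmp;

/* Original form: the global tmp is written on every iteration. */
void foo(double * __restrict__ x, double * __restrict__ y,
         double * __restrict__ z, int n)
{
  int i;
  for (i = 0; i < n; ++i) {
    tmp = y[i] * z[i];
    x[i] = tmp;
  }
}

/* Hypothetical hand-transformed form: the product is kept in a local
   and tmp is written once, after the loop. */
void foo_sunk(double * __restrict__ x, double * __restrict__ y,
              double * __restrict__ z, int n)
{
  double t = 0.0;
  int i;
  for (i = 0; i < n; ++i) {
    t = y[i] * z[i];
    x[i] = t;
  }
  if (n > 0)   /* assumption: leave tmp alone if the loop did not execute */
    tmp = t;
}
```

Both versions leave the same values in x and tmp after the call; they
differ only in what an observer of tmp could see while the loop is
still running, for instance from a trap handler.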
Is there already a PR about this pessimization? (I couldn't find one.)
Richard.
--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/