This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



[tree-ssa] Write to global memory not hoisted out of loop


Hi!

Writes to global memory are not moved out of a computation loop.  For example,

double tmp;
void foo(double * __restrict__ x, double * __restrict__ y,
         double * __restrict__ z, int n)
{
  int i;
  for (i=0; i<n; ++i) {
    tmp = y[i]*z[i];
    x[i] = tmp;
  }
}

is compiled to (-O2 -fno-trapping-math):

...
.L4:
        leal    0(,%edx,8), %eax        #, tmp66
        incl    %edx    # i
        fldl    (%eax,%ebx)     #* z
        cmpl    %ecx, %edx      # n, i
        fmull   (%eax,%esi)     #* y
        fstl    tmp     # tmp
        fstpl   (%eax,%edi)     #* x
        jl      .L4     #,
...

Note that the load from tmp is optimized away, but the store is still inside
the loop.  With trapping math operations the store would be necessary to
make side effects visible, but I thought -fno-trapping-math would get rid
of it.  Making tmp static doesn't help either, although in that case
removing the store from the loop would be valid even with trapping math.
Turning on points-to analysis doesn't help either.  The Intel compiler moves
the store in any case (which seems to be a bug).
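
To make the expected transformation concrete, here is a hand-written sketch of
what I would like the optimizer to produce.  The name foo_sunk and the local t
are made up for illustration; this is not actual compiler output.  It assumes
that none of the arrays overlaps tmp (which the __restrict__ qualifiers
suggest) and that nothing else observes tmp while the loop runs.

```c
double tmp;  /* global, as in the example above */

/* Original loop: tmp is stored on every iteration. */
void foo(double * __restrict__ x, double * __restrict__ y,
         double * __restrict__ z, int n)
{
  int i;
  for (i = 0; i < n; ++i) {
    tmp = y[i] * z[i];
    x[i] = tmp;
  }
}

/* Hand-written variant (hypothetical): the product is kept in a local,
   and tmp is stored once after the loop. */
void foo_sunk(double * __restrict__ x, double * __restrict__ y,
              double * __restrict__ z, int n)
{
  double t = 0.0;
  int i;
  for (i = 0; i < n; ++i) {
    t = y[i] * z[i];
    x[i] = t;
  }
  /* The original never writes tmp when the loop body does not run,
     so the single store has to be guarded. */
  if (n > 0)
    tmp = t;
}
```

The guard on n matters: an unconditional store after the loop would change
the value of tmp for n <= 0, which the original code leaves untouched.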

Is there already a PR about this pessimization?  (I couldn't find one.)

Richard.

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

