This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: [tree-ssa] performance with loops
On Fri, 04-07-2003 at 21:56, Toon Moene wrote:
> Steven Bosscher wrote:
> Hmm, how about initializing the data you use (malloc just allocates the
> space) - without it you could run into NaNs which would distort the
> timing picture completely.
Maybe so, but that is not the problem in this case. Just look at the
assembly output and you can see that the generated code is really poor.
But just to be sure, I replaced the line:

    data1[i][j][k] = data2[i][j][k] * data3[i][j][k];

with:

    data1[i][j][k] = 0.0;

and indeed, tree-ssa is still about 31% slower (2.97s avg. for mainline
vs. 3.89s avg. for tree-ssa). Sorry!
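
For anyone who wants to reproduce this kind of comparison, a minimal sketch of such a test might look like the following. This is not the original test program, which is not shown in this message. The flat-array layout, the helper names (alloc_cube, multiply, time_kernel, IDX) and the fill values are assumptions made for illustration only. The data is initialized explicitly, so uninitialized values cannot affect the timing.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Flatten (i, j, k) into an index of a contiguous n*n*n array.
   ASSUMPTION: the original test used data1[i][j][k]; a flat array is
   used here only to keep the sketch short. */
#define IDX(n, i, j, k) ((((i) * (n)) + (j)) * (n) + (k))

/* Allocate an n*n*n cube and set every element to `value`, so that no
   uninitialized (possibly NaN) data can influence the timing. */
static double *alloc_cube(size_t n, double value)
{
    assert(n > 0);
    double *cube = malloc(n * n * n * sizeof *cube);
    if (cube != NULL)
        for (size_t idx = 0; idx < n * n * n; idx++)
            cube[idx] = value;
    return cube;
}

/* The kernel being timed: an element-wise product in a triple loop. */
static void multiply(double *data1, const double *data2,
                     const double *data3, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                data1[IDX(n, i, j, k)] =
                    data2[IDX(n, i, j, k)] * data3[IDX(n, i, j, k)];
}

/* Run the kernel `reps` times on n*n*n cubes and return the CPU time in
   seconds, or -1.0 if allocation fails. */
static double time_kernel(size_t n, int reps)
{
    double *data1 = alloc_cube(n, 0.0);
    double *data2 = alloc_cube(n, 2.0);
    double *data3 = alloc_cube(n, 3.0);
    double seconds = -1.0;

    if (data1 != NULL && data2 != NULL && data3 != NULL) {
        clock_t start = clock();
        for (int r = 0; r < reps; r++)
            multiply(data1, data2, data3, n);
        seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
        /* Print one result so the computed values are actually used. */
        fprintf(stderr, "check: %g\n", data1[IDX(n, n - 1, n - 1, n - 1)]);
    }
    free(data1);
    free(data2);
    free(data3);
    return seconds;
}
```

To use it, add a main() that calls time_kernel with a problem size and repeat count of your choice, build the same file once with each compiler at the same optimization level, and average several runs.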
It may be interesting for people looking into this that, with tree-ssa:
- we create a bigger stack frame;
- with -fnew-ra, performance is only 20% worse than mainline;
- PRE doesn't make a difference at all;
- there are surprisingly many temporaries in the tree dumps.
> BTW, what's wrong with:
>
> PROGRAM TEST
> REAL, ALLOCATABLE :: A(:,:,:), B(:,:,:), C(:,:,:)
> READ*,L,M,N
> ALLOCATE(A(L,M,N),B(L,M,N),C(L,M,N))
> A=1.0;B=2.0;C=A+B
> PRINT*,SUM(C)
> END PROGRAM TEST
Not everyone has Fortran 9x? ;-)
Regards,
Steven