This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: [tree-ssa] performance with loops


On Fri 04-07-2003, at 21:56, Toon Moene wrote:
> Steven Bosscher wrote:
> Hmm, how about initializing the data you use (malloc just allocates the 
> space) - without it you could run into NaNs which would distort the 
> timing picture completely.

Maybe so, but in this case this has nothing to do with it.  Just look at
the assembly output and you can see that the code is just really poor.

But just to be sure, I replaced the line:
	  data1[i][j][k] = data2[i][j][k] * data3[i][j][k];
with:
	  data1[i][j][k] = 0.0;
and indeed, tree-ssa is still about 31% slower (2.97s avg. for mainline
vs. 3.89s avg. for tree-ssa). Sorry!
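
For anyone who wants to repeat this kind of measurement with initialized
data, here is a minimal sketch of such a test. It is not the original test
case: the flat array layout, the fill values 2.0 and 3.0, and the names
idx, alloc_filled, multiply and time_kernel are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical layout: one flat block per array, addressed as [i][j][k]. */
size_t idx(size_t i, size_t j, size_t k, size_t nj, size_t nk)
{
    return (i * nj + j) * nk + k;
}

/* Allocate n doubles and set every element to `value`, so the timed
   loop never reads uninitialized memory. Returns NULL on failure. */
double *alloc_filled(size_t n, double value)
{
    double *p = malloc(n * sizeof *p);
    if (p == NULL)
        return NULL;
    for (size_t x = 0; x < n; x++)
        p[x] = value;
    return p;
}

/* The loop nest under test: data1 = data2 * data3, element by element. */
void multiply(double *data1, const double *data2, const double *data3,
              size_t ni, size_t nj, size_t nk)
{
    for (size_t i = 0; i < ni; i++)
        for (size_t j = 0; j < nj; j++)
            for (size_t k = 0; k < nk; k++)
                data1[idx(i, j, k, nj, nk)] =
                    data2[idx(i, j, k, nj, nk)] * data3[idx(i, j, k, nj, nk)];
}

/* Run the loop nest `reps` times. Returns the CPU time in seconds, or
   -1.0 if allocation failed. On success, *sum receives the sum of data1,
   which also keeps the result observable. */
double time_kernel(size_t ni, size_t nj, size_t nk, int reps, double *sum)
{
    assert(ni > 0 && nj > 0 && nk > 0 && reps > 0);
    size_t n = ni * nj * nk;
    double *data1 = alloc_filled(n, 0.0);
    double *data2 = alloc_filled(n, 2.0);
    double *data3 = alloc_filled(n, 3.0);
    double seconds = -1.0;

    if (data1 && data2 && data3) {
        clock_t start = clock();
        for (int r = 0; r < reps; r++)
            multiply(data1, data2, data3, ni, nj, nk);
        seconds = (double)(clock() - start) / CLOCKS_PER_SEC;

        *sum = 0.0;
        for (size_t x = 0; x < n; x++)
            *sum += data1[x];
    }
    free(data1);
    free(data2);
    free(data3);
    return seconds;
}
```

Calling time_kernel with the same sizes under both compilers and comparing
the returned times gives the kind of ratio quoted above. Printing the sum
afterwards is a cheap check that both builds computed the same thing.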

It may be interesting for people looking into this that with tree-ssa:
- we create a bigger stack frame;
- with -fnew-ra, performance is only 20% worse than mainline;
- PRE doesn't make a difference at all;
- there are a great many temporaries in the tree dumps.

> BTW, what's wrong with:
> 
> PROGRAM TEST
> REAL, ALLOCATABLE :: A(:,:,:), B(:,:,:), C(:,:,:)
> READ*,L,M,N
> ALLOCATE(A(L,M,N),B(L,N,M),C(L,N,M))
> A=1.0;B=2.0;C=A+B
> PRINT*,SUM(C)
> END PROGRAM TEST

Not everyone has Fortran 9x?  ;-)
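
For those without a Fortran 9x compiler, a rough C counterpart of the
quoted program might look like the sketch below. The function name
sum_of_c is made up for illustration, and the sketch assumes all three
arrays hold the same l*m*n elements (the Fortran declares A as (L,M,N)
but B and C as (L,N,M), which only conform when M equals N). It also
accumulates the sum in double, whereas Fortran's SUM of a REAL array
would use single precision.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical C counterpart of the Fortran program: fill a with 1.0 and
   b with 2.0, compute c = a + b, and return the sum of c.
   Returns -1.0 if allocation fails. */
double sum_of_c(size_t l, size_t m, size_t n)
{
    assert(l > 0 && m > 0 && n > 0);
    size_t count = l * m * n;
    float *a = malloc(count * sizeof *a);
    float *b = malloc(count * sizeof *b);
    float *c = malloc(count * sizeof *c);
    double total = -1.0;

    if (a && b && c) {
        /* A=1.0; B=2.0; C=A+B */
        for (size_t x = 0; x < count; x++) {
            a[x] = 1.0f;
            b[x] = 2.0f;
            c[x] = a[x] + b[x];
        }
        /* SUM(C), accumulated in double (an assumption, see above). */
        total = 0.0;
        for (size_t x = 0; x < count; x++)
            total += c[x];
    }
    free(a);
    free(b);
    free(c);
    return total;
}
```

Unlike the Fortran version, this takes the sizes as arguments rather than
reading them from standard input, which keeps the sketch easy to call from
a timing harness.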

Regards,
Steven

