This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: [tree-ssa] performance with loops

From: Daniel Berlin <dberlin at dberlin dot org>
To: Steven Bosscher <s dot bosscher at student dot tudelft dot nl>
Cc: Toon Moene <toon at moene dot indiv dot nluug dot nl>,dnovillo at redhat dot com,gcc at gcc dot gnu dot org
Date: Fri, 4 Jul 2003 17:41:47 -0400
Subject: Re: [tree-ssa] performance with loops
References: <1057331717.3640.33.camel@steven.lr-s.tudelft.nl> <3F05DBD0.20609@moene.indiv.nluug.nl> <1057350835.3653.8.camel@steven.lr-s.tudelft.nl>

On Friday, July 4, 2003, at 4:33 PM, Steven Bosscher wrote:

> On Fri 04-07-2003, at 21:56, Toon Moene wrote:
>> Steven Bosscher wrote:
>> Hmm, how about initializing the data you use (malloc just allocates the space) - without it you could run into NaNs which would distort the timing picture completely.
>
> Maybe so, but in this case this has nothing to do with it. Just look at the assembly output and you can see that the code is just really poor.
> But just to be sure, I replaced the line:
> 	  data1[i][j][k] = data2[i][j][k] * data3[i][j][k];
> with:
> 	  data1[i][j][k] = 0.0;
> and indeed, tree-ssa is still about 33% slower (2.97s avg. for mainline
> vs. 3.89s avg. for tree-ssa). Sorry!
> It may be interesting for people looking into this that with tree-ssa,
> - we create a bigger stack frame
> - with -fnew-ra performance is only 20% worse than mainline
> - PRE doesn't make a difference at all.


I'm about to fix that.
It'll eliminate the redundant address computations now.

Note that they aren't strictly redundant, because of the casts that actually end up appearing:

i.1_19 = (unsigned int)i_1;
<address calculation using i.1_19>
....
i.1_48 = (unsigned int)i_1;
<address calculation using i.1_48>

Note that the temporary i.1 is redefined between the two uses. PRE considers the two address calculations non-redundant, and strictly speaking they are: they use different definitions (i.1_19 and i.1_48), even though the definitions themselves compute the same value from i_1. This would be solved by value numbering replacing them all with i.1_19 or something.

However, because we have no value numbering, I worked around this by teaching PRE that two names are the same if they are defined by a copy of the same version.
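
To make the workaround concrete, here is a rough sketch in plain C of the kind of equivalence test described above. It is not the actual tree-ssa PRE code: the types ssa_name and cast_stmt and the functions find_def and operands_equivalent are all made up for illustration, and the toy assumes every recorded definition is a cast to the same type (unsigned int, as in the dump).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not GCC's data structures.
   An SSA name is a (variable, version) pair, e.g. i_1 or i.1_19. */
enum { VAR_I = 0, VAR_I_TMP = 1 };   /* "i" and the temporary "i.1" */

struct ssa_name { int var; int version; };

/* One statement of the form  lhs = (unsigned int) rhs;  */
struct cast_stmt {
    struct ssa_name lhs;   /* e.g. i.1_19 */
    struct ssa_name rhs;   /* e.g. i_1    */
};

static int same_name(struct ssa_name a, struct ssa_name b)
{
    return a.var == b.var && a.version == b.version;
}

/* Return the cast statement that defines NAME, or NULL if there is none. */
static const struct cast_stmt *
find_def(const struct cast_stmt *defs, size_t n, struct ssa_name name)
{
    assert(defs != NULL || n == 0);
    for (size_t k = 0; k < n; k++)
        if (same_name(defs[k].lhs, name))
            return &defs[k];
    return NULL;
}

/* Treat two operands as the same if they are literally the same SSA name,
   or if both are defined by a cast of the same version of the same variable.
   This only approximates what real value numbering would give for free. */
static int
operands_equivalent(const struct cast_stmt *defs, size_t n,
                    struct ssa_name a, struct ssa_name b)
{
    if (same_name(a, b))
        return 1;
    const struct cast_stmt *da = find_def(defs, n, a);
    const struct cast_stmt *db = find_def(defs, n, b);
    return da != NULL && db != NULL && same_name(da->rhs, db->rhs);
}
```

With i.1_19 and i.1_48 both recorded as casts of i_1, operands_equivalent reports them as the same, so the two address calculations would count as one expression. A temporary cast from a different version of i (say i_2) would still be kept distinct.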

