This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
[tree-ssa] performance with loops
- From: Steven Bosscher <s dot bosscher at student dot tudelft dot nl>
- To: dnovillo at redhat dot com, gcc at gcc dot gnu dot org
- Date: 04 Jul 2003 17:15:18 +0200
- Subject: [tree-ssa] performance with loops
Diego,
A C++ fluid dynamics code I am working with performs really badly when
compiled with tree-ssa compared to mainline. It takes about 3 times as
long with tree-ssa (that's the difference between running a simulation
overnight and having to wait a whole day without being able to use your
computer...).
I have tried to narrow down the code to a small test case, and I have
found an example that shows ~66% slowdown; the code I used for these
timings is attached.
Timings for tree-ssa:
real 0m6.852s 0m6.846s 0m6.882s
user 0m6.680s 0m6.690s 0m6.690s
sys 0m0.170s 0m0.160s 0m0.190s
Timings for mainline:
real 0m4.090s 0m4.086s 0m4.087s
user 0m3.910s 0m3.880s 0m3.870s
sys 0m0.180s 0m0.200s 0m0.220s
Ratio: 1.67 1.68 1.68
(The machine is an Athlon XP2000 with 256 MB RAM.)
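For reference, the ratio row is just the tree-ssa wall-clock time divided by the mainline one. A minimal sketch of that arithmetic follows; the function names slowdown_ratio and slowdown_percent are made up for illustration and are not part of the attached test case.

```c
/* Hypothetical helpers, not part of the attached test case.  */

/* Ratio of the slow time to the fast time.  Assumes fast_s > 0.  */
double
slowdown_ratio (double slow_s, double fast_s)
{
  return slow_s / fast_s;
}

/* The same comparison expressed as a percentage slowdown.  */
double
slowdown_percent (double slow_s, double fast_s)
{
  return (slowdown_ratio (slow_s, fast_s) - 1.0) * 100.0;
}
```

With the first pair of real times, slowdown_ratio (6.852, 4.090) comes out at roughly 1.675, which is a slowdown of about 67%.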
Maybe something like this can also explain the slowdown in 183.equake
(which is the only SPECfp2000 benchmark that slows down with
tree-ssa)???
Gr.
Steven
---------------------------------------
#include <stdlib.h>
#define L 1000
#define W 200
#define H 200
float ***data1, ***data2, ***data3;
void __attribute__((noinline))
foo (void)
{
  int i, j, k;
  for (i = 0; i < L; i++)
    for (j = 0; j < W; j++)
      for (k = 0; k < H; k++)
        {
          data1[i][j][k] = data2[i][j][k] * data3[i][j][k];
        }
}
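int
main (void)
{
  float ***x;
  int i, j;
  x = (float ***) malloc (L*sizeof(float**));
  for (i = 0; i < L; i++)
    {
      x[i] = (float **) malloc (W*sizeof(float*));
      for (j = 0; j < W; j++)
        x[i][j] = (float *) malloc (H*sizeof(float));
    }
  /* All three operands alias the same (uninitialized) array. */
  data1 = data2 = data3 = x;
  /* Call foo repeatedly (9 times as written, since i starts at 1)
     to spread the overhead of the malloc. */
  for (i = 1; i < 10; i++)
    foo ();
  /* Only the top-level array is freed; the process exits right after. */
  free (data1);
  return 0;
}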
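The test case above multiplies uninitialized malloc memory and aliases all three operands to one array, which is fine for timing but makes the result impossible to check. A minimal sketch of a checkable variant of the same triple loop follows. Every name in it (xmalloc, alloc3d, free3d, multiply3d, sum3d) is hypothetical and not from the original post, and it assumes positive dimensions. It uses separate, initialized operands, so it is meant for verifying the kernel on small sizes, not as a replacement for the timed test case; it has not been timed against either compiler.

```c
#include <stdlib.h>

/* All names below are hypothetical; none appear in the original post.  */

/* malloc that aborts on failure, to keep the sketch short.  */
void *
xmalloc (size_t n)
{
  void *p = malloc (n);
  if (p == NULL)
    abort ();
  return p;
}

/* Allocate an l x w x h array as nested pointers, filled with VALUE.
   Assumes l, w and h are all positive.  */
float ***
alloc3d (int l, int w, int h, float value)
{
  int i, j, k;
  float ***a = xmalloc (l * sizeof *a);
  for (i = 0; i < l; i++)
    {
      a[i] = xmalloc (w * sizeof *a[i]);
      for (j = 0; j < w; j++)
        {
          a[i][j] = xmalloc (h * sizeof *a[i][j]);
          for (k = 0; k < h; k++)
            a[i][j][k] = value;
        }
    }
  return a;
}

/* Free every level of an array built by alloc3d.  */
void
free3d (float ***a, int l, int w)
{
  int i, j;
  for (i = 0; i < l; i++)
    {
      for (j = 0; j < w; j++)
        free (a[i][j]);
      free (a[i]);
    }
  free (a);
}

/* Same loop nest as foo, with the arrays and bounds as parameters.  */
void
multiply3d (float ***dst, float ***a, float ***b, int l, int w, int h)
{
  int i, j, k;
  for (i = 0; i < l; i++)
    for (j = 0; j < w; j++)
      for (k = 0; k < h; k++)
        dst[i][j][k] = a[i][j][k] * b[i][j][k];
}

/* Sum of all elements, usable as a simple checksum.  */
double
sum3d (float ***a, int l, int w, int h)
{
  int i, j, k;
  double sum = 0.0;
  for (i = 0; i < l; i++)
    for (j = 0; j < w; j++)
      for (k = 0; k < h; k++)
        sum += a[i][j][k];
  return sum;
}
```

A caller would build three arrays with alloc3d, run multiply3d on them, compare sum3d of the result against the expected value, and release everything with free3d. Comparing the checksum between a tree-ssa build and a mainline build is one way to confirm both compile the loop to the same result before comparing timings.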