[Bug tree-optimization/17863] [4.0/4.1 Regression] threefold performance loss, not inlining as much

danalis at cis dot udel dot edu gcc-bugzilla@gcc.gnu.org
Thu Jun 30 22:16:00 GMT 2005


------- Additional Comments From danalis at cis dot udel dot edu  2005-06-30 22:16 -------
I'm looking at the reduced testcase from comment #6,
and I noticed that f() is declared to return double, but does not return anything.
Thus the code doesn't compile with -O3 -Wall -Werror.
If I fix the bug by adding a "return(*ap1)",
or by declaring f() to be void, the performance regression disappears.
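
To make the two fixes concrete, here is a minimal sketch. It is not the f() from comment #6: the function names, the reduced argument lists and the bodies are hypothetical stand-ins, chosen only to show the shape of the defect and of each fix. The broken form is left in a comment, because flowing off the end of a non-void function is undefined behavior in C++ (and is what -Wall reports through -Wreturn-type).

```cpp
// Hypothetical stand-ins for the reduced testcase; the real f() from
// comment #6 takes more arguments and has a different body.

// Defective shape, kept as a comment on purpose:
//
//   double f_broken(const double *ap1) { double t = *ap1 * 2.0; }
//
// The function promises a double but never returns one.

// Fix 1: actually return a value.
double f_returns_value(const double *ap1) {
    return *ap1;
}

// Fix 2: declare the function void and hand results back through a pointer.
void f_void(const double *py, double *ap1) {
    *ap1 = *py * 2.0;
}
```

With either shape the function body is well defined, so the compiler no longer has to deal with a function that falls off its end.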

Here's the test harness I used to call the minimized testcase:

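#include <iostream>
using namespace std;

// f() is the reduced testcase from comment #6; it must be defined above main().
int main(int argc, char *argv[]){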
    double ay[100][100];
    const double *py, *pz;
    double *dxb, *ap1;
    double sum=0;
    int i,j,k;

    for(i=0; i<100; i++){
        for(j=0; j<100; j++){
            ay[i][j] = 1000*(i+1)+2*(j+1);
        }
    }
    py  = ay[0];
    pz  = ay[1];
    dxb = ay[2];
    ap1 = ay[3];

    for(k=0; k<100; k++){
        for(i=0; i<10000; i++){
            for(j=0; j<12; j++){
                sum += f(py,pz,dxb,ap1,j,5);
                sum /= 2;
            }
        }
    }
    cout << sum << endl;
    return 0;
}

Is that OK? I compiled this with -O3 -mtune=pentium.

Runtimes *without* the fix to f() were
0.31s, 8.72s, 8.83s and 8.80s when compiled with g++
2.95.3, 3.4.3, 4.0.0 and 4.1.0-20050625, respectively
(making this a large performance regression relative to gcc-2.95.3).
Runtimes *with* the fix were
0.34s, 0.28s, 0.36s, 0.32s when compiled with g++
2.95.3, 3.4.3, 4.0.0 and 4.1.0-20050625, respectively.
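
For anyone who wants to repeat the measurement inside the program rather than with an external timer, a small helper along these lines could be wrapped around the triple loop. This is only a sketch: time_seconds is a hypothetical name, it needs a C++11 compiler (so it will not build with 2.95.3), and it is not how the numbers above were obtained.

```cpp
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename Work>
double time_seconds(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

The loop from the harness would be passed in as a lambda, and the returned value printed next to sum.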


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17863


