This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug tree-optimization/54073] [4.7/4.8 Regression] SciMark Monte Carlo test performance has seriously decreased in recent GCC releases


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54073

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-11-13 13:04:28 UTC ---
Created attachment 28674
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28674
gcc48-pr54073.patch

On x86_64-linux on a SandyBridge CPU with -O3 -march=corei7-avx, I've tracked it
down to the
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=171341
change, in particular the emit_conditional_move part of that change.
Before the change, emit_conditional_move completely ignored the predicate
on the comparison operand (operands[1]); starting with r171341 it honors it.
As a result, movsicc's ordered_comparison_operator predicate rejects the UNLT
comparison in the MonteCarlo testcase, even though ix86_expand_int_movcc expands
it just fine, and at least on that loop it is beneficial to use
        vucomisd        %xmm0, %xmm1
        cmovae  %eax, %ebp
instead of:
.L4:
        addl    $1, %ebx
...
        vucomisd        %xmm0, %xmm2
        jb      .L4
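
For readers without the benchmark at hand, here is a minimal C sketch of the
kind of loop in question. It is not the SciMark source: the function names and
the array-based inputs are hypothetical stand-ins (the real kernel draws its
samples from a random number generator). Both functions compute the same
count; the second merely spells the update out as a select, which is the shape
a conditional move implements. Whether a given GCC version emits a branch or a
cmov for either form depends on the version and flags, so treat this as an
illustration only.

```c
#include <stddef.h>

/* Hypothetical stand-in for the MonteCarlo kernel: counts the points
   (x[i], y[i]) that fall inside the unit circle.  Written with an
   explicit branch around the increment. */
int count_inside_branch(const double *x, const double *y, size_t n)
{
    int under_curve = 0;
    for (size_t i = 0; i < n; i++) {
        if (x[i] * x[i] + y[i] * y[i] <= 1.0)
            under_curve++;
    }
    return under_curve;
}

/* Same computation written as a select: the incremented value is always
   computed, and the floating-point comparison only chooses which integer
   to keep.  This is the pattern an integer conditional move can express. */
int count_inside_select(const double *x, const double *y, size_t n)
{
    int under_curve = 0;
    for (size_t i = 0; i < n; i++) {
        int next = under_curve + 1;
        under_curve = (x[i] * x[i] + y[i] * y[i] <= 1.0) ? next : under_curve;
    }
    return under_curve;
}
```

Note that a floating-point test such as s <= 1.0 is false when s is NaN, so the
reversed condition has to be true for NaN. That is what RTL codes such as UNLT
(unordered or less than) express, and why they show up when the compiler
inverts a comparison like this one.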

The attached proof-of-concept patch just attempts to restore the 4.6 and
earlier behavior by allowing all comparisons.  It may well need better tuning
than that; I meant it just as a starting point for discussion.

vanilla trunk:

**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to pozo@nist.gov)     **
**                                                              **
Using       2.00 seconds min time per kenel.
Composite Score:         1886.79
FFT             Mflops:  1726.97    (N=1024)
SOR             Mflops:  1239.20    (100 x 100)
MonteCarlo:     Mflops:   374.13
Sparse matmult  Mflops:  1956.30    (N=1000, nz=5000)
LU              Mflops:  4137.37    (M=100, N=100)

patched trunk:

**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to pozo@nist.gov)     **
**                                                              **
Using       2.00 seconds min time per kenel.
Composite Score:         1910.08
FFT             Mflops:  1726.97    (N=1024)
SOR             Mflops:  1239.20    (100 x 100)
MonteCarlo:     Mflops:   528.94
Sparse matmult  Mflops:  1949.03    (N=1000, nz=5000)
LU              Mflops:  4106.27    (M=100, N=100)
