Bug 53397 - Scimark performance drops by 10x when compiled with -O3 -march=amdfam10 due to generation of more prefetches
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.7.1
Importance: P3 major
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: 79703
Reported: 2012-05-18 12:02 UTC by Venkataramanan
Modified: 2017-02-24 09:29 UTC

See Also:
Host: x86_64-unknown-linux-gnu
Target: x86_64-unknown-linux-gnu
Build: x86_64-unknown-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2012-05-18 00:00:00


Description Venkataramanan 2012-05-18 12:02:13 UTC
With GCC 4.7 the benchmark score drops from ~400 Mflops to ~40 Mflops, almost a 10-fold slowdown.

GCC 4.7 introduces prefetch instructions in the innermost loops of "FFT_transform_internal" (FFT.c); GCC 4.6 does not. These prefetches are causing the slowdown.

Compiling this function alone as a separate test case with -fno-prefetch-loop-arrays brings back the original score.
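
For anyone who wants to poke at this outside the benchmark, here is a minimal sketch of a loop with the same general shape as a radix-2 butterfly pass. It is not the SciMark2 source: the function name butterfly_pass, its parameters, and the file name butterfly.c are made up for illustration, and the twiddle factors are left out. Whether prefetches are actually emitted for it depends on the compiler revision, so treat it only as a starting point for comparing assembly output with and without -fno-prefetch-loop-arrays.

```c
#include <stddef.h>

/*
 * Hypothetical butterfly pass over interleaved complex data
 * (re, im, re, im, ...). "n" is the number of complex points, so
 * "data" holds 2*n doubles. Twiddle factors are omitted (w = 1) to
 * keep the sketch short.
 *
 * The inner loop strides through the array by 2*dual complex
 * elements, which is the kind of access pattern the loop prefetching
 * pass looks at.
 *
 * To compare the generated code (butterfly.c is an assumed file name):
 *   gcc -O3 -march=amdfam10 -S butterfly.c -o with.s
 *   gcc -O3 -march=amdfam10 -fno-prefetch-loop-arrays -S butterfly.c -o without.s
 *   grep -c prefetch with.s without.s
 */
void butterfly_pass(double *data, size_t n, size_t dual)
{
    for (size_t a = 0; a < dual; a++) {
        for (size_t b = 0; b < n; b += 2 * dual) {
            size_t i = 2 * (b + a);         /* upper element of the pair */
            size_t j = 2 * (b + a + dual);  /* lower element of the pair */
            double re = data[j];
            double im = data[j + 1];

            /* lower = upper - lower, upper = upper + lower */
            data[j]     = data[i]     - re;
            data[j + 1] = data[i + 1] - im;
            data[i]     += re;
            data[i + 1] += im;
        }
    }
}
```

The function is deliberately free of I/O and a main() so that it can be compiled with -S or -c on its own, the same way FFT.c was isolated for the test case above.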

The problem is exposed by r175474: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175474

With GCC r175473
--------------------------
gcc -O3 -march=amdfam10 *.c -o Scimark175473 -lm
vekumar@pcedinar5:/local/home/vekumar/SciMark2_bench/SciMark2> ./Scimark175473
**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to pozo@nist.gov)     **
**                                                              **
Using       2.00 seconds min time per kenel.
Composite Score:           99.67
FFT             Mflops:   498.35    (N=1024)

With GCC r175474
-------------------------
gcc -O3 -march=amdfam10 *.c -o Scimark175474 -lm
vekumar@pcedinar5:/local/home/vekumar/SciMark2_bench/SciMark2> ./Scimark175474
**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to pozo@nist.gov)     **
**                                                              **
Using       2.00 seconds min time per kenel.
Composite Score:            7.73
FFT             Mflops:    38.66    (N=1024)
Comment 1 Richard Biener 2012-05-18 12:11:07 UTC
Confirmed.
Comment 2 Venkataramanan 2012-10-09 15:55:04 UTC
Fixed.
http://gcc.gnu.org/viewcvs?view=revision&revision=192261