This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
SPEC / testsuite results for disabling SFTs and the alias-oracle patches
- From: Richard Guenther <rguenther at suse dot de>
- To: gcc at gcc dot gnu dot org
- Date: Wed, 5 Mar 2008 12:48:57 +0100 (CET)
- Subject: SPEC / testsuite results for disabling SFTs and the alias-oracle patches
Here are SPEC CPU 2000 results with plain trunk and the two alias-oracle
patches. Base results are plain -O3 -ffast-math, peak results include
--param max-fields-for-field-sensitive=0 which effectively disables the
creation of SFTs.
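For reading the tables in this mail: under the usual SPEC CPU2000 convention, each Ratio is 100 times the reference time divided by the run time, and each "Est. SPEC..." summary line is the geometric mean of the per-benchmark ratios. The following is a minimal Python sketch of that arithmetic, using the unpatched base SPECint ratios reported here. The function names are illustrative only and are not part of any SPEC tool.

```python
import math


def spec_ratio(ref_time, run_time):
    """SPEC CPU2000-style ratio: 100 * reference time / measured run time."""
    return 100.0 * ref_time / run_time


def geometric_mean(values):
    """Geometric mean, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Unpatched base ratios from the SPECint table in this mail.
# 252.eon and 255.vortex are marked "X" there and are left out.
unpatched_int_base = [1400, 1742, 2288, 1371, 2635, 1348, 2541, 1921, 1892, 2635]

if __name__ == "__main__":
    print("164.gzip ratio:", round(spec_ratio(1400, 100)))
    print("geometric mean of base ratios:", round(geometric_mean(unpatched_int_base)))
```

Because the printed run times are rounded, recomputing a single ratio from the table may differ from the listed ratio by a point or two.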
Unpatched (three runs):
                                Estimated                     Estimated
                  Base      Base    Base        Peak      Peak    Peak
Benchmarks    Ref Time  Run Time   Ratio    Ref Time  Run Time   Ratio
========================================================================
164.gzip          1400       100    1400 *      1400      99.9    1402 *
175.vpr           1400      80.3    1742 *      1400      80.5    1738 *
176.gcc           1100      48.1    2288 *      1100      46.8    2353 *
181.mcf           1800       131    1371 *      1800       131    1370 *
186.crafty        1000      38.0    2635 *      1000      36.6    2732 *
197.parser        1800       134    1348 *      1800       133    1353 *
252.eon                                X                             X
253.perlbmk       1800      70.8    2541 *      1800      70.4    2557 *
254.gap           1100      57.3    1921 *      1100      57.1    1925 *
255.vortex                             X                             X
256.bzip2         1500      79.3    1892 *      1500      79.9    1877 *
300.twolf         3000       114    2635 *      3000       114    2633 *
Est. SPECint_base2000               1914
Est. SPECint2000                                                  1927

                                Estimated                     Estimated
                  Base      Base    Base        Peak      Peak    Peak
Benchmarks    Ref Time  Run Time   Ratio    Ref Time  Run Time   Ratio
========================================================================
168.wupwise       1600      79.9    2002 *      1600      80.0    2000 *
171.swim          3100       155    1999 *      3100       155    1999 *
172.mgrid         1800      98.6    1825 *      1800      98.4    1829 *
173.applu         2100       178    1178 *      2100       178    1181 *
177.mesa          1400      57.8    2421 *      1400      58.1    2411 *
178.galgel        2900      69.0    4204 *      2900      69.0    4203 *
179.art           2600      34.7    7482 *      2600      34.1    7617 *
183.equake        1300      74.1    1755 *      1300      74.0    1757 *
187.facerec       1900      75.3    2523 *      1900      75.3    2522 *
188.ammp          2200       119    1845 *      2200       119    1843 *
189.lucas         2000       119    1688 *      2000       118    1697 *
191.fma3d         2100       132    1590 *      2100       131    1598 *
200.sixtrack      1100       120     919 *      1100       120     918 *
301.apsi          2600       171    1518 *      2600       172    1509 *
Est. SPECfp_base2000                2029
Est. SPECfp2000                                                   2032
Patched (three runs):
                                Estimated                     Estimated
                  Base      Base    Base        Peak      Peak    Peak
Benchmarks    Ref Time  Run Time   Ratio    Ref Time  Run Time   Ratio
========================================================================
164.gzip          1400       100    1400 *      1400      99.9    1401 *
175.vpr           1400      80.0    1751 *      1400      80.1    1749 *
176.gcc           1100      47.4    2319 *      1100      46.8    2352 *
181.mcf           1800       133    1358 *      1800       133    1349 *
186.crafty        1000      37.6    2656 *      1000      36.8    2718 *
197.parser        1800       133    1350 *      1800       133    1349 *
252.eon                                X                             X
253.perlbmk       1800      70.4    2557 *      1800      70.0    2573 *
254.gap           1100      57.3    1918 *      1100      57.4    1918 *
255.vortex                             X                             X
256.bzip2         1500      79.9    1877 *      1500      80.6    1862 *
300.twolf         3000       114    2641 *      3000       114    2638 *
Est. SPECint_base2000               1918
Est. SPECint2000                                                  1923

                                Estimated                     Estimated
                  Base      Base    Base        Peak      Peak    Peak
Benchmarks    Ref Time  Run Time   Ratio    Ref Time  Run Time   Ratio
========================================================================
168.wupwise       1600      80.2    1995 *      1600      80.1    1998 *
171.swim          3100       156    1993 *      3100       155    1994 *
172.mgrid         1800      98.7    1824 *      1800      98.6    1826 *
173.applu         2100       178    1178 *      2100       178    1178 *
177.mesa          1400      57.8    2422 *      1400      57.9    2417 *
178.galgel        2900      69.3    4188 *      2900      69.2    4191 *
179.art           2600      36.8    7063 *      2600      33.5    7762 *
183.equake        1300      74.0    1756 *      1300      74.1    1754 *
187.facerec       1900      76.0    2500 *      1900      74.0    2569 *
188.ammp          2200       119    1846 *      2200       119    1845 *
189.lucas         2000       117    1706 *      2000       117    1703 *
191.fma3d         2100       130    1612 *      2100       129    1633 *
200.sixtrack      1100       120     920 *      1100       119     921 *
301.apsi          2600       173    1505 *      2600       174    1498 *
Est. SPECfp_base2000                2020
Est. SPECfp2000                                                   2039
You can see that in both cases the runs without SFTs are better(!)
overall (SPECint 1914 to 1927 and SPECfp 2029 to 2032 unpatched,
SPECint 1918 to 1923 and SPECfp 2020 to 2039 patched). This hints that
we do a poor job with partitioning and/or that partitioning triggers
earlier with SFTs enabled.
The oracle patches are able to slightly improve the results in the non-SFT
case, but overall the difference between patched and unpatched is smaller
than the difference that results from disabling SFTs.
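To put numbers on that comparison, here is a small Python sketch that computes the relative change between the base (SFTs enabled) and peak (SFTs disabled) summary figures quoted in the tables above. The helper names and the dictionary layout are illustrative assumptions, not anything used to produce the original runs.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# (base, peak) summary figures from this mail:
# base = SFTs enabled, peak = SFTs disabled.
summaries = {
    ("SPECint", "unpatched"): (1914, 1927),
    ("SPECint", "patched"): (1918, 1923),
    ("SPECfp", "unpatched"): (2029, 2032),
    ("SPECfp", "patched"): (2020, 2039),
}


def sft_effect(table):
    """Percent change from disabling SFTs, per (suite, build) pair."""
    return {key: pct_change(base, peak) for key, (base, peak) in table.items()}


if __name__ == "__main__":
    for (suite, build), delta in sft_effect(summaries).items():
        print(f"{suite:8s} {build:10s} {delta:+.2f}%")
```

The same helper can be pointed at individual benchmark ratios (for example 179.art) to see where most of the movement comes from.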
If you compare the test results with SFTs disabled, unpatched vs. patched,
you can see that the oracle patches retain optimizations that were
previously only possible with SFTs (uninteresting parts snipped; the full
testsuite for all default languages was run; -m32 results are shown only
where they differ from the -m64 results):
Unpatched, SFTs disabled:
=== g++ tests ===
Running target unix/
FAIL: g++.dg/torture/pr34850.C -O0 (test for warnings, line 14)
FAIL: g++.dg/torture/pr34850.C -O1 (test for warnings, line 14)
FAIL: g++.dg/torture/pr34850.C -O2 (test for warnings, line 14)
FAIL: g++.dg/torture/pr34850.C -O3 -fomit-frame-pointer (test for warnings, line 14)
FAIL: g++.dg/torture/pr34850.C -O3 -g (test for warnings, line 14)
FAIL: g++.dg/torture/pr34850.C -Os (test for warnings, line 14)
=== g++ Summary for unix/ ===
# of expected passes 17440
# of unexpected failures 6
# of expected failures 82
# of unsupported tests 119
=== gcc tests ===
Running target unix/
FAIL: gcc.dg/tree-ssa/alias-10.c scan-tree-dump optimized "return 3;"
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump salias "SFT.5 created for var m offset 128"
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump-times salias "VUSE <SFT.5_" 2
FAIL: gcc.dg/tree-ssa/alias-3.c scan-tree-dump optimized "return 1;"
FAIL: gcc.dg/tree-ssa/alias-4.c scan-tree-dump optimized "return 1;"
FAIL: gcc.dg/tree-ssa/alias-5.c scan-tree-dump optimized "return 1;"
FAIL: gcc.dg/tree-ssa/ldist-4.c scan-tree-dump-times ldist "distributed: split to 2 loops" 0
FAIL: gcc.dg/tree-ssa/loadpre8.c scan-tree-dump-times pre "Eliminated: 1" 1
FAIL: gcc.dg/tree-ssa/pr26421.c scan-tree-dump-times salias "VDEF" 4
FAIL: gcc.dg/tree-ssa/salias-1.c scan-tree-dump-times salias "structure field tag SFT" 2
FAIL: gcc.dg/tree-ssa/structopt-1.c scan-tree-dump-times lim "Executing store motion of global.y" 1
FAIL: gcc.dg/tree-ssa/structopt-2.c scan-tree-dump-times optimized "a.e" 0
FAIL: gcc.dg/tree-ssa/structopt-2.c scan-tree-dump-times optimized "a.f" 0
FAIL: gcc.dg/tree-ssa/structopt-2.c scan-tree-dump-times optimized "a.g" 0
FAIL: gcc.dg/tree-ssa/structopt-3.c scan-tree-dump-times optimized "return 11" 1
=== gcc Summary ===
# of expected passes 97489
# of unexpected failures 41
# of expected failures 335
# of untested testcases 70
# of unsupported tests 839
/space/rguenther/obj/gcc/xgcc version 4.4.0 20080304 (experimental) (GCC)
Patched results:
=== g++ tests ===
Running target unix/
FAIL: g++.dg/tree-ssa/pr34355.C (test for excess errors)
=== g++ Summary for unix/ ===
# of expected passes 17445
# of unexpected failures 1
# of expected failures 82
# of unsupported tests 119
=== gcc tests ===
Running target unix/
FAIL: gcc.dg/autopar/parallelization-1.c (internal compiler error)
FAIL: gcc.dg/autopar/parallelization-1.c (test for excess errors)
FAIL: gcc.dg/autopar/parallelization-1.c scan-tree-dump-times final_cleanup "loopfn" 5
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump salias "SFT.5 created for var m offset 128"
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump-times salias "VUSE <SFT.5_" 2
FAIL: gcc.dg/tree-ssa/ldist-4.c scan-tree-dump-times ldist "distributed: split to 2 loops" 0
FAIL: gcc.dg/tree-ssa/loop-32.c scan-tree-dump-times lim "Executing store motion of" 7
FAIL: gcc.dg/tree-ssa/pr26421.c scan-tree-dump-times salias "VDEF" 4
FAIL: gcc.dg/tree-ssa/salias-1.c scan-tree-dump-times salias "structure field tag SFT" 2
=== gcc Summary for unix/ ===
# of expected passes 48691
# of unexpected failures 15
# of expected failures 166
# of untested testcases 35
# of unsupported tests 478
Running target unix//-m32
FAIL: gcc.dg/autopar/parallelization-1.c (internal compiler error)
FAIL: gcc.dg/autopar/parallelization-1.c (test for excess errors)
FAIL: gcc.dg/autopar/parallelization-1.c scan-tree-dump-times final_cleanup "loopfn" 5
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump salias "SFT.5 created for var m offset 128"
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump-times salias "VUSE <SFT.5_" 2
FAIL: gcc.dg/tree-ssa/pr26421.c scan-tree-dump-times salias "VDEF" 4
FAIL: gcc.dg/tree-ssa/salias-1.c scan-tree-dump-times salias "structure field tag SFT" 2
=== gcc Summary for unix//-m32 ===
# of expected passes 48839
# of unexpected failures 13
# of expected failures 167
# of untested testcases 35
# of unsupported tests 361
Some of the failures with SFTs disabled occur because the testcases
scan for SFTs in the dumps, which are obviously not present. Those
tests need to be disabled or adjusted to test the optimization outcome
instead.
Thus, given the above results, I propose that we disable generating SFTs
by default on the mainline (--param max-fields-for-field-sensitive=100
is still available for comparison). I will prepare a patch to adjust
the false negative testcases above to check for the optimization outcome
as well.
Richard.