This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Performance comparison of gcc releases
- From: Ronny Peine <RonnyPeine at gmx dot de>
- To: gcc at gcc dot gnu dot org
- Date: Fri, 16 Dec 2005 01:31:30 +0100
- Subject: Performance comparison of gcc releases
Hi,
I forgot to post the best CFLAGS for each gcc version and benchmark.
Here are the results:
gcc-3.3.6:
nbench:
-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe
-fforce-addr -fsched-spec-load -fmove-all-movables -ffast-math -ftracer
-funroll-loops -funroll-all-loops -mfpmath=sse -momit-leaf-frame-pointer
freebench:
-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr
-fsched-spec-load -fmove-all-movables -freduce-all-givs -ftracer
-funroll-all-loops -fprefetch-loop-arrays -mfpmath=sse
-momit-leaf-frame-pointer
lamebench:
-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr
-fsched-spec-load -fmove-all-movables -freduce-all-givs -funroll-loops
-funroll-all-loops -mfpmath=sse -mfpmath=sse,387 -momit-leaf-frame-pointer
gcc-3.4.4:
nbench:
-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr
-fsched-spec-load -fsched2-use-superblocks -fsched2-use-superblocks
-fsched2-use-traces -fmove-all-movables -ffast-math -funroll-loops
-funroll-all-loops -fpeel-loops -fold-unroll-loops
-fbranch-target-load-optimize2 -mfpmath=sse -mfpmath=sse,387
freebench:
-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr
-fsched-spec-load -fsched2-use-superblocks -fsched2-use-superblocks
-fsched2-use-traces -freduce-all-givs -ffast-math -ftracer -funroll-loops
-funroll-all-loops -fpeel-loops -fold-unroll-loops -fold-unroll-all-loops
-fbranch-target-load-optimize -fbranch-target-load-optimize2 -mfpmath=sse
-mfpmath=sse,387 -momit-leaf-frame-pointer
lamebench:
-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr
-fsched-spec-load -fsched2-use-superblocks -fsched2-use-superblocks
-fsched2-use-traces -fmove-all-movables -freduce-all-givs -ftracer
-funroll-loops -funroll-all-loops -fpeel-loops -fold-unroll-loops
-fold-unroll-all-loops -fbranch-target-load-optimize
-fbranch-target-load-optimize2 -mfpmath=sse -mfpmath=sse,387
-momit-leaf-frame-pointer
gcc-4.0.2:
nbench:
-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr
-fmodulo-sched -fgcse-sm -fgcse-las -fsched-spec-load -ftree-vectorize
-ftracer -funroll-loops -fvariable-expansion-in-unroller
-fprefetch-loop-arrays -freorder-blocks-and-partition -fweb -ffast-math
-fmove-loop-invariants -fbranch-target-load-optimize
-fbranch-target-load-optimize2 -fbtr-bb-exclusive -momit-leaf-frame-pointer
-D__NO_MATH_INLINES
freebench:
-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fmodulo-sched
-fsched-spec-load -freschedule-modulo-scheduled-loops -ftree-vectorize
-ftracer -funroll-loops -fvariable-expansion-in-unroller
-fprefetch-loop-arrays -freorder-blocks-and-partition -fmove-loop-invariants
-fbranch-target-load-optimize -fbranch-target-load-optimize2
-fbtr-bb-exclusive -momit-leaf-frame-pointer -D__NO_MATH_INLINES
lamebench:
-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fgcse-sm
-fgcse-las -fsched-spec-load -fsched2-use-superblocks -fsched2-use-traces
-freschedule-modulo-scheduled-loops -ftracer -funroll-loops
-fvariable-expansion-in-unroller -freorder-blocks-and-partition -fweb
-ffast-math -fpeel-loops -fmove-loop-invariants -fbranch-target-load-optimize
-fbranch-target-load-optimize2 -fbtr-bb-exclusive -mfpmath=sse
-mfpmath=sse,387 -momit-leaf-frame-pointer -D__NO_MATH_INLINES
A run for one benchmark and one compiler takes from 6 to 48 hours,
depending heavily on the given testing flags (the flag-filtering
algorithm is O(n^2) in the number of testing flags).
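The post gives only the O(n^2) cost of the filtering step, not the algorithm itself. As a rough sketch, and purely as an assumption about how such a search could work, greedy forward selection has that cost: each round tries every remaining candidate on top of the flags kept so far, keeps the best one, and stops when nothing improves the time. The names greedy_flag_filter, measure and toy_measure, and the toy timings, are hypothetical stand-ins for a real compile-and-run step.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_flag_filter(
    base: Sequence[str],
    candidates: Sequence[str],
    measure: Callable[[Sequence[str]], float],
) -> Tuple[List[str], float, int]:
    """Greedy forward selection (an assumed scheme, not the author's script).

    Returns the kept flags, the best time seen, and the number of
    measurements, which is at most n*(n+1)/2 + 1 for n candidates.
    """
    kept = list(base)
    remaining = list(candidates)
    best_time = measure(kept)
    runs = 1
    while remaining:
        # Try each remaining candidate on top of the flags kept so far.
        trials = []
        for flag in remaining:
            trials.append((measure(kept + [flag]), flag))
            runs += 1
        round_time, round_flag = min(trials)
        if round_time >= best_time:
            break  # no candidate improves the result any further
        kept.append(round_flag)
        remaining.remove(round_flag)
        best_time = round_time
    return kept, best_time, runs


def toy_measure(flags: Sequence[str]) -> float:
    """Hypothetical timing model standing in for a real benchmark run."""
    effect = {"-funroll-loops": -2.0, "-ftracer": -1.0, "-fweb": 0.5}
    return 100.0 + sum(effect.get(f, 0.0) for f in flags)


if __name__ == "__main__":
    flags, seconds, runs = greedy_flag_filter(
        ["-O3", "-march=athlon-xp"],
        ["-ftracer", "-fweb", "-funroll-loops"],
        toy_measure,
    )
    print(flags, seconds, runs)
```

With n candidates this scheme needs at most n(n+1)/2 + 1 benchmark runs, so the total time grows quickly as the list of testing flags gets longer.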
The testing flags for each compiler are:
gcc-3.3.6:
TESTINGFLAGS="-fforce-addr|-fsched-spec-load|-fmove-all-movables|-freduce-all-givs|-ffast-math|
-ftracer|-funroll-loops|-funroll-all-loops|-fprefetch-loop-arrays|-mfpmath=sse|-mfpmath=sse,387|
-momit-leaf-frame-pointer"
gcc-3.4.4:
TESTINGFLAGS="-fforce-addr|-fsched-spec-load|-fsched2-use-superblocks|
-fsched2-use-superblocks -fsched2-use-traces|-fmove-all-movables|
-freduce-all-givs|-ffast-math|-ftracer|-funroll-loops|-funroll-all-loops|
-fpeel-loops|-fold-unroll-loops|-fold-unroll-all-loops|-fprefetch-loop-arrays|
-fbranch-target-load-optimize|-fbranch-target-load-optimize2|-mfpmath=sse|
-mfpmath=sse,387|-momit-leaf-frame-pointer"
gcc-4.0.2:
TESTINGFLAGS="-fforce-addr|-fmodulo-sched|-fgcse-sm|-fgcse-las|-fsched-spec-load|
-fsched2-use-superblocks -fsched2-use-traces|
-freschedule-modulo-scheduled-loops|-ftree-vectorize|
-ftracer|-funroll-loops|-fvariable-expansion-in-unroller|
-fprefetch-loop-arrays|-freorder-blocks-and-partition|-fweb|-ffast-math|-fpeel-loops|
-fmove-loop-invariants|-fbranch-target-load-optimize|-fbranch-target-load-optimize2|
-fbtr-bb-exclusive|-mfpmath=sse|-mfpmath=sse,387|-momit-leaf-frame-pointer|-D__NO_MATH_INLINES"
-ftree-loop-linear was removed from the testing flags for gcc-4.0.2 because
it leads to an endless loop in the neural net test of nbench.
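
The TESTINGFLAGS strings above look like '|'-separated lists in which one candidate may hold several flags that are switched on together (for example "-fsched2-use-superblocks -fsched2-use-traces"). That reading is an assumption, but it would explain why -fsched2-use-superblocks shows up twice in some of the result lines. A minimal sketch of splitting such a string into candidates, where parse_testingflags is a hypothetical helper name:

```python
from typing import List


def parse_testingflags(spec: str) -> List[List[str]]:
    """Split a '|'-separated spec into candidates; each candidate is a
    list of flags that are tested together (assumed semantics)."""
    candidates = []
    for chunk in spec.split("|"):
        flags = chunk.split()  # also drops whitespace from wrapped lines
        if flags:
            candidates.append(flags)
    return candidates


if __name__ == "__main__":
    spec = (
        "-fforce-addr|-fsched2-use-superblocks|\n"
        "-fsched2-use-superblocks -fsched2-use-traces| -ftree-vectorize"
    )
    for candidate in parse_testingflags(spec):
        print(" ".join(candidate))
```

Splitting each chunk on whitespace means line breaks and stray spaces in the quoted string do not change the resulting candidates.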