g77 performance comparison, i686

Tim Prince tprince@computer.org
Wed Jun 14 11:05:00 GMT 2000


Sorry about attaching a word processor file, but my mail readers don't
preserve column format.
Here are some Livermore Fortran Kernel (LFK) results for several i686
compilers, in case anyone is interested in how GCC performs on this kind
of code.

Compilers, left to right: Compaq CVF6 (6A no better), Lahey LF90,
gcc-2.95.2, gcc-2.96. SPAN is listed once, after the first MFLOP/SEC column.

KERNEL MFLOP/SEC SPAN PRECIS MFLOP/SEC PRECIS MFLOP/SEC PRECIS MFLOP/SEC PRECIS
------ --------- ---- ------ --------- ------ --------- ------ --------- ------

1 101.896 27 16.90 138.726 16.90 95.212 16.90 106.062 16.90

2 43.337 15 16.90 56.166 16.90 67.524 16.90 64.694 16.90

3 99.355 27 16.90 114.338 16.90 104.816 16.90 98.899 16.90

4 67.889 27 16.90 102.113 16.90 101.278 16.90 109.658 16.90

5 49.083 27 16.90 53.043 16.90 51.335 16.90 51.664 16.90

6 65.554 8 16.90 49.806 16.90 58.156 16.90 57.325 16.90

7 123.789 21 16.90 144.149 16.90 84.247 16.90 101.224 16.90

8 106.125 14 16.90 113.624 16.90 69.872 16.90 58.846 16.90

9 134.808 15 16.90 153.769 16.85 81.441 16.85 88.404 16.85

10 48.130 15 16.90 72.199 16.90 54.424 16.90 45.499 16.90

11 61.139 27 16.90 64.277 16.90 57.934 16.90 55.985 16.90

12 59.171 26 - 13.66 47.503 13.66 60.842 13.66 63.243 13.66

13 5.705 8 14.79 6.186 15.71 6.504 15.71 5.514 15.71

14 16.521 27 15.27 18.753 15.29 19.319 15.29 14.834 15.29

15 24.665 15 16.90 20.387 16.79 26.593 16.90 24.806 16.79

16 27.678 15 16.90 49.266 16.90 47.785 16.90 50.285 16.90

17 45.338 15 16.65 62.424 16.90 63.559 16.90 67.255 16.90

18 84.786 14 16.65 74.610 16.73 65.203 16.90 68.746 16.73

19 49.497 15 16.90 57.337 16.90 55.935 16.90 52.791 16.90

20 36.735 26 16.90 34.230 16.83 41.384 16.83 33.936 16.83

21 125.109 20 16.65 106.753 16.72 144.313 16.72 106.198 16.72

22 10.759 15 16.90 16.892 16.90 14.470 16.90 14.390 16.90

23 109.753 14 16.90 140.484 16.90 101.695 16.90 85.359 16.90

24 16.745 27 16.90 25.638 16.90 47.558 16.90 57.019 16.90

1 107.213 101 16.65 147.170 16.76 101.301 16.76 112.702 16.76

2 71.781 101 16.90 79.154 16.90 93.712 16.90 89.150 16.90

3 119.572 101 16.90 124.262 16.90 121.283 16.90 120.357 16.90

4 108.979 101 16.90 127.219 16.90 158.124 16.90 160.706 16.90

5 49.604 101 16.65 56.004 16.72 55.729 16.72 55.793 16.72

6 100.946 32 16.90 85.505 16.90 94.932 16.90 107.595 16.90

7 121.556 101 16.90 147.316 16.90 86.260 16.90 103.560 16.90

8 91.575 100 16.90 94.158 16.81 65.779 16.90 56.884 16.81

9 103.278 101 16.90 122.601 16.90 72.329 16.90 84.754 16.90

10 39.116 101 16.90 44.069 16.90 42.950 16.90 39.644 16.90

11 68.732 101 16.65 72.559 16.90 69.849 16.90 69.424 16.90

12 66.293 100 15.21 48.510 15.21 82.707 15.21 81.739 15.21

13 5.913 32 15.52 6.398 15.90 6.684 15.90 5.752 15.90

14 15.522 101 16.90 16.891 16.79 18.361 16.79 14.408 16.79

15 23.353 101 16.90 19.281 16.90 25.125 16.90 24.381 16.90

16 27.371 40 16.90 46.698 16.90 48.776 16.90 54.667 16.90

17 43.601 101 16.90 69.611 16.90 73.451 16.90 76.317 16.90

18 75.061 100 16.90 69.451 16.81 61.504 16.90 62.221 9.23

19 51.670 101 16.90 61.112 16.90 56.390 16.90 55.473 16.90

20 35.985 100 16.65 35.201 16.74 42.766 16.74 32.875 16.74

21 124.505 50 16.90 104.421 16.90 141.572 16.90 105.115 16.90

22 10.929 101 16.90 17.079 16.90 14.534 16.90 14.405 16.90

23 87.527 100 16.48 95.380 16.41 74.088 16.41 65.638 16.41

24 18.543 101 16.90 33.605 16.90 54.475 16.90 56.933 16.90

1 68.687 1001 16.90 67.104 16.90 67.427 16.90 69.034 16.90

2 72.379 101 16.90 79.003 16.90 93.649 16.90 89.031 16.90

3 119.301 1001 16.48 121.344 16.90 125.036 16.90 123.118 16.90

4 147.087 1001 16.90 128.625 16.90 173.540 16.90 179.393 16.90

5 32.782 1001 16.90 31.478 16.90 32.717 16.90 32.356 16.90

6 92.870 64 16.90 79.326 16.90 85.917 16.90 92.793 16.90

7 114.465 995 16.90 123.378 16.90 75.311 16.90 94.120 16.90

8 91.556 100 16.90 93.531 16.81 65.750 16.90 56.662 16.81

9 100.215 101 16.90 122.770 16.90 72.399 16.90 86.366 16.90

10 38.367 101 16.90 44.462 16.90 42.992 16.90 39.946 16.90

11 70.888 1001 16.90 64.977 16.90 71.819 16.90 71.914 16.90

12 64.441 1000 13.24 42.665 13.24 80.817 13.24 80.790 13.24

13 5.986 64 16.05 6.519 15.41 6.762 15.41 5.886 15.41

14 14.378 1001 16.90 14.375 16.66 17.149 16.90 13.368 16.90

15 23.572 101 16.90 19.253 16.90 25.061 16.90 24.010 16.90

16 27.827 75 16.90 48.068 16.90 49.337 16.90 56.071 16.90

17 43.433 101 16.90 69.226 16.90 73.367 16.90 75.743 16.90

18 74.942 100 16.90 69.254 16.81 61.432 16.90 62.332 9.23

19 51.978 101 16.90 61.329 16.90 56.281 16.90 56.211 16.90

20 35.383 1000 16.00 35.295 16.90 42.170 16.90 32.365 16.90

21 123.410 101 16.90 104.426 16.90 141.525 16.90 104.424 16.90

22 10.948 101 16.90 17.050 16.90 14.552 16.90 14.384 16.90

23 87.478 100 16.48 95.316 16.41 73.901 16.41 65.138 16.41

24 19.618 1001 16.90 37.025 16.90 66.170 16.90 68.880 16.90
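
Since the column format did not survive the mail, a small script can help read the rows. The following is only a sketch in Python: the field names are illustrative, the column order is taken from the header line, and treating a lone "-" token (as in the first Kernel 12 row) as a stray mark is an assumption.

```python
# Column layout assumed from the header line; field names are illustrative.
FIELDS = ("cvf6_mflops", "span", "cvf6_precis",
          "lf90_mflops", "lf90_precis",
          "gcc2952_mflops", "gcc2952_precis",
          "gcc296_mflops", "gcc296_precis")

def parse_row(line):
    """Split one table row into a dict with the kernel number and FIELDS."""
    # Assumption: a lone "-" token is a stray mark, not a value.
    tokens = [t for t in line.split() if t != "-"]
    if len(tokens) != len(FIELDS) + 1:
        raise ValueError("expected %d columns, got %d"
                         % (len(FIELDS) + 1, len(tokens)))
    row = {"kernel": int(tokens[0])}
    for name, tok in zip(FIELDS, tokens[1:]):
        # SPAN is an integer loop length; everything else is a decimal figure.
        row[name] = int(tok) if name == "span" else float(tok)
    return row

row = parse_row("1 101.896 27 16.90 138.726 16.90 95.212 16.90 106.062 16.90")
print(row["kernel"], row["span"], row["gcc296_mflops"])
```

Rows that do not split into ten values raise an error rather than being guessed at.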



Comments:

Test machine is a 232 MHz P-II with 64MB RAM, running Windows 2000. Linux
results have been shown to be nearly identical.

The version of LFK tested here, found at
http://members.aol.com/n8tm/lloops.shar.gz , has been modified to reduce
its dependence on the compiler recognizing strange code patterns. g77
would likely lose more than the commercial compilers if the standard
version were used instead. Kernel 14, when compiled literally according
to the standard, as all of these compilers do, reaches less than 50% of
its potential performance on the x86. That holds even though the
Cray-1-oriented code of the original has been sanitized, and even though
the kernel reaches cache performance saturation on most architectures.
Kernels 1 and 23 show large drops in performance on x86 with increasing
loop length, apparently due to cache saturation. On Kernel 1, cache
saturation produces the same performance regardless of compiler or
optimization setting.
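
The Kernel 1 remark can be checked against the table itself. A minimal Python sketch follows, with the figures copied from the three Kernel 1 rows. The fastest-to-slowest ratio is just one illustrative way to express "same performance" and is not part of LFK.

```python
# Kernel 1 MFLOP/SEC copied from the table, keyed by SPAN:
# (CVF6, LF90, gcc-2.95.2, gcc-2.96)
kernel1 = {
    27:   (101.896, 138.726, 95.212, 106.062),
    101:  (107.213, 147.170, 101.301, 112.702),
    1001: (68.687, 67.104, 67.427, 69.034),
}

def spread(rates):
    """Ratio of fastest to slowest compiler; 1.0 means no difference."""
    return max(rates) / min(rates)

for span, rates in sorted(kernel1.items()):
    print("span %4d: fastest/slowest = %.2f" % (span, spread(rates)))
```

At spans 27 and 101 the fastest compiler is roughly 45% ahead of the slowest; at span 1001 the gap shrinks to about 3%.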

CVF6A shows some significant improvements over CVF6, offset by even
larger regressions. These compilers are quite reliable in /debug mode,
but do not approach the reliability of the other released versions with
normal optimization. They set 53-bit precision mode, which appears to
account for occasional loss of reported accuracy.
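
For scale, the x87 precision control offers significands of 24, 53, or 64 bits. The Python sketch below converts the two wider settings to decimal digits. How LFK derives its PRECIS figure is not covered in this note, so treat the link to the table as a rough guide only.

```python
import math

def decimal_digits(significand_bits):
    """Decimal digits carried by a binary significand of the given width."""
    return significand_bits * math.log10(2)

print("53-bit: %.2f digits" % decimal_digits(53))
print("64-bit: %.2f digits" % decimal_digits(64))
```

The difference is a little over three decimal digits, which is the margin given up by setting 53-bit mode.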

The Lahey compiler tested here was the last version to use default
precision mode, and its performance was satisfactory for f77-style code
such as g77 accepts.

The gcc compilers were tested with the options
'-Os -march=pentiumpro -ffast-math -funroll-loops', which produce the
best overall performance, although -O2 is faster in a few cases. The bad
result on Kernel 18 with gcc-2.96 is caused by the broken -funroll-loops
option.

The major regressions remaining in egcs-20000612 are the broken
-funroll-loops option and the loss of performance in Kernels 8, 13, 20,
and 21. Those cases are not helped by other optimizations such as -O2.
Except for Kernel 8, they remain satisfactory in comparison to the
commercial compilers.
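
To put numbers on those four kernels, here is a small Python sketch using the shortest-span rows of the table. It assumes the gcc-2.96 column corresponds to the egcs-20000612 snapshot; the helper name is illustrative.

```python
# (gcc-2.95.2, gcc-2.96) MFLOP/SEC from the first block of the table
shortest_span = {
    8:  (69.872, 58.846),
    13: (6.504, 5.514),
    20: (41.384, 33.936),
    21: (144.313, 106.198),
}

def percent_change(old, new):
    """Signed change from old to new, as a percentage of old."""
    return 100.0 * (new - old) / old

for kernel, (old, new) in sorted(shortest_span.items()):
    print("Kernel %2d: %+.1f%%" % (kernel, percent_change(old, new)))
```

On these rows the drop from gcc-2.95.2 to gcc-2.96 ranges from roughly 15% (Kernels 8 and 13) to about 26% (Kernel 21).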

Tim Prince
-------------- next part --------------
A non-text attachment was scrubbed...
Name: comps.doc
Type: application/msword
Size: 34816 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc/attachments/20000614/3553cf4e/attachment.doc>

