Bug 37732 - [4.2/4.3/4.4 Regression] 40% slower numeric sort
Summary: [4.2/4.3/4.4 Regression] 40% slower numeric sort
Status: RESOLVED DUPLICATE of bug 21485
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.4.0
Importance: P3 critical
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on: 21485
Blocks:
Reported: 2008-10-03 22:27 UTC by wbrana
Modified: 2008-10-04 20:13 UTC
CC List: 7 users

See Also:
Host:
Target:
Build: x86_64-pc-linux-gnu
Known to work: 3.4.6
Known to fail: 4.2.4 4.3.2 4.4.0
Last reconfirmed:


Description wbrana 2008-10-03 22:27:43 UTC
The nbench 2.2.3 numeric sort test executes about 40% fewer iterations per second
when compiled with the 4.4 snapshot than when compiled with 3.4.6.

iterations/s - version
2439 - 3.4.6
1530 - 4.4.0 20080926 (experimental)
1526 - 4.3.2 
1580 - 4.2.4
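
As a rough check of the "40%" figure, the drop relative to 3.4.6 can be recomputed from the four numbers above. This is only an illustrative sketch: the dictionary and the helper name relative_drop are made up for this example and are not part of nbench or GCC.

```python
# Iterations per second for the nbench numeric sort test, copied from the table above.
results = {
    "3.4.6": 2439,
    "4.4.0 20080926 (experimental)": 1530,
    "4.3.2": 1526,
    "4.2.4": 1580,
}

baseline = results["3.4.6"]


def relative_drop(value, reference):
    """Return the percentage by which value falls short of reference (hypothetical helper)."""
    return (reference - value) / reference * 100.0


# Percentage drop of each newer compiler relative to the 3.4.6 baseline.
drops = {
    version: relative_drop(rate, baseline)
    for version, rate in results.items()
    if version != "3.4.6"
}

for version, drop in drops.items():
    print(f"{version}: {drop:.1f}% fewer iterations/s than 3.4.6")
```

With these inputs the drop comes out at roughly 35% to 37% depending on the version, so "40%" in the summary is a rounded figure.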

CFLAGS = -s -static -Wall -O3 -g0 -march=nocona -fomit-frame-pointer -funroll-loops -ffast-math

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :          2439.1  :      62.55  :      20.54
STRING SORT         :          373.28  :     166.79  :      25.82
BITFIELD            :      5.9879e+08  :     102.71  :      21.45
FP EMULATION        :          267.84  :     128.52  :      29.66
FOURIER             :           43702  :      49.70  :      27.92
ASSIGNMENT          :          56.657  :     215.59  :      55.92
IDEA                :          5407.7  :      82.71  :      24.56
HUFFMAN             :          3204.3  :      88.86  :      28.37
NEURAL NET          :          57.485  :      92.35  :      38.84
LU DECOMPOSITION    :          2363.5  :     122.44  :      88.41
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 111.792
FLOATING-POINT INDEX: 82.519
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 : Dual GenuineIntel Intel(R) Core(TM)2 Duo CPU     E6750  @ 2.66GHz 3200MHz
L2 Cache            : 4096 KB
OS                  : Linux 2.6.26.5
C compiler          : gcc version 3.4.6 (Gentoo 3.4.6-r2 p1.5, ssp-3.4.6-1.0, pie-8.7.10)
libc                : libc-2.8.so
MEMORY INDEX        : 31.404
INTEGER INDEX       : 25.525
FLOATING-POINT INDEX: 45.768
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.


BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :          1530.9  :      39.26  :      12.89
STRING SORT         :          372.64  :     166.51  :      25.77
BITFIELD            :      6.0348e+08  :     103.52  :      21.62
FP EMULATION        :           310.4  :     148.94  :      34.37
FOURIER             :           31760  :      36.12  :      20.29
ASSIGNMENT          :          48.361  :     184.02  :      47.73
IDEA                :            9204  :     140.77  :      41.80
HUFFMAN             :          3554.3  :      98.56  :      31.47
NEURAL NET          :          73.882  :     118.69  :      49.92
LU DECOMPOSITION    :          2322.2  :     120.30  :      86.87
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 114.457
FLOATING-POINT INDEX: 80.190
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 : Dual GenuineIntel Intel(R) Core(TM)2 Duo CPU     E6750  @ 2.66GHz 3200MHz
L2 Cache            : 4096 KB
OS                  : Linux 2.6.26.5
C compiler          : gcc version 4.4.0 20080926 (experimental) (GCC)
libc                : libc-2.8.so
MEMORY INDEX        : 29.851
INTEGER INDEX       : 27.632
FLOATING-POINT INDEX: 44.477
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.


BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :          1526.2  :      39.14  :      12.85
STRING SORT         :          374.28  :     167.24  :      25.89
BITFIELD            :      6.0262e+08  :     103.37  :      21.59
FP EMULATION        :          320.64  :     153.86  :      35.50
FOURIER             :           33352  :      37.93  :      21.30
ASSIGNMENT          :          57.834  :     220.07  :      57.08
IDEA                :            9288  :     142.06  :      42.18
HUFFMAN             :            3211  :      89.04  :      28.43
NEURAL NET          :           74.69  :     119.98  :      50.47
LU DECOMPOSITION    :          2655.7  :     137.58  :      99.35
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 116.415
FLOATING-POINT INDEX: 85.548
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 : Dual GenuineIntel Intel(R) Core(TM)2 Duo CPU     E6750  @ 2.66GHz 3200MHz
L2 Cache            : 4096 KB
OS                  : Linux 2.6.26.5
C compiler          : gcc-4.3.2
libc                : libc-2.8.so
MEMORY INDEX        : 31.716
INTEGER INDEX       : 27.199
FLOATING-POINT INDEX: 47.448
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.



BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :          1580.2  :      40.52  :      13.31
STRING SORT         :          359.88  :     160.80  :      24.89
BITFIELD            :      5.9462e+08  :     102.00  :      21.30
FP EMULATION        :          291.28  :     139.77  :      32.25
FOURIER             :           32567  :      37.04  :      20.80
ASSIGNMENT          :          57.097  :     217.26  :      56.35
IDEA                :          7876.8  :     120.47  :      35.77
HUFFMAN             :          3433.1  :      95.20  :      30.40
NEURAL NET          :          72.091  :     115.81  :      48.71
LU DECOMPOSITION    :          2664.4  :     138.03  :      99.67
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 112.739
FLOATING-POINT INDEX: 83.966
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 : Dual GenuineIntel Intel(R) Core(TM)2 Duo CPU     E6750  @ 2.66GHz 3200MHz
L2 Cache            : 4096 KB
OS                  : Linux 2.6.26.5
C compiler          : gcc-4.2.4
libc                : libc-2.8.so
MEMORY INDEX        : 31.032
INTEGER INDEX       : 26.138
FLOATING-POINT INDEX: 46.571
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
Comment 1 Richard Biener 2008-10-03 22:36:06 UTC
Dup of PR21485?
Comment 2 wbrana 2008-10-03 22:47:13 UTC
(In reply to comment #1)
> Dup of PR21485?
> 

PR21485 has been ignored by its reporter and does not have an updated summary.
Comment 3 Richard Biener 2008-10-03 23:18:41 UTC

*** This bug has been marked as a duplicate of 21485 ***