[Bug tree-optimization/42216] [4.5 Regression] rev 154688 regress 464.h264ref peak 20%

rguenth at gcc dot gnu dot org gcc-bugzilla@gcc.gnu.org
Tue Dec 1 16:46:00 GMT 2009



------- Comment #7 from rguenth at gcc dot gnu dot org  2009-12-01 16:46 -------
Just reverting rev. 154688 and using the training set gets us from

464.h264ref        --        228         -- S

to

464.h264ref        --        170         -- S

at -O3 -ffast-math -funroll-loops -march=native (-march=k8-sse3 -msahf --param
l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=512
-mtune=k8).
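For a rough sense of scale, the two figures above can be turned into relative
numbers. This is only a sketch: it assumes the 228 and 170 values are run times
in seconds for the training input, which the SPEC output lines do not state
explicitly, and the variable names are made up for illustration.

```python
# Figures copied from the two SPEC output lines above.
# Assumption: they are run times in seconds (training input).
with_patch = 228.0   # rev. 154688 applied
reverted = 170.0     # rev. 154688 reverted

# How much longer the patched build runs, relative to the reverted build.
slowdown = with_patch / reverted - 1.0

# How much of the patched run time goes away when the revision is reverted.
saved = 1.0 - reverted / with_patch

print(f"patched build runs {slowdown:.1%} longer than the reverted build")
print(f"reverting removes {saved:.1%} of the patched run time")
```

These are training-set numbers, so they need not match the 20% peak figure in
the summary.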

With the patch applied, the oprofile output looks like this:

CPU: AMD64 processors, speed 2000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask
of 0x00 (No unit mask) count 100000
samples  %        symbol name
2241755  51.1739  SetupFastFullPelSearch
442008   10.0900  BlockMotionSearch
271429    6.1961  SubPelBlockMotionSearch
269337    6.1483  FastPelY_14
170961    3.9026  UMVLine16Y_11
159311    3.6367  SetupLargerBlocks
155556    3.5510  SATD
127230    2.9044  FastLine16Y_11
72328     1.6511  dct_luma
69728     1.5917  getNonAffNeighbour

All but the *Y_1[14] functions are inside mv-search.c, so just recompiling
that file is enough to reproduce the issue.

After reverting it, the profile is:

CPU: AMD64 processors, speed 2000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask
of 0x00 (No unit mask) count 100000
samples  %        symbol name
1223269  36.8289  SetupFastFullPelSearch
450306   13.5573  BlockMotionSearch
320395    9.6461  SubPelBlockMotionSearch
245175    7.3815  FastPelY_14
160694    4.8380  SetupLargerBlocks
155353    4.6772  SATD
153292    4.6152  UMVLine16Y_11
79531     2.3944  FastLine16Y_11
72455     2.1814  dct_luma
70733     2.1296  getNonAffNeighbour
42828     1.2894  UMVPelY_14
34396     1.0356  UnifiedOneForthPix
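To see where the extra cycles go, the sample counts of the two listings can be
subtracted per symbol. The following is a minimal sketch, not part of the
original analysis: the counts are copied from the listings above, the helper
name `sample_deltas` is made up, and the comparison assumes both runs used the
same sampling interval (both report count 100000). Symbols that appear in only
one listing are skipped.

```python
# Sample counts with rev. 154688 applied (first oprofile listing).
after_patch = {
    "SetupFastFullPelSearch": 2241755,
    "BlockMotionSearch": 442008,
    "SubPelBlockMotionSearch": 271429,
    "FastPelY_14": 269337,
    "UMVLine16Y_11": 170961,
    "SetupLargerBlocks": 159311,
    "SATD": 155556,
    "FastLine16Y_11": 127230,
    "dct_luma": 72328,
    "getNonAffNeighbour": 69728,
}

# Sample counts with rev. 154688 reverted (second oprofile listing).
after_revert = {
    "SetupFastFullPelSearch": 1223269,
    "BlockMotionSearch": 450306,
    "SubPelBlockMotionSearch": 320395,
    "FastPelY_14": 245175,
    "SetupLargerBlocks": 160694,
    "SATD": 155353,
    "UMVLine16Y_11": 153292,
    "FastLine16Y_11": 79531,
    "dct_luma": 72455,
    "getNonAffNeighbour": 70733,
}


def sample_deltas(patched, reverted):
    """Return (symbol, patched - reverted) pairs, largest increase first."""
    common = patched.keys() & reverted.keys()
    deltas = {name: patched[name] - reverted[name] for name in common}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)


deltas = sample_deltas(after_patch, after_revert)
for name, delta in deltas:
    print(f"{delta:+10d}  {name}")
```

With these counts, SetupFastFullPelSearch accounts for about one million
additional samples, far more than any other symbol.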


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.5 Regression] rev        |[4.5 Regression] rev 154688
                   |15458[78] regress           |regress 464.h264ref peak 20%
                   |464.h264ref peak 20%        |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42216


