This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: Update probabilities in predict.def to match reality

From: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>
To: Jan Hubicka <hubicka at ucw dot cz>, gcc-patches at gcc dot gnu dot org
Date: Mon, 13 Jun 2016 17:35:22 +0100
Subject: Re: Update probabilities in predict.def to match reality
Authentication-results: sourceware.org; auth=none
References: <20160607192704 dot GB22955 at kam dot mff dot cuni dot cz>

Hi Honza,

On 07/06/16 20:27, Jan Hubicka wrote:

Hello,
Martin Liska measured branch predictor hitrates on the current tree with SPEC2006.

CPU2006
HEURISTICS                           BRANCHES  (REL)  HITRATE                COVERAGE  (SHORT)  (REL)
loop iv compare                            33   0.1%  20.27% /  86.24%       30630826   30.63M   0.0%
no prediction                           10406  19.5%  33.41% /  84.76%   139755242456  139.76G  14.1%
early return (on trees)                  6328  11.9%  54.20% /  86.48%    33569991740   33.57G   3.4%
guessed loop iterations                   112   0.2%  62.06% /  64.49%      958458522  958.46M   0.1%
fail alloc                                595   1.1%  62.18% / 100.00%            595   595.00   0.0%
opcode values positive (on trees)        4266   8.0%  64.30% /  91.28%    16931889792   16.93G   1.7%
opcode values nonequal (on trees)        6600  12.4%  66.23% /  80.60%    71483051282   71.48G   7.2%
continue                                  507   0.9%  66.66% /  82.85%    10086808016   10.09G   1.0%
call                                    11351  21.3%  67.16% /  92.24%    34680666103   34.68G   3.5%
loop iterations                          2689   5.0%  67.99% /  67.99%   408309517405  408.31G  41.3%
DS theory                               26385  49.4%  68.62% /  85.44%   146974369890  146.97G  14.9%
const return                              271   0.5%  69.39% /  87.09%      301566712  301.57M   0.0%
pointer (on trees)                       6230  11.7%  69.59% /  87.18%    16667735314   16.67G   1.7%
combined                                53398 100.0%  70.31% /  80.36%   989164856862  989.16G 100.0%
goto                                       78   0.1%  70.36% /  96.96%      951041538  951.04M   0.1%
first match                             16607  31.1%  78.00% /  78.42%   702435244516  702.44G  71.0%
extra loop exit                           141   0.3%  82.80% /  88.17%     1696946942    1.70G   0.2%
null return                               393   0.7%  91.47% /  93.08%     3268678197    3.27G   0.3%
loop exit                                9909  18.6%  91.80% /  92.81%   282927773783  282.93G  28.6%
guess loop iv compare                     178   0.3%  97.81% /  97.85%     4375086453    4.38G   0.4%
negative return                           277   0.5%  97.94% /  99.23%     1062119028    1.06G   0.1%
noreturn call                            2372   4.4% 100.00% / 100.00%     8356562323    8.36G   0.8%
overflow                                 1282   2.4% 100.00% / 100.00%      175074177  175.07M   0.0%
zero-sized array                          677   1.3% 100.00% / 100.00%      112723803  112.72M   0.0%
unconditional jump                        103   0.2% 100.00% / 100.00%         491001  491.00K   0.0%
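
As a sanity check on how to read the table, the last (REL) column appears to be each heuristic's COVERAGE count as a share of the "combined" row. That reading is an assumption rather than something stated in the message. A minimal Python sketch using counts copied from the table:

```python
# Raw COVERAGE counts copied from the CPU2006 table above.
coverage = {
    "loop iterations": 408309517405,
    "first match": 702435244516,
    "loop exit": 282927773783,
}
combined = 989164856862  # COVERAGE of the "combined" row


def rel_coverage(count, total=combined):
    """Share of the combined coverage, as a percentage rounded to one decimal.

    Assumption: this is how the (REL) column is derived.
    """
    return round(100.0 * count / total, 1)


for name, count in coverage.items():
    print(f"{name:16s} {rel_coverage(count):5.1f}%")
```

Under that assumption the three rows come out as 41.3%, 71.0% and 28.6%, matching the values printed in the table.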

We used to track SPEC2000 until 2008, but then the infrastructure broke. The
numbers show some differences from the 2008 results:

HEURISTICS                        BRANCHES  (REL)  HITRATE              COVERAGE  (REL)
DS theory                            42611  57.1%  74.54% /  89.71%   9237799352  28.7%
combined                             74578 100.0%  72.88% /  90.59%  32201983315 100.0%
opcode values nonequal (on trees)    14544  19.5%  72.03% /  88.64%   3387233627  10.5%
early return (on trees)              11078  14.9%  61.23% /  89.25%   2349499033   7.3%
first match                          13249  17.8%  89.11% /  93.08%  15876522911  49.3%
guessed loop iterations               2722   3.6%  86.50% /  90.76%   7308035517  22.7%
no prediction                        18718  25.1%  34.36% /  86.14%   7087661052  22.0%
call                                 23937  32.1%  71.38% /  93.08%   3829002205  11.9%
opcode values positive (on trees)     2515   3.4%  72.77% /  86.49%    927995806   2.9%
loop branch                            378   0.5%  87.61% /  95.54%   1491510452   4.6%
loop exit                             8833  11.8%  91.43% /  94.52%   6538486043  20.3%
loop iterations                        912   1.2%  99.11% /  99.11%    396451321   1.2%
noreturn call                          890   1.2%  99.99% /  99.99%    205957905   0.6%
pointer (on trees)                    8394  11.3%  85.09% /  94.80%   1315262058   4.1%
negative return                        272   0.4%  96.47% /  99.74%     49156319   0.1%
const return                           551   0.7%  67.92% /  68.97%     96082001   0.3%
__builtin_expect                        20   0.0%      0% /      0%            0   0.0%
null return                            566   0.8%  96.58% /  98.77%     87555632   0.3%

There is some degradation in the combined heuristics hitrate (72.88% -> 70.31%), which may be caused
simply by the fact that the new SPEC is harder to guess. The main decrease seems to be in opcode_positive/nonequal,
which may also be attributed to the fact that early opts now optimize out more code before
we do the statistics.

There are bugs in a few predictors: the goto predictor is dead because the FE code was dropped,
the return predictor is a bit random because the CFG is optimized (it should probably be done in the FE),
loop iv compare seems bogus, and Fortran fail alloc does not seem to work as intended.
I added FIXMEs and will address them incrementally.

Bootstrapped/regtested x86_64-linux, will commit it later today.

Honza

	* predict.c (predict_iv_comparison): Mention that heuristics is broken.
	(return_prediction): PRED_CONST_RETURN predict return as not taken.

	* predict.def (PRED_CONTINUE): Change hitrate 50->67
	(PRED_LOOP_BRANCH): Document predictor as broken.
	(PRED_LOOP_EXIT): Change hitrate 91->92.
	(PRED_LOOP_EXTRA_EXIT): Change hitrate 91->83.
	(PRED_POINTER, PRED_TREE_POINTER): Change hitrate 85->70.
	(PRED_OPCODE_POSITIVE): Change hitrate 79->64.
	(PRED_OPCODE_NONEQUAL): Change hitrate 91->66.
	(PRED_TREE_OPCODE_POSITIVE): Change hitrate 73->64.
	(PRED_TREE_OPCODE_NONEQUAL): Change hitrate 72->66.
	(PRED_CALL): Change hitrate 71->67.
	(PRED_TREE_EARLY_RETURN): Document issues, change hitrate 61->54.
	(PRED_GOTO): Document as unused right now.
	(PRED_CONST_RETURN): Change hitrate 67->69
	(PRED_NEGATIVE_RETURN): Change hitrate 96->98
	(PRED_NULL_RETURN): Change hitrate 91->90.
	(PRED_LOOP_IV_COMPARE_GUESS): Change hitrate to 98.
	(PRED_FORTRAN_FAIL_ALLOC): Change hitrate to 62; document issues.
	(PRED_FORTRAN_SIZE_ZERO): Change hitrate to 99.
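
For readers unfamiliar with how these percentages are consumed: the sketch below shows one way a percentage hitrate can be turned into a fixed-point branch probability. It assumes a probability base of 10000 and a round-to-nearest conversion, which is my recollection of how predict.c handles it and is not confirmed by this thread, so treat both the constant and the helper name as illustrative.

```python
# Assumption: branch probabilities are fixed-point integers where PROB_BASE
# stands for 100%. The value 10000 is an assumption, not taken from the patch.
PROB_BASE = 10000


def hitrate(percent, base=PROB_BASE):
    """Convert a whole-number percentage to a fixed-point probability.

    Hypothetical helper: multiplies by the base, adds 50 so the integer
    division by 100 rounds to nearest instead of truncating.
    """
    return (percent * base + 50) // 100


# A few of the new values from the ChangeLog above.
for name, percent in [("PRED_CONTINUE", 67),
                      ("PRED_TREE_EARLY_RETURN", 54),
                      ("PRED_LOOP_EXTRA_EXIT", 83)]:
    print(f"{name:24s} {percent:3d}% -> {hitrate(percent)}/{PROB_BASE}")
```

With that base, changing PRED_CONTINUE from 50 to 67 moves the predicted probability from 5000 to 6700 out of 10000.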


In the testsuite I'm seeing:
ERROR: gcc.dg/tree-ssa/attr-hotcold-2.c: error executing dg-final: syntax error in target selector "profile_estimate"

on aarch64-none-elf.
I think the hunk:
-/* { dg-final { scan-ipa-dump-times "block 4, loop depth 0, count 0, freq 1\[^0-9\]" 1 "profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times 1 "hot label heuristics" 1 "profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times 1 "cold label heuristics" 1 "profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times "block 4, loop depth 0, count 0, freq \[1-4\]\[^0-9\]" 1 "profile_estimate" } } */

is buggy. Should it be
-/* { dg-final { scan-ipa-dump-times "block 4, loop depth 0, count 0, freq 1\[^0-9\]" 1 "profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times "hot label heuristics" 1 "profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times "cold label heuristics" 1 "profile_estimate" } } */
+/* { dg-final { scan-tree-dump-times "block 4, loop depth 0, count 0, freq \[1-4\]\[^0-9\]" 1 "profile_estimate" } } */
?
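
My reading of the error, as an assumption about the argument order (regex, expected count, dump suffix, then an optional target selector): the stray leading `1` shifts every argument by one, so "profile_estimate" lands in the selector slot. A rough Python sketch of a checker for the simple no-selector form; `check_scan_times` is a hypothetical helper, not part of the testsuite:

```python
import re

# Matches either a double-quoted Tcl string (with backslash escapes) or a bare word.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|\S+')

# Assumed shape: { dg-final { scan-<kind>-dump-times REGEX COUNT SUFFIX } }
DIRECTIVE = re.compile(
    r'\{\s*dg-final\s*\{\s*(scan-\w+-dump-times)\s+(.*?)\s*\}\s*\}')


def check_scan_times(line):
    """Return a list of problems found in one scan-*-dump-times directive.

    Only the simple three-argument form is handled; directives that carry
    a real target selector are reported as having an extra argument.
    """
    m = DIRECTIVE.search(line)
    if m is None:
        return ["not a dg-final scan-*-dump-times directive"]
    args = TOKEN.findall(m.group(2))
    problems = []
    if len(args) != 3:
        problems.append(f"expected 3 arguments, got {len(args)}")
    if len(args) >= 2 and not args[1].strip('"').isdigit():
        problems.append(f"second argument should be a count, got {args[1]}")
    if len(args) >= 4:
        problems.append(f"{args[3]} would be read as a target selector")
    return problems


bad = r'/* { dg-final { scan-tree-dump-times 1 "hot label heuristics" 1 "profile_estimate" } } */'
good = r'/* { dg-final { scan-tree-dump-times "block 4, loop depth 0, count 0, freq \[1-4\]\[^0-9\]" 1 "profile_estimate" } } */'

print(check_scan_times(bad))
print(check_scan_times(good))
```

The first directive is flagged on all three checks, while the second passes, which is consistent with the "syntax error in target selector" message above.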

With that change the test runs but still FAILs:
FAIL: gcc.dg/tree-ssa/attr-hotcold-2.c scan-tree-dump-times profile_estimate "block 4, loop depth 0, count 0, freq [1-4][^0-9]" 1
FAIL: gcc.dg/tree-ssa/attr-hotcold-2.c scan-tree-dump-times profile_estimate "block 5, loop depth 0, count 0, freq [6-9][0-9][0-9][0-9]" 1

Thanks,
Kyrill

Follow-Ups:
- [PATCH]Fix scan-tree-dump-times syntax errors in gcc.dg/tree-ssa/attr-hotcold-2.c
  - From: Renlin Li

References:
- Update probabilities in predict.def to match reality
  - From: Jan Hubicka
