This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Experimental results of the branch predictors



Hi
Here are some experimental results of our branch predictor.  I am not sure that
my code to calculate the statistics is correct.  It works for trivial
testcases, but in complex cases thinks may or may not work correctly.  On the
other hand, results looks relativly sane and some of them match with results
published in papers I have.

I've run tests on byte benchmark and gcc itself (as two interesting extreme
cases).   It would be interesting to run tests on specs, since these are
published in the papers, but I am still not having the licence ready.

To read results you need to know, that "first match" is heruistics we use right
now that use first heruistics matching the branch. "combined" is heruistics
combining probabilities from all matching heruistics into single probability
as suggested in Wu's paper.  "note (real or user defined)" is perfect predictor
comming from profiled data.

Branches stays for number of branches predicted, coverage number of executions
of those branches.  Hitrate is percentage about how often is branch
misspredicted.

GCC results:

HERUISTICS                       BRANCHES  (REL)  HITRATE   COVERAGE    (REL)
pointer                             11746  10.1%  70.540%   4723122384  13.1%
opcode eq                           51080  44.1%  52.160%  13362320414  37.2%
first match                        115801 100.0%  64.733%  35930444523 100.0%
combined                           115801 100.0%  65.426%  35930444523 100.0%
note (real or user defined)         78440  67.7%  88.853%  35930444523 100.0%
opcode ne                           44931  38.8%  70.703%  12340509589  34.3%
exit block                           2686   2.3% 100.000%    278805145   0.8%
loop exit                           19474  16.8%  83.106%  12677792701  35.3%
no predictions heruistics           10458   9.0%  63.021%   3075306846   8.6%
loop header                          8642   7.5%  44.521%   3505449773   9.8%
opcode negative                      1284   1.1%  70.716%    553650027   1.5%
loop branch                          2106   1.8%  69.127%   1664294927   4.6%
opcode positive                      1691   1.5%  74.799%   1047502098   2.9%

Byte benchmark results:

HERUISTICS                       BRANCHES  (REL)  HITRATE   COVERAGE    (REL)
loop header                           441  33.3%  51.871%   1285342853   4.9%
opcode eq                             363  27.4%  63.220%   5370714294  20.6%
first match                          1325 100.0%  81.982%  26061522078 100.0%
combined                             1325 100.0%  82.067%  26061522078 100.0%
note (real or user defined)          1180  89.1%  88.776%  26061522078 100.0%
no predictions heruistics             111   8.4%  66.035%   3670810239  14.1%
loop branch                           159  12.0%  98.252%   3104056913  11.9%
loop exit                             559  42.2%  92.735%  10663118363  40.9%
opcode ne                             306  23.1%  93.899%   9836784684  37.7%
opcode positive                        83   6.3%  66.290%   1759830260   6.8%
opcode negative                        11   0.8% 100.000%       425009   0.0%
exit block                              1   0.1%       0%            0   0.0%
pointer                                 9   0.7% 100.000%           15   0.0%


There are some expectable thinks, such as better prediction in byte benchmark case,
since it contains more and longer loops.  Also some results match Ball Larus paper
closely (pointer heruistics, loop, opcode) in both coverage and hitrate. So I tend
to believe that the results calculated are quite correct.

There was some surprises to me:
  1) nonpredicted branches are taken only with 37% probability.  This result seems
  to be same in other tests as well and suggest strongly against the random ordering.
  I've modified our predictor to exploit this and got improvement from 62% to 78%
  in byte benchmarks (and similar in other cases too).

  I am not sure how to handle this interesting information.  Some heruistics with
  lower hitrate than (such as opcode eq, loop header) are actually wastefull,
  since they disable the non-heruistics heruistics.

  I guess that the low probability of taking the branch is because we predict
  accurately the loop edges and other edges are less probable.
  One of interesting ways is to add "non-loop" heruistics that will predict all
  non-loop branches as not taken with some probability, or generalize the idea
  to calculate average hitrates of branches that don't match given subsets
  of heruistics and add them as new complementary heruistics.

  2) low hitrate of opcode equality heruistics compared to opcode nonequality.
  Most papers covers this by united heruistics, so I can't say whether someone else hit
  similar case.  I've seen paper suggesting building of table based on comparison
  operator and operand type, maybe this is good idea.

  Interesting is that the heruistics does exactly the same as no prediction
  heruistics, so it tends to find badly predicable cases quite easilly.

  3) Low hitrate of loop header heruistics (expecting the jump around the loop
  is not taken, since it most probably comes from while to do-while loop
  converison), that is claimed to perform much better in the Wu's paper.

Hope you found this interesting.

Honza

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]