This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Experimental results of the branch predictors
- To: egcs at egcs dot cygnus dot com
- Subject: Experimental results of the branch predictors
- From: Jan Hubicka <jh at suse dot cz>
- Date: Sun, 14 May 2000 22:14:13 +0200
Hi
Here are some experimental results of our branch predictor. I am not sure that
my code to calculate the statistics is correct. It works for trivial
testcases, but in complex cases thinks may or may not work correctly. On the
other hand, results looks relativly sane and some of them match with results
published in papers I have.
I've run tests on byte benchmark and gcc itself (as two interesting extreme
cases). It would be interesting to run tests on specs, since these are
published in the papers, but I am still not having the licence ready.
To read results you need to know, that "first match" is heruistics we use right
now that use first heruistics matching the branch. "combined" is heruistics
combining probabilities from all matching heruistics into single probability
as suggested in Wu's paper. "note (real or user defined)" is perfect predictor
comming from profiled data.
Branches stays for number of branches predicted, coverage number of executions
of those branches. Hitrate is percentage about how often is branch
misspredicted.
GCC results:
HERUISTICS BRANCHES (REL) HITRATE COVERAGE (REL)
pointer 11746 10.1% 70.540% 4723122384 13.1%
opcode eq 51080 44.1% 52.160% 13362320414 37.2%
first match 115801 100.0% 64.733% 35930444523 100.0%
combined 115801 100.0% 65.426% 35930444523 100.0%
note (real or user defined) 78440 67.7% 88.853% 35930444523 100.0%
opcode ne 44931 38.8% 70.703% 12340509589 34.3%
exit block 2686 2.3% 100.000% 278805145 0.8%
loop exit 19474 16.8% 83.106% 12677792701 35.3%
no predictions heruistics 10458 9.0% 63.021% 3075306846 8.6%
loop header 8642 7.5% 44.521% 3505449773 9.8%
opcode negative 1284 1.1% 70.716% 553650027 1.5%
loop branch 2106 1.8% 69.127% 1664294927 4.6%
opcode positive 1691 1.5% 74.799% 1047502098 2.9%
Byte benchmark results:
HERUISTICS BRANCHES (REL) HITRATE COVERAGE (REL)
loop header 441 33.3% 51.871% 1285342853 4.9%
opcode eq 363 27.4% 63.220% 5370714294 20.6%
first match 1325 100.0% 81.982% 26061522078 100.0%
combined 1325 100.0% 82.067% 26061522078 100.0%
note (real or user defined) 1180 89.1% 88.776% 26061522078 100.0%
no predictions heruistics 111 8.4% 66.035% 3670810239 14.1%
loop branch 159 12.0% 98.252% 3104056913 11.9%
loop exit 559 42.2% 92.735% 10663118363 40.9%
opcode ne 306 23.1% 93.899% 9836784684 37.7%
opcode positive 83 6.3% 66.290% 1759830260 6.8%
opcode negative 11 0.8% 100.000% 425009 0.0%
exit block 1 0.1% 0% 0 0.0%
pointer 9 0.7% 100.000% 15 0.0%
There are some expectable thinks, such as better prediction in byte benchmark case,
since it contains more and longer loops. Also some results match Ball Larus paper
closely (pointer heruistics, loop, opcode) in both coverage and hitrate. So I tend
to believe that the results calculated are quite correct.
There was some surprises to me:
1) nonpredicted branches are taken only with 37% probability. This result seems
to be same in other tests as well and suggest strongly against the random ordering.
I've modified our predictor to exploit this and got improvement from 62% to 78%
in byte benchmarks (and similar in other cases too).
I am not sure how to handle this interesting information. Some heruistics with
lower hitrate than (such as opcode eq, loop header) are actually wastefull,
since they disable the non-heruistics heruistics.
I guess that the low probability of taking the branch is because we predict
accurately the loop edges and other edges are less probable.
One of interesting ways is to add "non-loop" heruistics that will predict all
non-loop branches as not taken with some probability, or generalize the idea
to calculate average hitrates of branches that don't match given subsets
of heruistics and add them as new complementary heruistics.
2) low hitrate of opcode equality heruistics compared to opcode nonequality.
Most papers covers this by united heruistics, so I can't say whether someone else hit
similar case. I've seen paper suggesting building of table based on comparison
operator and operand type, maybe this is good idea.
Interesting is that the heruistics does exactly the same as no prediction
heruistics, so it tends to find badly predicable cases quite easilly.
3) Low hitrate of loop header heruistics (expecting the jump around the loop
is not taken, since it most probably comes from while to do-while loop
converison), that is claimed to perform much better in the Wu's paper.
Hope you found this interesting.
Honza