This is the mail archive of the
mailing list for the GCC project.
Re: Incorrect application of loop exit heuristic?
> I was looking in to a degradation for perlbmk on PowerPC and tracked it
> down to a mispredicted branch within a loop ( if (...) return 0; within
> the loop). GCC is statically predicting the loop exit as not taken "bne-",
> but it is obviously being taken the greatest share of the time because when
> I remove the prediction "-", the degradation disappears.
> I'm wondering if the heuristic for a loop exit is being applied
> correctly. predict.def defines PRED_LOOP_EXIT with a hitrate of 90,
> meaning we'll only exit the loop 10% of the time. But when this heuristic
> is applied to the CFG in predict.c:predict_loops(), it is dividing that 10%
> by the number of exits in the loop and then assigning that probability. In
> the case I was looking at, a probability of 0.2% ended up being assigned
> (quite a few exits in the loop), which then resulted in the branch being
> statically predicted because it crossed the 98%/2% threshold where branches
> are predicted. Should the 10% probability be applied without dividing by
> the number of exits (i.e. each exit has a 10% probability of being taken,
> independent of other loop exits)? The way things are now, once we get more
> than 5 exits in a loop those exits may be statically predicted as not taken
> since their probabilities will be less than the 2% threshold.
> The benchmark code in question was a fairly large switch statement within a
> loop, having several 'return' statements in the various 'case' legs.
The code there is basically avoiding loops with many exists to be
predicted to not loop at all (ie if you have 10 exits, having every exit
with 10% probability is going to make 0.9^10 (roughly 30%) probability
that the loop will trip.
The code is not quite correct - first it should be computing -10th power
of the probability (that is close enough to division) and also it should
track down number of exist conditionals executed each iteration...
Would be possible to benchmark the SPEC without this division for start?
How many exit conditionals are executed per iteration in your loop?