This is the mail archive of the
mailing list for the GCC project.
Re: [patch] Update profile in loop versioning and unrolling
> > > > I wonder where the loop updating logic should possibly request
> > > > increasing expected number of iterations in the loop? i.e. within
> > > > inliner, where I can correctly increase maximal frequency by inlining a
> > > > hot loop I care to recompute frequencies overall either from counts or
> > > > probabilities when this happens (at least on IPA branch, on mainline we
> > > > don't have profile at that time yet). I have, however, difficulty
> > > > thinking of other places in the compiler where we produce new hot stuff,
> > > > possibly with exception of string function expanders (where I had to
> > > > work around this issue too).
> > > >
> > > > I would expect that the scale_bbs_frequencies_int should get large num
> > > > only when den is large too. Perhaps we can even sanity check that the
> > > > function is consistently decreasing the frequencies, or is that too much
> > > > to hope for?
> > >
> > > we used to, but my patch removes this assert. There are two places
> > > where my patch needs to upscale the frequencies (corresponding to the
> > > checks you pointed out):
> > >
> > > 1) In loop unrolling, we upscale the frequencies of the blocks after
> > > the removed loop exits.
> > > 2) At the moment, the estimated profile makes it appear that loops
> > > iterate 10 times. When we unroll such a loop e.g. 16 times (which often
> > > happens in prefetching), there would be no way to make the
> > > profile consistent; to avoid this problem, we upscale the frequencies
> > > of the unrolled loop to make it appear to roll at least 5 times (the
> > > constant 5 being just an arbitrary choice).
> > I see, both those changes should not generally make us upscale over
> > original maximal frequency of the loop itself, right?
> in the second case, yes (with some care, as you describe below). If the
> profile is inconsistent, this does not have to be true in the first
> case, though.
Hmm, I guess we can make scale_bb_frequencies first downscale the
fraction representation so its own multiplications don't overflow and
additionally forcibly reduce all frequencies over FREQ_MAX to
FREQ_MAX so we stay in the safe range (I think the verification is also a
good idea, but if you won't do that, I will just drop it into TODO).
One of the main points in keeping frequencies within such a limited range
was to let people care less about overflows. I don't want to risk that
we will eventually discover that some code computing the third power of
a frequency overflows and start inserting overflow checks in random
places in the compiler.
Can you think of reasonable cases where the above change would produce
significantly less realistic profiles than what you do right now?
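To make the proposal above concrete, here is a rough sketch of what
"downscale the fraction, then clamp" could look like for a single
frequency. It is not the actual GCC code: the helper name
scale_freq_clamped, the value chosen for FREQ_MAX, and the
halving strategy are all assumptions made for illustration.

```c
/* Assumed upper bound for a block frequency; the real constant and
   its value live in the compiler sources.  */
#define FREQ_MAX 10000

/* Hypothetical helper: scale FREQ (assumed to be in [0, FREQ_MAX]) by
   NUM/DEN (NUM >= 0, DEN > 0) and saturate the result at FREQ_MAX.  */
static int
scale_freq_clamped (int freq, int num, int den)
{
  /* Downscale the fraction until NUM <= FREQ_MAX.  After this,
     freq * num is at most FREQ_MAX * FREQ_MAX, which fits in a
     32-bit int, so the multiplication below cannot overflow.
     This trades a little precision for safety.  */
  while (num > FREQ_MAX)
    {
      num /= 2;
      den /= 2;
    }

  /* DEN vanished while NUM is still large: the ratio is huge, so any
     nonzero frequency saturates.  */
  if (den == 0)
    return freq ? FREQ_MAX : 0;

  /* Rounded division, then force the result into the safe range.  */
  int scaled = (freq * num + den / 2) / den;
  return scaled > FREQ_MAX ? FREQ_MAX : scaled;
}
```

With these assumptions, scaling 8000 by 2/1 yields FREQ_MAX rather
than 16000, so later code can keep relying on the limited range.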
> > The second one can be dangerous if the loop was predicted to iterate for
> > whatever reason just 2 times and we upscaled unrolled version to 5,
> > multiplying the maximal frequency by over a factor of 2.
> > Can the code be changed to watch the maximal frequency within a loop or
> > possibly just upscale the unrolled loop to iterate the same number of times
> > as the original loop?
> > Honza
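The second suggestion in the quoted question (never upscale the unrolled
loop beyond the iteration count of the original loop) can be sketched as
follows. The function name unrolled_target_iterations and the constant
MIN_UNROLLED_ITERS are hypothetical; only the value 5 comes from the
thread, where it is described as an arbitrary choice.

```c
/* The "at least 5 iterations" floor mentioned in the thread.  */
#define MIN_UNROLLED_ITERS 5

/* Hypothetical helper: choose how many iterations the unrolled loop
   should appear to make, given the estimated iteration count of the
   original loop and the unroll factor (assumed > 0).  */
static int
unrolled_target_iterations (int orig_iters, int unroll_factor)
{
  /* What the profile would say with no upscaling at all.  */
  int natural = orig_iters / unroll_factor;

  /* Cap the floor at the original iteration count, so the body of
     the unrolled loop never looks hotter than the original body.  */
  int floor_iters = orig_iters < MIN_UNROLLED_ITERS
                    ? orig_iters : MIN_UNROLLED_ITERS;

  return natural > floor_iters ? natural : floor_iters;
}
```

Under this sketch a loop estimated at 10 iterations and unrolled 16
times is still shown as rolling 5 times, while a loop predicted to
iterate only 2 times stays at 2 instead of being raised to 5.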