This is the mail archive of the
mailing list for the GCC project.
Re: limit code expansion on ifcvt - bbro too?
> On Wed, Nov 20, 2002 at 06:34:29PM -0800, tm wrote:
> > The problem is that BBRO only seems applicable to processors with the
> > following characteristics:
> The problem is that you're only thinking of BBRO in its current
> form, in which it tries to optimize a straight-line fast path.
> It is a generic framework that can lay out blocks in *any* order.
> All that is required is that you come up with a metric to choose
> that order.
> One metric might apply to SH, which has relatively short "short"
> branches. You could arrange code to minimize the number of long
> branches even if that increased the number of short branches.
I hope Josef will send his implementation soon. It is not specialized for
short branches, but it naturally splits the function into regions of
different frequencies, which should shorten branches too.
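
To make the "metric" idea quoted above a bit more concrete, here is a
minimal sketch of a layout metric that counts short and long branches for
a given block order. It is not taken from GCC's bb-reorder code; the
function name, the block sizes, and the 64-byte short-branch range are all
assumptions chosen for illustration.

```python
def count_branches(layout, sizes, edges, short_range):
    """Return (n_short, n_long) for one block ordering.

    Hypothetical model: an edge needs no branch when its destination is
    the next block in the layout (fall-through); otherwise it costs one
    branch, classified by the byte distance it has to span.
    """
    # Byte offset at which each block starts in this layout.
    start = {}
    offset = 0
    for block in layout:
        start[block] = offset
        offset += sizes[block]
    position = {block: i for i, block in enumerate(layout)}

    n_short = n_long = 0
    for src, dst in edges:
        if position[dst] == position[src] + 1:
            continue  # fall-through, no branch emitted
        distance = abs(start[dst] - (start[src] + sizes[src]))
        if distance <= short_range:
            n_short += 1
        else:
            n_long += 1
    return n_short, n_long


# Made-up CFG: A either falls into a large, rarely used block or goes to B.
sizes = {"A": 8, "COLD": 200, "B": 8, "C": 8}
edges = [("A", "COLD"), ("A", "B"), ("B", "C")]

inline_cold = count_branches(["A", "COLD", "B", "C"], sizes, edges, 64)  # (0, 1)
cold_last = count_branches(["A", "B", "C", "COLD"], sizes, edges, 64)    # (1, 0)
```

In this toy case, moving the large block to the end trades one long
branch for one short branch, which is the kind of trade-off the quoted
text describes for SH.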
> > 1) High branch penalty
> > 2) Weak/no branch prediction logic
> (3) Improve Icache bandwidth usage.
(4) Improve decode bandwidth by limiting the number of taken branches.
Particularly important for K6, Athlon and, I believe, PPro too.
Even with the current algorithm the number of taken branches decreases.
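
As a rough illustration of point (4), the sketch below sums the execution
frequency of every edge that is not a fall-through in a given layout, as
a stand-in for "number of taken branches". The function name and the
frequencies are hypothetical, and the model ignores details such as
branch inversion.

```python
def taken_branch_weight(layout, edge_freq):
    """Sum the frequencies of edges that are not fall-throughs.

    Hypothetical model: an edge is a fall-through only when its
    destination is placed immediately after its source.
    """
    position = {block: i for i, block in enumerate(layout)}
    return sum(
        freq
        for (src, dst), freq in edge_freq.items()
        if position[dst] != position[src] + 1
    )


# Made-up profile: the error path runs once per 100 executions of A.
edge_freq = {
    ("A", "ERR"): 1,
    ("A", "B"): 99,
    ("B", "C"): 99,
    ("ERR", "C"): 1,
}

err_inline = taken_branch_weight(["A", "ERR", "B", "C"], edge_freq)  # 100
err_last = taken_branch_weight(["A", "B", "C", "ERR"], edge_freq)    # 2
```

With these assumed numbers, placing the rarely executed block at the end
leaves only the two rare edges as taken branches.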
> Nr 3 is particularly relevant for error reporting code which you
> can know for absolute certain will never be executed in a healthy
> run of the application.
> Depending on the application, this can make a fantastic difference.
> I heard a number like 20% improvement for Oracle when Sun
> implemented this (with profile feedback) in SunPRO.