This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: x86 branch cost


Marek,
GCC's whole understanding of BRANCH_COST is quite poor, especially when dealing
with the i386 platform.  For instance, on the K6 all conditional move equivalents
(such as sbb and setcc sequences) are expensive, so you usually win by disabling
them with a BRANCH_COST of 0.  On the other hand, there are transformations (such
as converting abs to branchless code) that are very beneficial for the K6 because
integer arithmetic and shifting are fast.
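
To make the two kinds of transformation concrete, here is a minimal sketch in C
of what a branchless abs and a mask-based conditional-move equivalent compute.
The function names are made up for illustration, and the code shows only the
arithmetic idea, not the exact instruction sequences gcc emits.

```c
#include <assert.h>
#include <stdint.h>

/* Branching form: typically a compare plus a conditional jump. */
static int32_t abs_branchy(int32_t x)
{
    return x < 0 ? -x : x;
}

/* Branchless form using only shift, xor and subtract.
 * Assumes >> on a negative int32_t is an arithmetic shift (true for GCC);
 * the C standard leaves this implementation-defined. */
static int32_t abs_branchless(int32_t x)
{
    assert(x != INT32_MIN);      /* |INT32_MIN| does not fit in int32_t */
    int32_t mask = x >> 31;      /* 0 if x >= 0, -1 (all ones) if x < 0 */
    return (x ^ mask) - mask;    /* x < 0: (~x) + 1 == -x; else x unchanged */
}

/* Conditional-move equivalent built from a mask, in the spirit of a
 * setcc/sbb sequence: returns a when cond is nonzero, b otherwise. */
static int32_t select_branchless(int cond, int32_t a, int32_t b)
{
    int32_t mask = -(int32_t)(cond != 0);   /* 0 or all ones */
    return (a & mask) | (b & ~mask);
}
```

The abs variant needs only cheap integer operations, which is why it can pay
off on a CPU where the mask-building sequences used for general selects do not.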

I was thinking about a more sensible replacement for the BRANCH_COST macro,
but so far I don't have many ideas on how to do that, except for a mask of all
the branch optimizations performed by gcc.  That would give you fine-grained
control on the one hand, but would make it hard to add new optimizations on
the other.
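
As a rough illustration of the mask idea, a sketch in C might look like the
following.  All names here are hypothetical and none of them exist in gcc; the
K6 value simply encodes the remarks above (abs conversion on, setcc/sbb/cmov
style sequences off).

```c
#include <assert.h>

/* Hypothetical flags, one per branch-removing transformation.
 * None of these names exist in GCC. */
enum branch_opt {
    BRANCH_OPT_ABS   = 1u << 0,  /* abs -> shift/xor/sub */
    BRANCH_OPT_SETCC = 1u << 1,  /* store-flag (setcc) sequences */
    BRANCH_OPT_SBB   = 1u << 2,  /* sbb-based sequences */
    BRANCH_OPT_CMOV  = 1u << 3   /* conditional move instructions */
};

/* Illustrative per-CPU mask: only the abs conversion is enabled. */
static const unsigned k6_branch_opts = BRANCH_OPT_ABS;

/* Returns nonzero when the given transformation is enabled in mask. */
static int branch_opt_enabled(unsigned mask, enum branch_opt opt)
{
    assert(opt != 0);            /* a flag must name at least one bit */
    return (mask & opt) != 0;
}
```

The drawback mentioned above shows up directly: every new optimization needs a
new flag, and every CPU description has to be revisited to decide its value.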

My decision for now is to wait until the gcc code for branch optimizations
stabilizes somewhat and then start thinking about a solution.

Of course, all ideas are welcome.

Pgcc has a BRANCH_COST with extra parameters specifying direction
(forward/backward) and taken/not taken.  On many CPUs the cost of a branch
depends on those parameters.
Perhaps with our new static branch prediction code we can update the BRANCH_COST
macro to take the probability and direction.  Perhaps this can help to
improve the situation somewhat, but the poorly defined values of
BRANCH_COST remain IMO the main problem.
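
One way such a probability- and direction-aware cost could be defined is
sketched below in C.  This is purely an assumed model: the function name, the
per-mille scale, the "backward taken, forward not taken" predictor and the
formula are illustrative and are not Pgcc's or gcc's actual interface.

```c
#include <assert.h>

/* Hypothetical cost model; the names and the formula are assumptions,
 * not Pgcc's or GCC's actual BRANCH_COST. */
enum branch_dir { BRANCH_FORWARD, BRANCH_BACKWARD };

/* prob_taken_pm: probability the branch is taken, in per mille (0..1000).
 * base_cost / mispredict_penalty: abstract cost units for the target CPU. */
static int expected_branch_cost(int prob_taken_pm, enum branch_dir dir,
                                int base_cost, int mispredict_penalty)
{
    assert(prob_taken_pm >= 0 && prob_taken_pm <= 1000);
    /* Assumed static predictor: backward taken, forward not taken. */
    int mispredict_pm = (dir == BRANCH_BACKWARD) ? 1000 - prob_taken_pm
                                                 : prob_taken_pm;
    /* Expected cost, rounded to the nearest unit. */
    return base_cost + (mispredict_penalty * mispredict_pm + 500) / 1000;
}
```

With a base cost of 1 and a penalty of 10, a backward branch taken 90% of the
time costs about 2 units, while a forward branch with the same probability
costs about 10, which is the kind of distinction a single constant cannot make.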

And yes, especially for the PPro platform I guess it is a good idea to increase
BRANCH_COST somewhat.  Mispredicted branches are very expensive.
But some benchmarking will probably be necessary.  Note that simple benchmark
apps aren't a good measure for this code, since they usually contain only
well-predictable branches.  Some more complex real-world applications
need to be used to show the real ratio of code size increase versus
branch mispredicts.

The last issue is dealing with code size with -Os.  I guess this is best
approximated by generic code that avoids optimizations that usually
increase code size.  The i386 backend will need changes to the cmove expander
to do the same trick.
Honza
