This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Model of branch penalties
- To: Jan Hubicka <jh at suse dot cz>
- Subject: Model of branch penalties
- From: Jan Hubicka <jh at suse dot cz>
- Date: Sat, 28 Jul 2001 20:47:57 +0200
- Cc: Richard Henderson <rth at redhat dot com>, gcc-patches at gcc dot gnu dot org, patches at x86-64 dot org, gcc at gcc dot gnu dot org
- References: <20010727155610.F19763@atrey.karlin.mff.cuni.cz> <20010727175530.D29195@redhat.com> <20010728203359.E25604@atrey.karlin.mff.cuni.cz>
> > It's not handled properly at the moment, but some processors
> > (e.g. Alpha EV6) suffer a several cycle penalty if the target
> > of an indirect jump (e.g. switch statement) is not 16-byte aligned.
> Yes, I was wondering about this case.
> With current target macros we don't have the information.
>
> If you do have idea for good name for such macro, I can add it in
> followup patch.
This opens another issue I was wondered about - how to model properly
the branch costs.
Currently we do have BRANCH_COST with more or less undefined meaning.
For ifcvt, BB duplication, BB reordering and few other passes it should
be worthwhile to use better model.
The paper for BB reordering I was reading used an traveling salesman
solver to find proper ordering of basic blocks. The costs between cities
were penalties of compensation code. They've used alpha machine and
basically took into account the cost of branch instruction itself,
the misfetch penalty (this cost has been accounted on each taken branch)
and misspredict penalty (they've used two different models and dynamic
predicting).
I am not sure if I want to jump into using pseudo-randomized TSP solver
in the gcc, but I can use simple greedy solver on these costs
(such method is reported in different paper to still have significantly
better results than our current one).
Even nowday i386 chips seems to be dificult to describe by the values
described above. Do you have idea what kind of values (and semantics)
we want to use?
Do you have some good examples of obscurities we want to handle?
(one of such is probably the problem many chips have with jumps on
the same cache line)
Honza