[RFC: Patch 1/6 v2] New target hook: max_noce_ifcvt_seq_cost

Bernhard Reutner-Fischer rep.dot.nop@gmail.com
Tue Jun 21 21:31:00 GMT 2016

On June 21, 2016 5:50:26 PM GMT+02:00, James Greenhalgh <james.greenhalgh@arm.com> wrote:
>On Fri, Jun 03, 2016 at 12:39:42PM +0200, Richard Biener wrote:
>> On Thu, Jun 2, 2016 at 6:53 PM, James Greenhalgh
>> <james.greenhalgh@arm.com> wrote:
>> >
>> > Hi,
>> >
>> > This patch introduces a new target hook, to be used like
>> > with a guaranteed unit of measurement. We want this to break away
>> > the current ambiguous uses of BRANCH_COST.
>> >
>> > BRANCH_COST is used in ifcvt.c in two types of comparisons. One
>> > instruction counts - where it is used as the limit on the number of
>> > instructions we are permitted to generate. The other (after
>> > by COSTS_N_INSNS (1)) directly against RTX costs.
>> >
>> > Of these, a comparison against RTX costs is the more easily
>> > metric across the compiler, and the one I've pulled out to the new
>> > To keep things consistent for targets which don't migrate, this new
>> > has a default value of BRANCH_COST * COSTS_N_INSNS (1).
>> >
>> > OK?
>> How does the caller compute "predictable"?  There are some archs
>> an information on whether this is a forward or backward jump is more
>> useful I guess.  Also at least for !speed_p the distance of the
>branch is
>> important given not all targets support arbitrary branch offsets.
>Just through a call to predictable_edge_p. It isn't perfect. My worry
>with adding more details of the branch is that you end up with a
>target implementation that tries way too hard to be clever. But, I
>mind passing the edge through to the target hook, that way a target has
>it if they want it. In this patch revision, I pass the edge through.
>> I remember that at the last Cauldron we discussed to change things to
>> compare costs of sequences of instructions rather than giving targets
>> context with just asking for single (sub-)insn rtx costs.
>I've made better use of seq_cost in this respin. Bernd was right,
>constructing dummy RTX just for costs, then discarding it, then
>constructing the actual RTX for matching doesn't make sense as a
>Better just to construct the real sequence and use the cost of that.
>In this patch revision, I started by removing the idea that this costs
>a branch at all. It doesn't, the use of this hook is really a target
>trying to limit if-convert to not end up pulling too much on to the
>unconditional path. It seems better to expose that limit directly by
>explicitly asking for the maximum cost of an unconditional sequence we
>would create, and comparing against seq_cost of the new RTL. This saves
>a target trying to figure out what is meant by a cost of a branch.
>Having done that, I think I can see a clearer path to getting the
>default hook implementation in shape. I've introduced two new params,
>which give maximum costs for the generated sequence (one for a
>branch, one for "unpredictable") in the speed_p cases. I'm not
>expecting it
>to be useful to give the user control in the case we are compiling for
>size - whether this is a size win or not is independent of whether the
>branch is predictable.
>For the default implementation, if the parameters are not set, I just
>multiply BRANCH_COST through by COSTS_N_INSNS (1) for size and
>COSTS_N_INSNS (3) for speed. I know this is not ideal, but I'm still
>of ideas on how best to form the default implementation.

How bad is it in e.g. CSiBE?

>still potentially going to introduce performance regressions for
>that don't provide an implementation of the new hook, or a default
>for the new parameters. It does mean we can keep the testsuite clean by
>setting parameter values suitably high for all targets that have
>conditional move instructions.
>The new default causes some changes in generated conditional move
>for x86_64. Whether these changes are for the better or not I can't
>This first patch introduces the two new parameters, and uses them in
>default implementation of the target hook.

s/precitable/predictable/ ?

Present tense in documentation (s/will try to/tries to/).
s/should return/returns/

TARGET_MAX_NOCE_IFCVT_SEQ_COST (bool @var{speed_p}, edge @var{e}) talks about predictable_p but doesn't document e.

+DEFPARAM (PARAM_MAX_RTL_IF_CONVERSION_UNPREDICTABLE_COST, +	 "max-rtl-if-conversion-unpredictable-cost", +	 "Maximum permissible cost for the sequence that would be " +	 "generated by the RTL if-conversion pass for a branch which " +	 "is considered predictable.", +	 40, 0, 200)


Present tense also in target.def.

+@code{predictable_p} is true

no predictable_p anymore but e missing in docs.

/Then multiply through by/s/through by/with/

>Bootstrapped on x86_64 and aarch64 with no issues.
>2016-06-21  James Greenhalgh  <james.greenhalgh@arm.com>
>	* target.def (max_noce_ifcvt_seq_cost): New.
>	* doc/tm.texi.in (TARGET_MAX_NOCE_IFCVT_SEQ_COST): Document it.
>	* doc/tm.texi: Regenerate.
>	* targhooks.h (default_max_noce_ifcvt_seq_cost): New.
>	* targhooks.c (default_max_noce_ifcvt_seq_cost): New.
>	* doc/invoke.texi: Document new params.

More information about the Gcc-patches mailing list