[RFC: Patch 1/6 v2] New target hook: max_noce_ifcvt_seq_cost

Wed Jul 13 21:16:00 GMT 2016

On 06/21/2016 09:50 AM, James Greenhalgh wrote:
>
> On Fri, Jun 03, 2016 at 12:39:42PM +0200, Richard Biener wrote:
>> On Thu, Jun 2, 2016 at 6:53 PM, James Greenhalgh
>> <james.greenhalgh@arm.com> wrote:
>>>
>>> Hi,
>>>
>>> This patch introduces a new target hook, to be used like BRANCH_COST but
>>> with a guaranteed unit of measurement. We want this to break away from
>>> the current ambiguous uses of BRANCH_COST.
>>>
>>> BRANCH_COST is used in ifcvt.c in two types of comparisons. One against
>>> instruction counts - where it is used as the limit on the number of new
>>> instructions we are permitted to generate. The other (after multiplying
>>> by COSTS_N_INSNS (1)) directly against RTX costs.
>>>
>>> Of these, a comparison against RTX costs is the more easily understood
>>> metric across the compiler, and the one I've pulled out to the new hook.
>>> To keep things consistent for targets which don't migrate, this new hook
>>> has a default value of BRANCH_COST * COSTS_N_INSNS (1).
>>>
>>> OK?
>>
>> How does the caller compute "predictable"?  There are some archs where
>> an information on whether this is a forward or backward jump is more
>> useful I guess.  Also at least for !speed_p the distance of the branch is
>> important given not all targets support arbitrary branch offsets.
>
> Just through a call to predictable_edge_p. It isn't perfect. My worry
> with adding more details of the branch is that you end up with a nonsense
> target implementation that tries way too hard to be clever. But, I don't
> mind passing the edge through to the target hook, that way a target has
> it if they want it. In this patch revision, I pass the edge through.
There are so many things that can factor into this decision.  But I 
suspect we get most of the benefit from a small amount of work (ie, 
using the prediction information we've already generated).  If the 
target wants to override based on other factors, there's a mechanism for 
that, but I don't think it's likely to be heavily, if at all, used.

> In this patch revision, I started by removing the idea that this costs
> a branch at all. It doesn't, the use of this hook is really a target
> trying to limit if-convert to not end up pulling too much on to the
> unconditional path. It seems better to expose that limit directly by
> explicitly asking for the maximum cost of an unconditional sequence we
> would create, and comparing against seq_cost of the new RTL. This saves
> a target trying to figure out what is meant by a cost of a branch.
Seems sensible.  Essentially you're just asking a different but related 
(and more direct) question that side-steps the need to be able to 
compare branch cost with costs of other insns.

A target maintainer may have a sense of how branch cost relates to 
normal insns on their target and they may choose to define the hooks in 
those terms (though I would discourage that as it re-introduces the 
precise problem we're trying to get around).

>
> Having done that, I think I can see a clearer path to getting the
> default hook implementation in shape. I've introduced two new params,
> which give maximum costs for the generated sequence (one for a "predictable"
> branch, one for "unpredictable") in the speed_p cases. I'm not expecting it
> to be useful to give the user control in the case we are compiling for
> size - whether this is a size win or not is independent of whether the
> branch is predictable.
Maybe not for MIPS :-)  But I think that should be tabled.

>
> For the default implementation, if the parameters are not set, I just
> multiply BRANCH_COST through by COSTS_N_INSNS (1) for size and
> COSTS_N_INSNS (3) for speed. I know this is not ideal, but I'm still short
> of ideas on how best to form the default implementation. This means we're
> still potentially going to introduce performance regressions for targets
> that don't provide an implementation of the new hook, or a default value
> for the new parameters. It does mean we can keep the testsuite clean by
> setting parameter values suitably high for all targets that have
> conditional move instructions.
But I think that's an OK place to be -- I don't think it's sensible for 
you to have to figure that out for every target.  It's something the 
target maintainers ought to be able to guesstimate more accurately.  My 
only objection is conceptual based on mixing BRANCH_COST & 
COSTS_N_INSNS.  But there may simply not be another way to set the default.

>
> The new default causes some changes in generated conditional move sequences
> for x86_64. Whether these changes are for the better or not I can't say.
You might consider passing those along to Uros to get his opinion.

>
> This first patch introduces the two new parameters, and uses them in the
> default implementation of the target hook.
>
> Bootstrapped on x86_64 and aarch64 with no issues.
>
> OK?
>
> Thanks,
> James
>
> ---
> 2016-06-21  James Greenhalgh  <james.greenhalgh@arm.com>
>
> 	* target.def (max_noce_ifcvt_seq_cost): New.
> 	* doc/tm.texi.in (TARGET_MAX_NOCE_IFCVT_SEQ_COST): Document it.
> 	* doc/tm.texi: Regenerate.
> 	* targhooks.h (default_max_noce_ifcvt_seq_cost): New.
> 	* targhooks.c (default_max_noce_ifcvt_seq_cost): New.
> 	* params.def (PARAM_MAX_RTL_IF_CONVERSION_PREDICTABLE_COST): New.
> 	(PARAM_MAX_RTL_IF_CONVERSION_UNPREDICTABLE_COST): Likewise.
> 	* doc/invoke.texi: Document new params.
This is fine IMHO.

jeff