This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch ifcvt costs 0/3] Introduce a new target hook for ifcvt costs.


On Tue, Sep 29, 2015 at 11:16:37AM +0100, Richard Biener wrote:
> On Fri, Sep 25, 2015 at 5:04 PM, James Greenhalgh
> <james.greenhalgh@arm.com> wrote:
> > Hi,
> >
> > In relation to the patch I put up for review a few weeks ago to teach
> > RTL if-convert to handle multiple sets in a basic block [1], I was
> > asking about a sensible cost model to use. There was some consensus at
> > Cauldron that what should be done in this situation is to introduce a
> > target hook that delegates answering the question to the target.
> 
> Err - the consensus was to _not_ add gazillion of special target hooks
> but instead enhance what we have with rtx_cost so that passes can
> rely on comparing before and after costs of a sequence of insns.

Ah, I was not able to attend Cauldron this year, so I was trying to pick out
"consensus" from the video. Rewatching it now, I see a better phrase would
be "suggestion with some support".

Watching the video a second time, it seems your proposal is that we improve
the RTX costs infrastructure to handle sequences of Gimple/RTX. That would
get us some way to making a smart decision in if-convert, but I'm not
convinced it allows us to answer the question we are interested in.

We have the rtx for before and after, and we can generate costs for these
sequences. This allows us to calculate some weighted cost of the
instructions based on the calculated probabilities that each block is
executed. However, we are missing information on how expensive the branch
is, and we have no way to get that through an RTX-costs infrastructure.

We could add a hook to give a cost in COSTS_N_INSNS units to a branch based
on its predictability. This is difficult as COSTS_N_INSNS units can differ
depending on whether you are talking about floating-point or integer code.
By this I mean, the compiler considers a SET which costs more than
COSTS_N_INSNS (1) to be "expensive". Consequently, some targets set the cost
of both an integer SET and a floating-point SET to both be COSTS_N_INSNS (1).
In reality, these instructions may have different latency performance
characteristics. What real world quantity are we trying to invoke when we
say a branch costs the same as 3 SET instructions of any type? It certainly
isn't mispredict penalty (likely measured in cycles, not relative to the cost
of a SET instruction, which may well be completely free on modern x86
processors), nor is it the cost of executing the branch instruction which
is often constant to resolve regardless of predicted/mispredicted status.

On the other side of the equation, we want a cost for the converted
sequence. We can build a cost of the generated rtl sequence, but for
targets like AArch64 this is going to be wildly off. AArch64 will expand
(a > b) ? x : y; as a set to the CC register, followed by a conditional
move based on the CC register. Consequently, where we have multiple sets
back to back we end up with:

  set CC (a > b)
  set x1 (CC ? x : y)
  set CC (a > b)
  set x2 (CC ? x : z)
  set CC (a > b)
  set x3 (CC ? x : k)

Which we know will be simplified later to:

  set CC (a > b)
  set x1 (CC ? x : y)
  set x2 (CC ? x : z)
  set x3 (CC ? x : k)

I imagine other targets have something similar in their expansion of
mov<mode>cc (though I haven't looked).

Our comparison for if-conversion then must be:

  weighted_old_cost = (taken_probability * (then_bb_cost)
			- (1 - taken_probability) * (else_bb_cost));
  branch_cost = branch_cost_in_insns (taken_probability)
  weighted_new_cost = redundancy_factor (new_sequence) * seq_cost (new_sequence)

  profitable = weighted_new_cost <= weighted_old_cost + branch_cost

And we must define:

  branch_cost_in_insns (taken_probability)
  redundancy_factor (new_sequence)

At that point, I feel you are better giving the entire sequence to the
target and asking it to implement whatever logic is needed to return a
profitable/unprofitable analysis of the transformation.

The "redundancy_factor" in particular is pretty tough to define in a way
which makes sense outside of if_convert, without adding some pretty
detailed analysis to decide what might or might not be eliminated by
later passes. The alternative is to weight the other side of the equation
by tuning the cost of branch_cost_in_insns high. This only serves to increase
the disconnect between a real-world cost and a number to tweak to game
code generation.

If you have a different way of phrasing the if-conversion question that
avoids the two very specific hooks, I'd be happy to try taking the patches
in that direction. I don't see a way to implement this as just queries to
a costing function which does not need substantial target and pass
dependent tweaking to make behave correctly.

Thanks,
James

> > This patch series introduces that new target hook to provide cost
> > decisions for the RTL ifcvt pass.
> >
> > The idea is to give the target full visibility of the proposed
> > transformation, and allow it to respond as to whether if-conversion in that
> > way is profitable.
> >
> > In order to preserve current behaviour across targets, we will need the
> > default implementation to keep to the strategy of simply comparing branch
> > cost against a magic number. Patch 1/3 performs this refactoring, which is
> > a bit hairy in some corner cases.
> >
> > Patch 2/3 is a simple code move, pulling the definition of the if_info
> > structure used by RTL if-convert in to ifcvt.h where it can be included
> > by targets.
> >
> > Patch 3/3 then introduces the new target hook, with the same default
> > behaviour as was previously in noce_is_profitable_p.
> >
> > The series has been bootstrapped on ARM, AArch64 and x86_64 targets, and
> > I've verified with Spec2000 and Spec2006 runs that there are no code
> > generation differences for any of these three targets after the patch.
> >
> > I also gave ultrasparc3 a quick go, from what I could see, I changed the
> > register allocation for the floating-point condition code registers.
> > Presumably this is a side effect of first constructing RTXen that I then
> > discard. I didn't see anything which looked like more frequent reloads or
> > substantial code generation changes, though I'm not familiar with the
> > intricacies of the Sparc condition registers :).
> >
> > I've included a patch 4/3, to give an example of what a target might want
> > to do with this hook. It needs work for tuning and deciding how the function
> > should actually behave, but works if it is thought of as more of a
> > strawman/prototype than a patch submission.
> >
> > Are parts 1, 2 and 3 OK?
> >
> > Thanks,
> > James
> >
> > [1]: https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00781.html
> >
> > ---
> > [Patch ifcvt 1/3] Factor out cost calculations from noce cases
> >
> > 2015-09-26  James Greenhalgh  <james.greenhalgh@arm.com>
> >
> >         * ifcvt.c (noce_if_info): Add a magic_number field :-(.
> >         (noce_is_profitable_p): New.
> >         (noce_try_store_flag_constants): Move cost calculation
> >         to after sequence generation, factor it out to noce_is_profitable_p.
> >         (noce_try_addcc): Likewise.
> >         (noce_try_store_flag_mask): Likewise.
> >         (noce_try_cmove): Likewise.
> >         (noce_try_cmove_arith): Likewise.
> >         (noce_try_sign_mask): Add comment regarding cost calculations.
> >
> > [Patch ifcvt 2/3] Move noce_if_info in to ifcvt.h
> >
> > 2015-09-26  James Greenhalgh  <james.greenhalgh@arm.com>
> >
> >         * ifcvt.c (noce_if_info): Move to...
> >         * ifcvt.h (noce_if_info): ...Here.
> >
> > [Patch ifcvt 3/3] Create a new target hook for deciding profitability
> >     of noce if-conversion
> >
> > 2015-09-26  James Greenhalgh  <james.greenhalgh@arm.com>
> >
> >         * target.def (costs): New hook vector.
> >         (ifcvt_noce_profitable_p): New hook.
> >         * doc/tm.texi.in: Document it.
> >         * doc/tm.texi: Regenerate.
> >         * targhooks.h (default_ifcvt_noce_profitable_p): New.
> >         * targhooks.c (default_ifcvt_noce_profitable_p): New.
> >         * ifcvt.c (noce_profitable_p): Use new target hook.
> >
> > [Patch Prototype AArch64 ifcvt 4/3] Wire up the new if-convert costs
> >     hook for AArch64
> >
> > 2015-09-26  James Greenhalgh  <james.greenhalgh@arm.com>
> >
> >         * config/aarch64/aarch64.c
> >         (aarch64_additional_branch_cost_for_probability): New.
> >         (aarch64_ifcvt_noce_profitable_p): Likewise.
> >         (TARGET_COSTS_IFCVT_NOCE_PROFITABLE_P): Likewise.
> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]