This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC: Patch 0/6] Rewrite the noce-ifcvt cost models


On Thu, Jun 09, 2016 at 10:58:52AM -0600, Jeff Law wrote:
> On 06/02/2016 10:53 AM, James Greenhalgh wrote:
> >Hi,
> >
> >When I was working in the area last year, I promised to revisit the cost
> >model for noce if-conversion and see if I could improve the modeling. This
> >turned out to be more tricky than I expected.
> >
> >This patch set rewrites the cost model for noce if-conversion. The goal is
> >to rationalise the units used in the calculations away from BRANCH_COST,
> >which is only defined relative to itself.
> Right.  I think we all agreed that the key weakness of BRANCH_COST
> was that its meaning is only defined relative to itself.
> 
> What we want is a costing metric that would allow us to estimate the
> cost of different forms of the computation, which might include
> branches and which may include edge probabilty information.
> 
> >
> >If you're looking at that and thinking it doesn't sound much different from
> >our current call to BRANCH_COST, you're right. This isn't as large a
> >departure from the existing cost model as I had originally intended.
> Perhaps not as large of a change as you intended, but I think you're
> hitting the key issue with BRANCH_COST.
> 
> 
> >This act of making the cost models consistent will cause code generation
> >changes on a number of targets - most notably x86_64. On x86_64 the RTX
> >cost of a conditional move comes out at "20" - this is far higher than
> >COSTS_N_INSNS (BRANCH_COST) for the x86 targets, so they lose lots
> >of if-conversion. The easy fix for this would be to implement the new hook.
> >I measured the performance impact on Spec2000 as a smoke test, it didn't
> >seem to harm anything, and the result was a slight (< 3%) uplift on
> >Spec2000FP. I'm no expert on x86_64, so I haven't taken a closer look for
> >the reasons.
> I'd be comfortable with Uros guiding the implementation of the
> target hook for x86 so that we don't take a major step backward.

The trouble I'm having with all targets is noce_try_cmove_arith and the
testcases added as part of that patch set.

The current cost model for noce_try_cmove_arith doesn't take in to
consideration the cost of a conditional move at all, it just checks the
cost of each branch, takes the maximum of that, and compares it against
COSTS_N_INSNS (BRANCH_COST).

As we move to my patch set, I want to compute the cost of the whole
if-converted sequence, which is going to include the cost of a conditional
move. On x86_64, the total cost for a simple conditional move
is COSTS_N_INSNS (5). This alone would prevent if-conversion for most
x86_64 subtargets, once you add in the max cost of one of the two branches,
you guarantee that no x86_64 target will be converting through
noce_try_cmove_arith.

>From the test results I'm seeing, this holds true for other targets which
show regressions with the new cost models. Clearly my idea for the default
hook implementation is not going to fly. If more targets had implemented the
PARAM_MAX_RTL_IF_CONVERSION_INSNS hook, I could use that to guide the
default implementation, but that is currently x86_64 only. I could add a
multiplier against BRANCH_COST in the new hook, but then we'd be guaranteeing
that the "cheap" if-conversions, where we use STORE_FLAG_VALUE rather than
introducing a conditional move, always fired unless the target had a
branch_cost of 0 (this might not be a bad model actually...). Finding what
this multiplier should be will be tough, as it depends directly on the cost
of a conditional move, which targets don't expose easily, and which I don't
want to construct junk conditional moves just to find through rtx_cost.

I'll keep giving it some thought, and I'd appreciate any suggestions for the
default hook implementation.

I've respun the patch set around Bernd's suggestion, and now that we don't
check the cost model until after constructing the new sequence the patch set
looks much nicer. I'll wait a bit before sending the respin out in the hope
that I'll have a good idea for the default hook implementation to reduce
the number of performance changes I'll introduce.

Thanks,
James


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]