This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH GCC8][29/33]New register pressure estimation
- From: "Bin.Cheng" <amker dot cheng at gmail dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 15 May 2017 16:50:36 +0100
- Subject: Re: [PATCH GCC8][29/33]New register pressure estimation
- References: <VI1PR0802MB21762496F9B6DE1526FFE97EE7190@VI1PR0802MB2176.eurprd08.prod.outlook.com> <CAFiYyc0RHp6Z8dm+N4ruXqS5BtXDK_Z8TXFoYyqA7E1pUPzNKw@mail.gmail.com>
On Thu, May 11, 2017 at 11:39 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Apr 18, 2017 at 12:53 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>> Hi,
>> Currently IVOPTs shares the same register pressure computation with the RTL loop invariant pass,
>> which doesn't work very well. This patch introduces a specific interface for IVOPTs.
>> The general idea is described in the cover message as below:
>> C) The current implementation shares the same register pressure computation with the RTL loop
>> inv pass. It has difficulty handling loop nests (especially large ones), and quite
>> often generates too many candidates (especially for outer loops). This change
>> introduces a new register pressure computation. The brief idea is to differentiate
>> the (hot) innermost loop from outer loops. For a (possibly hot) innermost loop, more registers
>> are allowed as long as the register pressure stays within the number of registers
>> available on the target.
>> It can also help to restrict the number of candidates for outer loops.
>> Is it OK?
>
> +/* Determine if current loop is the innermost loop and maybe hot. */
> +
> +static void
> +determine_hot_innermost_loop (struct ivopts_data *data)
> +{
> + data->hot_innermost_loop_p = true;
> + if (!data->speed)
> + return;
>
> err, so when not optimizing for speed we assume all loops (even not innermost)
> are hot and innermost?!
>
> + HOST_WIDE_INT niter = avg_loop_niter (loop);
> + if (niter < PARAM_VALUE (PARAM_AVG_LOOP_NITER)
> + || loop_constraint_set_p (loop, LOOP_C_PROLOG)
> + || loop_constraint_set_p (loop, LOOP_C_EPILOG)
> + || loop_constraint_set_p (loop, LOOP_C_VERSION))
> + data->hot_innermost_loop_p = false;
>
> this needs adjustment for the constraint patch removal. Also looking at niter
> of the loop in question isn't a good metric for hotness. data->speed is the
> best guess you get I think (optimize_loop_for_speed_p).
>
> data->speed = optimize_loop_for_speed_p (loop);
> + determine_hot_innermost_loop (data);
>
> data->hot_innermost_loop_p = determine_hot_innermost_loop (data);
>
> would be more consistent here.
Hi,
I removed the hot innermost part and here is the updated version. Is it OK?
Thanks,
bin
2017-05-11 Bin Cheng <bin.cheng@arm.com>
* tree-ssa-loop-ivopts.c (ivopts_estimate_reg_pressure): New
reg_pressure model function.
(ivopts_global_cost_for_size): Delete.
(determine_set_costs, iv_ca_recount_cost): Call new model function
ivopts_estimate_reg_pressure.
>
> Thanks,
> Richard.
>
>> Thanks,
>> bin
>> 2017-04-11 Bin Cheng <bin.cheng@arm.com>
>>
>> * tree-ssa-loop-ivopts.c (struct ivopts_data): New field.
>> (ivopts_estimate_reg_pressure): New reg_pressure model function.
>> (ivopts_global_cost_for_size): Delete.
>> (determine_set_costs, iv_ca_recount_cost): Call new model function
>> ivopts_estimate_reg_pressure.
>> (determine_hot_innermost_loop): New.
>> (tree_ssa_iv_optimize_loop): Call above function.
From 3ca5cb6bafb516c68cad9d6fd3adbbe73bec4d19 Mon Sep 17 00:00:00 2001
From: Bin Cheng <binche01@e108451-lin.cambridge.arm.com>
Date: Fri, 10 Mar 2017 11:03:16 +0000
Subject: [PATCH 1/9] ivopt-reg_pressure-model-20170225.txt
---
gcc/tree-ssa-loop-ivopts.c | 49 ++++++++++++++++++++++++++++++++++++----------
1 file changed, 39 insertions(+), 10 deletions(-)
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 8b228ca..7caed10 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -5531,17 +5531,46 @@ determine_iv_costs (struct ivopts_data *data)
fprintf (dump_file, "\n");
}
-/* Calculates cost for having N_REGS registers. This number includes
-   induction variables, invariant variables and invariant expressions. */
+/* Estimate register pressure for loop having N_INVS invariants and N_CANDS
+   induction variables. Note N_INVS includes both invariant variables and
+   invariant expressions. */
 static unsigned
-ivopts_global_cost_for_size (struct ivopts_data *data, unsigned n_regs)
+ivopts_estimate_reg_pressure (struct ivopts_data *data, unsigned n_invs,
+                              unsigned n_cands)
 {
-  unsigned cost = estimate_reg_pressure_cost (n_regs,
-                                              data->regs_used, data->speed,
-                                              data->body_includes_call);
-  /* Add n_regs to the cost, so that we prefer eliminating ivs if possible. */
-  return n_regs + cost;
+  unsigned cost;
+  unsigned n_old = data->regs_used, n_new = n_invs + n_cands;
+  unsigned regs_needed = n_new + n_old, available_regs = target_avail_regs;
+  bool speed = data->speed;
+
+  /* If there is a call in the loop body, the call-clobbered registers
+     are not available for loop invariants. */
+  if (data->body_includes_call)
+    available_regs = available_regs - target_clobbered_regs;
+
+  /* If we have enough registers. */
+  if (regs_needed + target_res_regs < available_regs)
+    cost = n_new;
+  /* If close to running out of registers, try to preserve them. */
+  else if (regs_needed <= available_regs)
+    cost = target_reg_cost [speed] * regs_needed;
+  /* If we run out of available registers but the number of candidates
+     does not, we penalize extra registers using target_spill_cost. */
+  else if (n_cands <= available_regs)
+    cost = target_reg_cost [speed] * available_regs
+           + target_spill_cost [speed] * (regs_needed - available_regs);
+  /* If the number of candidates runs out available registers, we penalize
+     extra candidate registers using target_spill_cost * 2. Because it is
+     more expensive to spill induction variable than invariant. */
+  else
+    cost = target_reg_cost [speed] * available_regs
+           + target_spill_cost [speed] * (n_cands - available_regs) * 2
+           + target_spill_cost [speed] * (regs_needed - n_cands);
+
+  /* Finally, add the number of candidates, so that we prefer eliminating
+     induction variables if possible. */
+  return cost + n_cands;
 }
/* For each size of the induction variable set determine the penalty. */
@@ -5602,7 +5631,7 @@ determine_set_costs (struct ivopts_data *data)
fprintf (dump_file, " ivs\tcost\n");
for (j = 0; j <= 2 * target_avail_regs; j++)
fprintf (dump_file, " %d\t%d\n", j,
- ivopts_global_cost_for_size (data, j));
+ ivopts_estimate_reg_pressure (data, 0, j));
fprintf (dump_file, "\n");
}
}
@@ -5661,7 +5690,7 @@ iv_ca_recount_cost (struct ivopts_data *data, struct iv_ca *ivs)
comp_cost cost = ivs->cand_use_cost;
cost += ivs->cand_cost;
- cost += ivopts_global_cost_for_size (data, ivs->n_invs + ivs->n_cands);
+ cost += ivopts_estimate_reg_pressure (data, ivs->n_invs, ivs->n_cands);
ivs->cost = cost;
}
--
1.9.1
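
For readers who want to experiment with the four-tier cost model outside of GCC, here is a minimal standalone sketch of the logic in ivopts_estimate_reg_pressure from the patch above. It is an illustration only, not the code as committed: struct target_params, its field names, and the function name estimate_reg_pressure are hypothetical stand-ins for GCC's target_avail_regs, target_clobbered_regs, target_res_regs, target_reg_cost[speed] and target_spill_cost[speed], and the assert guarding the subtraction is an addition of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the per-target values the patch reads from
   GCC globals; the names and this struct are assumptions of the sketch.  */
struct target_params
{
  unsigned avail_regs;      /* registers available to the loop */
  unsigned clobbered_regs;  /* call-clobbered registers */
  unsigned res_regs;        /* registers kept in reserve */
  unsigned reg_cost;        /* cost of using one more register */
  unsigned spill_cost;      /* cost of spilling one register */
};

/* Mirror of the tiers in the patch: cheap while registers are plentiful,
   reg_cost when close to the limit, spill_cost once over the limit, and
   a doubled spill_cost for candidates that themselves exceed the limit.  */
static unsigned
estimate_reg_pressure (const struct target_params *t, unsigned regs_used,
                       bool body_includes_call, unsigned n_invs,
                       unsigned n_cands)
{
  unsigned n_new = n_invs + n_cands;
  unsigned regs_needed = n_new + regs_used;
  unsigned available = t->avail_regs;
  unsigned cost;

  if (body_includes_call)
    {
      /* Guard added by this sketch: avoid unsigned wrap-around.  */
      assert (t->clobbered_regs <= available);
      available -= t->clobbered_regs;
    }

  if (regs_needed + t->res_regs < available)
    cost = n_new;
  else if (regs_needed <= available)
    cost = t->reg_cost * regs_needed;
  else if (n_cands <= available)
    cost = t->reg_cost * available
           + t->spill_cost * (regs_needed - available);
  else
    cost = t->reg_cost * available
           + t->spill_cost * (n_cands - available) * 2
           + t->spill_cost * (regs_needed - n_cands);

  /* Bias the result toward sets with fewer induction variables.  */
  return cost + n_cands;
}
```

With made-up parameters (10 available registers, 4 call-clobbered, 3 reserved, register cost 2, spill cost 8) and 2 registers already in use, a call-free loop with 1 invariant and 2 candidates costs 5, while 1 invariant and 12 candidates costs 88, which shows how sharply the last tier penalizes candidate-heavy sets.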