This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: [patch] RFC: Hook for insn costs (v2)?
- From: Georg-Johann Lay <avr at gjlay dot de>
- To: GCC Development <gcc at gcc dot gnu dot org>
- Date: Thu, 13 Jul 2017 15:23:48 +0200
- Subject: Re: [patch] RFC: Hook for insn costs (v2)?
- Authentication-results: sourceware.org; auth=none
- References: <f227448a-8499-775a-d9bb-df90c5fc124b@gjlay.de>
Hi, this is a bit different proposal. Instead of using some
magic value to indicate that the hook returns nothing useful,
the hook has not a 4th argument of bool* to ship that information.
The goal and usage is the same as with the first proposal:
* Allow the back-end to compute correct costs and to see the
full pattern. Reason: middle-end currently hides parts of
patterns that might be relevant and implements strange logic
in some places, e.g. in insn_rtx_costs.
* Don't introduce any performance degradation by changing
current usage / assumptions of rtx_costs.
Would such a change be in order in principle?
Ideas to improve it?
I would then round it up and propose it as a patch.
Johann
On 12.07.2017 15:15, Georg-Johann Lay wrote:
Hi,
the current cost computations in rtlanal.c and maybe other places
suffer from the fact that they are hiding parts of the expressions
from the back-end, like SET_DESTs of single_set or the anatomy of
PARALELLs.
Would it be in order to have a hook like the one attached?
I am aware of that, in an ideal world, there wouldn't be more
than one hook to get rtx costs. But well...
Whilst rtx_costs does in the majority of cases, there are cases
where hiding information leads to performance degradation,
for example when insn combine cooks up a set zero_extract.
combine.c does actually the right thing as it uses insn_rtx_costs,
but insn_rtx_cost is already a lie because it only uses SET_SRC
and digs into PARALELL without exposing the whole story.
The patch just uses a new targetm.insn_cost hook. If the
back-end doesn't come up with something useful, the classic
functions with rtx_costs for sub-rtxes are called.
The purpose is to allow a friendly transition and not no
raise any performance degradations which would be very likely
if we just called rtx_costs with outer_code = INSN.
If a back-end finds it useful to implement this hook and need
the whole story, they can do so. Otherwise, or if it is too
lazy to analyse a specific rtx, they can switch to the old
infrastructure.
Returning a magic value for "unknown" is just an implementation
detail; it could just as well be some bool* that would be set
to true or false depending on whether or not the computation
returned something useful or not.
The patch only touches seq_cost insn_rtx_cost of rtlanal.c.
Would something like this be in order, or is a new hook just
a complete no-go?
Index: target.def
===================================================================
--- target.def (revision 250090)
+++ target.def (working copy)
@@ -3558,6 +3558,41 @@ DEFHOOKPOD
appropriately.",
unsigned int, INVALID_REGNUM)
+/* Compute the cost of an insn resp. something that might become an insn. */
+DEFHOOK
+(insn_costs,
+"This target hook describes the relative costs of an insn.\n\
+\n\
+@var{insn} might be NULL or an @code{INSN_P}.\n\
+@var{pattern} is the pattern of @var{insn} or an rtx that is supposed\n\
+to be used as the pattern of an insn the remainder of the compilation.\n\
+\n\
+In implementing this hook, you can use the construct\n\
+@code{COSTS_N_INSNS (@var{n})} to specify a cost equal to @var{n}\n\
+instructions.\n\
+When optimizing for execution speed, i.e.@: when @var{speed} is\n\
+true, this target hook should be used to estimate the relative\n\
+execution cost of the pattern, and the size cost of the pattern if\n\
+@var{speed} is false.\n\
+\n\
+Use this pattern if a @code{SET_DEST} is needed to calculate the correct\n\
+costs or when you want to see the whole of a @code{PARALLEL} and not only\n\
+parts of it. Notice that for a @code{single_set} you might see a\n\
+@code{PARALLEL} @var{pattern} that contains a @code{SET} together with\n\
+@code{COBBER}s.\n\
+\n\
+If the hook computed the cost, set @var{*cost_computed} to true.\n\
+If the hook implementation choses not to compute the cost for some reason,\n\
+set @var{*cost_computed} to false. This directs the middle-end to use\n\
+other approaches to get the respective costs like calling\n\
+@code{TARGET_RTX_COSTS} for sub-rtxes of @var{pattern}.\n\
+\n\
+Don't implement this hook if it would always set @var{*cost_computed} to\n\
+false. Even if this hook handles all cases, you still need to implement\n\
+@code{TARGET_RTX_COSTS}.",
+ int, (const rtx_insn *insn, rtx pattern, bool speed, bool *cost_computed),
+ default_insn_costs)
+
/* Compute a (partial) cost for rtx X. Return true if the complete
cost has been computed, and false if subexpressions should be
scanned. In either case, *TOTAL contains the cost result. */
Index: targhooks.c
===================================================================
--- targhooks.c (revision 250090)
+++ targhooks.c (working copy)
@@ -2108,4 +2108,11 @@ default_excess_precision (enum excess_pr
return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
}
+int
+default_insn_costs (const rtx_insn*, rtx, bool, bool *cost_computed)
+{
+ * cost_computed = false;
+ return 0;
+}
+
#include "gt-targhooks.h"
Index: targhooks.h
===================================================================
--- targhooks.h (revision 250090)
+++ targhooks.h (working copy)
@@ -264,4 +264,6 @@ extern unsigned int default_min_arithmet
extern enum flt_eval_method
default_excess_precision (enum excess_precision_type ATTRIBUTE_UNUSED);
+extern int default_insn_costs (const rtx_insn*, rtx, bool, bool*);
+
#endif /* GCC_TARGHOOKS_H */
Index: rtlanal.c
===================================================================
--- rtlanal.c (revision 250090)
+++ rtlanal.c (working copy)
@@ -5262,6 +5262,11 @@ insn_rtx_cost (rtx pat, bool speed)
{
int i, cost;
rtx set;
+ bool cost_computed;
+
+ cost = targetm.insn_costs (NULL, pat, speed, &cost_computed);
+ if (cost_computed)
+ return cost;
/* Extract the single set rtx from the instruction pattern. We
can't use single_set since we only have the pattern. We also
@@ -5318,6 +5323,17 @@ seq_cost (const rtx_insn *seq, bool spee
for (; seq; seq = NEXT_INSN (seq))
{
+ if (INSN_P (seq))
+ {
+ bool icost_computed;
+ int icost = targetm.insn_costs (seq, PATTERN (seq), speed,
+ &icost_computed);
+ if (icost_computed)
+ {
+ cost += icost;
+ continue;
+ }
+ }
set = single_set (seq);
if (set)
cost += set_rtx_cost (set, speed);
Index: doc/tm.texi.in
===================================================================
--- doc/tm.texi.in (revision 250123)
+++ doc/tm.texi.in (working copy)
@@ -4790,6 +4790,8 @@ @samp{fold_range_test ()} is optimal. T
@hook TARGET_OPTAB_SUPPORTED_P
+@hook TARGET_INSN_COSTS
+
@hook TARGET_RTX_COSTS
@hook TARGET_ADDRESS_COST