This is the mail archive of the
mailing list for the GCC project.
Re: rtx_cost of insns
- From: Oleg Endo <oleg dot endo at t-online dot de>
- To: Alan Modra <amodra at gmail dot com>
- Cc: Richard Earnshaw <Richard dot Earnshaw at foss dot arm dot com>, Jeff Law <law at redhat dot com>, gcc at gcc dot gnu dot org
- Date: Mon, 29 Jun 2015 17:10:39 +0900
- Subject: Re: rtx_cost of insns
- Authentication-results: sourceware.org; auth=none
- References: <20150622055710 dot GQ1723 at bubble dot grove dot modra dot org> <558A3AA9 dot 6090406 at redhat dot com> <20150624091846 dot GC1723 at bubble dot grove dot modra dot org> <558ADF04 dot 9040807 at redhat dot com> <558BF3F7 dot 6080702 at foss dot arm dot com> <20150629074639 dot GG1723 at bubble dot grove dot modra dot org>
On 29 Jun 2015, at 16:46, Alan Modra <firstname.lastname@example.org> wrote:
> On Thu, Jun 25, 2015 at 01:28:39PM +0100, Richard Earnshaw wrote:
>> Perhaps the best thing to do is to use the OUTER code to spot the
>> specific case where you've got a SET and return non-zero in that case.
> That's exactly the path I've been following. It's not as easy as it
> First, some backends call rtx_cost from their targetm.rtx_costs.
> ix86_rtx_costs for instance has this
> case PLUS:
>   if (val == 2 || val == 4 || val == 8)
>     {
>       *total = cost->lea;
>       *total += rtx_cost (XEXP (XEXP (x, 0), 1),
>                           outer_code, opno, speed);
>       *total += rtx_cost (XEXP (XEXP (XEXP (x, 0), 0), 0),
>                           outer_code, opno, speed);
>       *total += rtx_cost (XEXP (x, 1), outer_code, opno, speed);
>       return true;
>     }
> which, when using a non-zero register move cost, results in
> Successfully matched this instruction:
> (set (reg:DI 198 [ D.74663 ])
>     (plus:DI (plus:DI (reg/v/f:DI 172 [ use_entry ])
>             (reg:DI 196 [ D.74662 ]))
>         (const_int -32 [0xffffffffffffffe0])))
> rejecting combination of insns 179 and 180
> original costs 6 + 4 = 10
> replacement cost 15
> So here the x86 backend is calculating the cost of an lea, plus the
> cost of (reg:DI 196), plus the cost of (reg/v/f:DI 172), plus the cost
> of (const_int -32). outer_code is SET. That means we add two
> register moves, increasing the overall cost from 7 to 15.
> The second problem I've hit is that fwprop.c:should_replace_address
> has this:
>   /* If the addresses have equivalent cost, prefer the new address
>      if it has the highest `set_src_cost'.  That has the potential of
>      eliminating the most insns without additional costs, and it
>      is the same that cse.c used to do.  */
>   if (gain == 0)
>     gain = (set_src_cost (new_rtx, VOIDmode, speed)
>             - set_src_cost (old_rtx, VOIDmode, speed));
>   return (gain > 0);
> If register moves have the same cost as adding a small constant to a
> register, then this code no longer replaces a pseudo with its value as
> an offset from a base. I think this particular problem can be fixed
> quite simply by "return gain >= 0;", but really, this code, like the
> x86 code, is expecting the cost of a register move to be zero.
> You'll notice that these example problems are not trying to cost a
> whole instruction. In both cases they want the cost of just a piece
> of an instruction, but rtx_cost is called in a way that is
> indistinguishable from other code that calls rtx_cost on whole
> register move instructions.
> The real difficulty is in separating out the whole insn cases from the
> partial insn cases.
> Note that we already have insn_rtx_cost, and it returns a minimum cost
> for a SET, so register move insns get a cost of 1 insn. However,
> despite insn_rtx_cost starting life in combine.c, even combine doesn't
> use it in all whole insn cases. :-(
Quite often, more complex (combine) insns have to be matched manually using C/C++ code in order to implement the costs function. To avoid that, maybe we could have target-independent insn attributes that carry the costs? That would be much easier/faster (at least) for combine to look up, and it would also be easier to maintain in the backend.
It's also possible to implement that in a target-specific way: in the costs function, construct a temporary fake insn, run recog on it, and look up the attribute. However, this will pointlessly invoke recog twice, because by the time combine asks for the insn costs it has already invoked recog.