rtx_cost of insns

On Thu, Jun 25, 2015 at 01:28:39PM +0100, Richard Earnshaw wrote:
> Perhaps the best thing to do is to use the OUTER code to spot the
> specific case where you've got a SET and return non-zero in that case.

That's exactly the path I've been following.  It's not as easy as it
sounds.

First, some backends call rtx_cost from their targetm.rtx_costs.
ix86_rtx_costs for instance has this

    case PLUS:
	      if (val == 2 || val == 4 || val == 8)
		{
		  *total = cost->lea;
		  *total += rtx_cost (XEXP (XEXP (x, 0), 1),
				      outer_code, opno, speed);
		  *total += rtx_cost (XEXP (XEXP (XEXP (x, 0), 0), 0),
				      outer_code, opno, speed);
		  *total += rtx_cost (XEXP (x, 1), outer_code, opno, speed);
		  return true;
		}

which, when using a non-zero register move cost, results in

Successfully matched this instruction:
(set (reg:DI 198 [ D.74663 ])
    (plus:DI (plus:DI (reg/v/f:DI 172 [ use_entry ])
            (reg:DI 196 [ D.74662 ]))
        (const_int -32 [0xffffffffffffffe0])))
rejecting combination of insns 179 and 180
original costs 6 + 4 = 10
replacement cost 15

So here the x86 backend is calculating the cost of an lea, plus the
cost of (reg:DI 196), plus the cost of (reg/v/f:DI 172), plus the cost
of (const_int -32).  outer_code is SET.  That means we add two
register moves, increasing the overall cost from 7 to 15.
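
To make the arithmetic concrete, here is a toy model in C.  It is not
GCC code: toy_lea_cost and toy_combine_accepts are made-up names, the
base cost of 7 and the totals 10 and 15 come from the dump above, and
the register move cost of 4 is inferred from (15 - 7) / 2.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; these helpers are hypothetical and not part of GCC.  */

/* Cost of the lea pattern: a base cost plus one register cost per
   register operand, mirroring how the recursive rtx_cost calls add
   the cost of each operand.  */
static int
toy_lea_cost (int base_cost, int reg_cost, int n_reg_operands)
{
  assert (n_reg_operands >= 0);
  return base_cost + reg_cost * n_reg_operands;
}

/* Model of the combine decision in the dump: a replacement costing
   more than the original insns is rejected.  */
static bool
toy_combine_accepts (int original_cost, int replacement_cost)
{
  return replacement_cost <= original_cost;
}
```

With a register cost of 0 the replacement costs 7 and beats the
original 10; with a register cost of 4 it costs 15 and is rejected.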

The second problem I've hit is that fwprop.c:should_replace_address
has this:

  /* If the addresses have equivalent cost, prefer the new address
     if it has the highest `set_src_cost'.  That has the potential of
     eliminating the most insns without additional costs, and it
     is the same that cse.c used to do.  */
  if (gain == 0)
    gain = (set_src_cost (new_rtx, VOIDmode, speed)
	    - set_src_cost (old_rtx, VOIDmode, speed));

  return (gain > 0);

If register moves have the same cost as adding a small constant to a
register, then this code no longer replaces a pseudo with its value as
an offset from a base.  I think this particular problem can be fixed
quite simply by "return gain >= 0;", but really, this code, like the
x86 code, is expecting the cost of a register move to be zero.
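
A small sketch of that tie-break, as a toy model rather than the
fwprop.c code.  toy_should_replace and its accept_ties flag are
hypothetical, costs are plain integers, and I'm assuming the initial
gain is the old address cost minus the new address cost.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the decision quoted above; not the fwprop.c code.
   All costs are passed in as plain integers.  */
static bool
toy_should_replace (int old_addr_cost, int new_addr_cost,
		    int old_src_cost, int new_src_cost,
		    bool accept_ties)
{
  assert (old_src_cost >= 0 && new_src_cost >= 0);

  /* Assumed starting point: how much cheaper the new address is.  */
  int gain = old_addr_cost - new_addr_cost;

  /* Tie-break on set_src_cost, as in the quoted snippet.  */
  if (gain == 0)
    gain = new_src_cost - old_src_cost;

  /* accept_ties selects "gain >= 0" instead of "gain > 0".  */
  return accept_ties ? gain >= 0 : gain > 0;
}
```

If a bare register (old) and register plus small constant (new) both
cost 4, the tie-break gain is 0, so "gain > 0" refuses the replacement
while "gain >= 0" allows it.  With a zero-cost register the gain is 4
and either test replaces.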

You'll notice that these example problems are not trying to cost a
whole instruction.  In both cases they want the cost of just a piece
of an instruction, but rtx_cost is called in a way that is
indistinguishable from other code that calls rtx_cost on whole
register move instructions.

The real difficulty is in separating out the whole insn cases from the
partial insn cases.

Note that we already have insn_rtx_cost, and it returns a minimum cost
for a SET, so register move insns get a cost of 1 insn.  However,
despite insn_rtx_cost starting life in combine.c, even combine doesn't
use it in all whole insn cases.  :-(
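
For reference, the floor I mean looks roughly like this.  This is a
sketch with a hypothetical toy_insn_cost, not the real insn_rtx_cost;
the only thing taken from GCC is the COSTS_N_INSNS scale of 4 units
per insn.

```c
#include <assert.h>

/* Same scale GCC uses: one insn costs 4 units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Toy stand-in for a whole-insn cost: if the SET source is costed
   at zero (e.g. a plain register), fall back to one insn so that a
   register move insn is never free.  */
static int
toy_insn_cost (int set_src_cost)
{
  assert (set_src_cost >= 0);
  return set_src_cost > 0 ? set_src_cost : COSTS_N_INSNS (1);
}
```

The point is that the floor is applied once, at the whole-insn level,
so pieces of an insn can still see a zero register cost.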

Alan Modra
Australia Development Lab, IBM

