[Patch ping] Strength reduction
Richard Guenther
richard.guenther@gmail.com
Wed Jun 20 11:29:00 GMT 2012
On Thu, Jun 14, 2012 at 3:21 PM, William J. Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
> Pro forma ping. :)
;)
I notice (with all of these functions)
+unsigned
+negate_cost (enum machine_mode mode, bool speed)
+{
+ static unsigned costs[NUM_MACHINE_MODES];
+ rtx seq;
+ unsigned cost;
+
+ if (costs[mode])
+ return costs[mode];
+
+ start_sequence ();
+ force_operand (gen_rtx_fmt_e (NEG, mode,
+ gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1)),
+ NULL_RTX);
+ seq = get_insns ();
+ end_sequence ();
+
+ cost = seq_cost (seq, speed);
+ if (!cost)
+ cost = 1;
that the costs[] array is independent of the speed argument. Thus whichever
call comes first determines the cached cost. Odd, and probably not good. A fix
would be appreciated (even for the current code ...) - simply make the
array costs[NUM_MACHINE_MODES][2].
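A minimal self-contained sketch of that fix (with a stand-in cost
function and a made-up NUM_MACHINE_MODES value; the real cost comes from
seq_cost over a generated RTL sequence):

```c
#include <assert.h>

#define NUM_MACHINE_MODES 16   /* stand-in; the real constant is target-dependent */

/* Cache the negation cost per mode *and* per speed/size setting,
   so the size cost is no longer shadowed by whichever query came first.  */
static unsigned costs[NUM_MACHINE_MODES][2];

/* Stand-in for the real seq_cost computation over a force_operand sequence.  */
static unsigned
compute_negate_cost (int mode, int speed)
{
  (void) mode;
  return speed ? 4 : 2;
}

static unsigned
negate_cost (int mode, int speed)
{
  unsigned cost;

  if (costs[mode][speed])
    return costs[mode][speed];

  cost = compute_negate_cost (mode, speed);
  if (!cost)
    cost = 1;

  costs[mode][speed] = cost;
  return cost;
}
```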
As for the renaming - can you name the functions consistently? Thus
the above would be negate_reg_cost? And maybe rename the other
FIXME function, too?
Index: gcc/tree-ssa-strength-reduction.c
===================================================================
--- gcc/tree-ssa-strength-reduction.c (revision 0)
+++ gcc/tree-ssa-strength-reduction.c (revision 0)
@@ -0,0 +1,1611 @@
+/* Straight-line strength reduction.
+ Copyright (C) 2012 Free Software Foundation, Inc.
I know we have these 'tree-ssa-' names, but really this is gimple-ssa now ;)
So, please name it gimple-ssa-strength-reduction.c.
+ /* Access to the statement for subsequent modification. Cached to
+ save compile time. */
+ gimple_stmt_iterator cand_gsi;
this is an iterator for cand_stmt? Then caching it is no longer necessary,
as the iterator is the stmt itself after recent infrastructure changes.
+/* Hash table embodying a mapping from statements to candidates. */
+static htab_t stmt_cand_map;
...
+static hashval_t
+stmt_cand_hash (const void *p)
+{
+ return htab_hash_pointer (((const_slsr_cand_t) p)->cand_stmt);
+}
use a pointer-map instead.
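A pointer map keys directly on the stmt pointer, so no user-supplied hash
or equality callbacks are needed. A toy open-addressing version of the
idea (the names mirror GCC's pointer_map API, but this is an illustrative
stand-in, not the real implementation):

```c
#include <assert.h>
#include <stddef.h>

#define MAP_SIZE 64   /* toy fixed size; must be a power of two */

/* Toy pointer->value map: hash the pointer bits themselves.  */
struct pointer_map
{
  const void *keys[MAP_SIZE];
  void *values[MAP_SIZE];
};

static size_t
hash_pointer (const void *p)
{
  return ((size_t) p >> 3) & (MAP_SIZE - 1);
}

/* Return the value slot for KEY, creating it if necessary.  */
static void **
pointer_map_insert (struct pointer_map *pm, const void *key)
{
  size_t i = hash_pointer (key);
  while (pm->keys[i] && pm->keys[i] != key)
    i = (i + 1) & (MAP_SIZE - 1);
  pm->keys[i] = key;
  return &pm->values[i];
}

/* Return the value slot for KEY, or NULL if KEY was never inserted.  */
static void **
pointer_map_contains (struct pointer_map *pm, const void *key)
{
  size_t i = hash_pointer (key);
  while (pm->keys[i])
    {
      if (pm->keys[i] == key)
        return &pm->values[i];
      i = (i + 1) & (MAP_SIZE - 1);
    }
  return NULL;
}
```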
+/* Callback to produce a hash value for a candidate chain header. */
+
+static hashval_t
+base_cand_hash (const void *p)
+{
+ tree ssa_name = ((const_cand_chain_t) p)->base_name;
+
+ if (TREE_CODE (ssa_name) != SSA_NAME)
+ return (hashval_t) 0;
+
+ return (hashval_t) SSA_NAME_VERSION (ssa_name);
+}
does it ever happen that ssa_name is not an SSA_NAME? Also, I'm not sure
the memory savings of a hashtable are worth it here over simply using a
fixed-size (num_ssa_names) array indexed by SSA_NAME_VERSION, pointing
to the chain.
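A sketch of the array-based alternative, with plain C stand-ins for the
GCC types (the chain struct is simplified and num_ssa_names is a made-up
fixed value here; in the compiler it grows with the function):

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for the candidate-chain node.  */
typedef struct cand_chain_d
{
  unsigned base_version;         /* SSA_NAME_VERSION of the base name */
  struct cand_chain_d *next;
} *cand_chain_t;

/* Flat table indexed by SSA name version, replacing the hash table.  */
static cand_chain_t *base_cand_map;
static unsigned num_ssa_names = 128;   /* stand-in value */

/* Prepend NODE to the chain for its base name's version.  */
static void
record_base_cand (cand_chain_t node)
{
  node->next = base_cand_map[node->base_version];
  base_cand_map[node->base_version] = node;
}

static cand_chain_t
lookup_base_cand (unsigned version)
{
  return base_cand_map[version];
}
```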
+ node = (cand_chain_t) pool_alloc (chain_pool);
+ node->base_name = c->base_name;
If you never free pool entries it's more efficient to use an obstack.
alloc-pool
only pays off if you get freed item re-use.
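For reference, the obstack idiom looks like this (glibc's obstack.h; the
chunk alloc/free hooks must be defined before use, and the node struct
below is a simplified stand-in):

```c
#include <obstack.h>
#include <stdlib.h>

/* Required hooks for obstack.h.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

struct node { int base_version; struct node *next; };

static struct obstack chain_obstack;

/* Allocate a node from the obstack; individual nodes are never freed,
   the whole obstack is released at once at the end of the pass.  */
static struct node *
new_node (int version)
{
  struct node *n = (struct node *) obstack_alloc (&chain_obstack,
                                                  sizeof (struct node));
  n->base_version = version;
  n->next = 0;
  return n;
}
```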
+ switch (gimple_assign_rhs_code (gs))
+ {
+ case MULT_EXPR:
+ rhs2 = gimple_assign_rhs2 (gs);
+
+ if (TREE_CODE (rhs2) == INTEGER_CST)
+ return multiply_by_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, speed);
+
+ if (TREE_CODE (rhs1) == INTEGER_CST)
+ return multiply_by_cost (TREE_INT_CST_LOW (rhs1), lhs_mode, speed);
In theory all commutative statements should have constant operands only
at rhs2 ...
Also, you do not verify that the constant fits in a HOST_WIDE_INT - but maybe
you do not care? Thus, I'd do
if (host_integerp (rhs2, 0))
return multiply_by_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, speed);
or make multiply_by[_const?]_cost take a double-int instead. Likewise below
for add.
+ case MODIFY_EXPR:
+ /* Be suspicious of assigning costs to copies that may well go away. */
+ return 0;
MODIFY_EXPR is never a gimple_assign_rhs_code. Simple copies have
a code of SSA_NAME, for example. But since you assert when you get to an
unhandled code, I wonder why you needed the above ...
+static slsr_cand_t
+base_cand_from_table (tree base_in)
+{
+ slsr_cand mapping_key;
+
+ gimple def = SSA_NAME_DEF_STMT (base_in);
+ if (!def)
+ return (slsr_cand_t) NULL;
+
+ mapping_key.cand_stmt = def;
+ return (slsr_cand_t) htab_find (stmt_cand_map, &mapping_key);
isn't that reachable via the base-name -> chain mapping for base_in?
+ if (TREE_CODE (rhs2) == SSA_NAME && operand_equal_p (rhs1, rhs2, 0))
+ return;
SSA_NAMEs can be compared by pointer equality, thus the above is
equivalent to
if (TREE_CODE (rhs2) == SSA_NAME && rhs1 == rhs2)
or even just
if (rhs1 == rhs2)
applies elsewhere as well.
+/* Return TRUE iff PRODUCT is an integral multiple of FACTOR, and return
+ the multiple in *MULTIPLE. Otherwise return FALSE and leave *MULTIPLE
+ unchanged. */
+/* ??? - Should this be moved to double-int.c? */
I think so.
+static bool
+double_int_multiple_of (double_int product, double_int factor,
+ bool unsigned_p, double_int *multiple)
+{
+ double_int remainder;
+ double_int quotient = double_int_divmod (product, factor, unsigned_p,
+ TRUNC_DIV_EXPR, &remainder);
+ if (double_int_zero_p (remainder))
+ {
+ *multiple = quotient;
+ return true;
+ }
+
+ return false;
+}
In general I find a lot of code of similar structure; factoring bits out may
make it easier to read. Obvious candidates include:
+ /* Add the interpretation to the statement-candidate mapping. */
+ slot = htab_find_slot (stmt_cand_map, c, INSERT);
+ gcc_assert (!*slot);
+ *slot = c;
into a function add_cand_for_stmt ()
+ if (TREE_CODE (rhs1) != SSA_NAME || TREE_CODE (rhs2) != INTEGER_CST)
+ return;
doing some simple checks of this kind in the function walking all statements,
and passing in the operands and operation code.
+/* Return TRUE if GS is a statement that defines an SSA name from
+ a NOP_EXPR and is legal for us to combine an add and multiply
+ across. This is legitimate for casts from a signed type to
+ a signed or unsigned type of the same or larger size. It is not
+ legitimate to convert any unsigned type to a signed type, or
+ to an unsigned type of a different size.
+
+ The reasoning here is that signed integer overflow is undefined,
+ so any program that was expecting overflow that no longer occurs
+ is not correct. Unsigned integers, however, have wrap semantics,
+ and it is reasonable for programs to assume an overflowing add
+ will wrap.
+
+ With -fwrapv, signed integers also have wrap semantics, so widening
+ casts are not allowed then either. */
+
+static bool
+legal_cast_p (gimple gs)
+{
+ tree lhs, rhs, lhs_type, rhs_type;
+ unsigned lhs_size, rhs_size, lhs_uns, rhs_uns;
+
+ if (!is_gimple_assign (gs)
+ || TREE_CODE (gimple_assign_lhs (gs)) != SSA_NAME
That's always true if the code is NOP_EXPR
+ || gimple_assign_rhs_code (gs) != NOP_EXPR)
CONVERT_EXPR_CODE_P ()
+ return false;
+
+ lhs = gimple_assign_lhs (gs);
+ rhs = gimple_assign_rhs1 (gs);
+
+ if (TREE_CODE (rhs) != SSA_NAME)
+ return false;
Likewise.
+ lhs_type = TREE_TYPE (SSA_NAME_VAR (lhs));
+ rhs_type = TREE_TYPE (SSA_NAME_VAR (rhs));
Get the type from the SSA_NAMEs themselves, no need to lookup
SSA_NAME_VAR. If it happens you ICE that way you are looking
at released SSA names ...
+ lhs_size = TYPE_PRECISION (lhs_type);
+ rhs_size = TYPE_PRECISION (rhs_type);
+ lhs_uns = TYPE_UNSIGNED (lhs_type);
+ rhs_uns = TYPE_UNSIGNED (rhs_type);
+
+ if (lhs_size < rhs_size
+ || (rhs_uns && !lhs_uns)
+ || (rhs_uns && lhs_uns && rhs_size != lhs_size)
+ || (!rhs_uns && flag_wrapv && rhs_size != lhs_size))
+ return false;
So we basically check whether the lhs can represent all values of the rhs?
So ...
if (lhs_size > rhs_size)
return true;
is missing? Because you'd reject a widening of an unsigned int to an
unsigned long.
As for your handling of flag_wrapv and the unsigned flag, please try
to use the TYPE_OVERFLOW_UNDEFINED/TYPE_OVERFLOW_WRAPS
type properties instead for the parts that care about overflow behavior
and not value-representation.
+ return true;
+}
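To make that point concrete, here is the quoted size/signedness condition
transcribed into a self-contained predicate (plain parameters stand in
for the GCC type queries, and `wrapv` models flag_wrapv), showing that it
rejects a widening unsigned-to-unsigned cast:

```c
#include <assert.h>
#include <stdbool.h>

/* Direct transcription of the quoted size/signedness test from
   legal_cast_p.  Returns false when the cast is rejected.  */
static bool
legal_cast_p_model (unsigned lhs_size, bool lhs_uns,
                    unsigned rhs_size, bool rhs_uns, bool wrapv)
{
  if (lhs_size < rhs_size
      || (rhs_uns && !lhs_uns)
      || (rhs_uns && lhs_uns && rhs_size != lhs_size)
      || (!rhs_uns && wrapv && rhs_size != lhs_size))
    return false;

  return true;
}
```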
I have a deja-vu of seeing similar code elsewhere (widening shift/multiply
support in the vector pattern recognizer, maybe). While I may understand
what the textual description says, including an example would explain
things better. For example:
"Return TRUE if GS is a statement that defines an SSA name from
a conversion and is legal for us to associate with an add or multiply.
Thus transform name = (T) name'; name * X; to name' * X"
But I suppose I got the semantics wrong (and thus the example). Can you
clarify?
Otherwise the pass looks quite good. It looks, though, as if it will optimize
cross-basic-block strength-reduction opportunities, eventually causing
cross-BB register lifetime increases? Or is this not an issue?
Thanks for working on this,
Richard.