[Patch ping] Strength reduction
Richard Guenther
richard.guenther@gmail.com
Wed Jun 20 11:29:00 GMT 2012
On Thu, Jun 14, 2012 at 3:21 PM, William J. Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
> Pro forma ping. :)
;)
I notice (with all of these functions)
+unsigned
+negate_cost (enum machine_mode mode, bool speed)
+{
+ static unsigned costs[NUM_MACHINE_MODES];
+ rtx seq;
+ unsigned cost;
+
+ if (costs[mode])
+ return costs[mode];
+
+ start_sequence ();
+ force_operand (gen_rtx_fmt_e (NEG, mode,
+ gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1)),
+ NULL_RTX);
+ seq = get_insns ();
+ end_sequence ();
+
+ cost = seq_cost (seq, speed);
+ if (!cost)
+ cost = 1;
that the costs[] array is independent of the speed argument. Thus whichever
call comes first determines the cached cost. Odd, and probably not good. A fix
would be appreciated (even for the current code ...) - simply make the
array costs[NUM_MACHINE_MODES][2].
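A minimal self-contained sketch of that fix (with a stand-in cost
function and a made-up NUM_MACHINE_MODES value; the real cost comes from
seq_cost over a generated RTL sequence):

```c
#include <assert.h>

#define NUM_MACHINE_MODES 16   /* stand-in; the real constant is target-dependent */

/* Cache the negation cost per mode *and* per speed/size setting,
   so the size cost is no longer shadowed by whichever query came first.  */
static unsigned costs[NUM_MACHINE_MODES][2];

/* Stand-in for the real seq_cost computation over a force_operand sequence.  */
static unsigned
compute_negate_cost (int mode, int speed)
{
  (void) mode;
  return speed ? 4 : 2;
}

static unsigned
negate_cost (int mode, int speed)
{
  unsigned cost;

  if (costs[mode][speed])
    return costs[mode][speed];

  cost = compute_negate_cost (mode, speed);
  if (!cost)
    cost = 1;

  costs[mode][speed] = cost;
  return cost;
}
```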
As for the renaming - can you name the functions consistently? Thus
the above would be negate_reg_cost? And maybe rename the other
FIXME function, too?
Index: gcc/tree-ssa-strength-reduction.c
===================================================================
--- gcc/tree-ssa-strength-reduction.c (revision 0)
+++ gcc/tree-ssa-strength-reduction.c (revision 0)
@@ -0,0 +1,1611 @@
+/* Straight-line strength reduction.
+ Copyright (C) 2012 Free Software Foundation, Inc.
I know we have these 'tree-ssa-' names, but really this is gimple-ssa now ;)
So, please name it gimple-ssa-strength-reduction.c.
+ /* Access to the statement for subsequent modification. Cached to
+ save compile time. */
+ gimple_stmt_iterator cand_gsi;
this is an iterator for cand_stmt? Then caching it is no longer necessary,
as the iterator is the stmt itself after recent infrastructure changes.
+/* Hash table embodying a mapping from statements to candidates. */
+static htab_t stmt_cand_map;
...
+static hashval_t
+stmt_cand_hash (const void *p)
+{
+ return htab_hash_pointer (((const_slsr_cand_t) p)->cand_stmt);
+}
use a pointer-map instead.
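A pointer map keys directly on the stmt pointer, so no user-supplied hash
or equality callbacks are needed. A toy open-addressing version of the
idea (the names mirror GCC's pointer_map API, but this is an illustrative
stand-in, not the real implementation):

```c
#include <assert.h>
#include <stddef.h>

#define MAP_SIZE 64   /* toy fixed size; must be a power of two */

/* Toy pointer->value map: hash the pointer bits themselves.  */
struct pointer_map
{
  const void *keys[MAP_SIZE];
  void *values[MAP_SIZE];
};

static size_t
hash_pointer (const void *p)
{
  return ((size_t) p >> 3) & (MAP_SIZE - 1);
}

/* Return the value slot for KEY, creating it if necessary.  */
static void **
pointer_map_insert (struct pointer_map *pm, const void *key)
{
  size_t i = hash_pointer (key);
  while (pm->keys[i] && pm->keys[i] != key)
    i = (i + 1) & (MAP_SIZE - 1);
  pm->keys[i] = key;
  return &pm->values[i];
}

/* Return the value slot for KEY, or NULL if KEY was never inserted.  */
static void **
pointer_map_contains (struct pointer_map *pm, const void *key)
{
  size_t i = hash_pointer (key);
  while (pm->keys[i])
    {
      if (pm->keys[i] == key)
        return &pm->values[i];
      i = (i + 1) & (MAP_SIZE - 1);
    }
  return NULL;
}
```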
+/* Callback to produce a hash value for a candidate chain header. */
+
+static hashval_t
+base_cand_hash (const void *p)
+{
+ tree ssa_name = ((const_cand_chain_t) p)->base_name;
+
+ if (TREE_CODE (ssa_name) != SSA_NAME)
+ return (hashval_t) 0;
+
+ return (hashval_t) SSA_NAME_VERSION (ssa_name);
+}
does it ever happen that ssa_name is not an SSA_NAME? Also, I'm not sure
the memory savings of a hashtable are worth it here over simply using a
fixed-size (num_ssa_names) array indexed by SSA_NAME_VERSION, pointing
to the chain.
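A sketch of the array-based alternative, with plain C stand-ins for the
GCC types (the chain struct is simplified and num_ssa_names is a made-up
fixed value here; in the compiler it grows with the function):

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for the candidate-chain node.  */
typedef struct cand_chain_d
{
  unsigned base_version;         /* SSA_NAME_VERSION of the base name */
  struct cand_chain_d *next;
} *cand_chain_t;

/* Flat table indexed by SSA name version, replacing the hash table.  */
static cand_chain_t *base_cand_map;
static unsigned num_ssa_names = 128;   /* stand-in value */

/* Prepend NODE to the chain for its base name's version.  */
static void
record_base_cand (cand_chain_t node)
{
  node->next = base_cand_map[node->base_version];
  base_cand_map[node->base_version] = node;
}

static cand_chain_t
lookup_base_cand (unsigned version)
{
  return base_cand_map[version];
}
```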
+ node = (cand_chain_t) pool_alloc (chain_pool);
+ node->base_name = c->base_name;
If you never free pool entries it's more efficient to use an obstack.
alloc-pool
only pays off if you get freed item re-use.
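For reference, the obstack idiom looks like this (glibc's obstack.h; the
chunk alloc/free hooks must be defined before use, and the node struct
below is a simplified stand-in):

```c
#include <obstack.h>
#include <stdlib.h>

/* Required hooks for obstack.h.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

struct node { int base_version; struct node *next; };

static struct obstack chain_obstack;

/* Allocate a node from the obstack; individual nodes are never freed,
   the whole obstack is released at once at the end of the pass.  */
static struct node *
new_node (int version)
{
  struct node *n = (struct node *) obstack_alloc (&chain_obstack,
                                                  sizeof (struct node));
  n->base_version = version;
  n->next = 0;
  return n;
}
```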
+ switch (gimple_assign_rhs_code (gs))
+ {
+ case MULT_EXPR:
+ rhs2 = gimple_assign_rhs2 (gs);
+
+ if (TREE_CODE (rhs2) == INTEGER_CST)
+ return multiply_by_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, speed);
+
+ if (TREE_CODE (rhs1) == INTEGER_CST)
+ return multiply_by_cost (TREE_INT_CST_LOW (rhs1), lhs_mode, speed);
In theory all commutative statements should have constant operands only
at rhs2 ...
Also, you do not verify that the constant fits in a HOST_WIDE_INT - but maybe
you do not care? Thus, I'd do
if (host_integerp (rhs2, 0))
return multiply_by_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, speed);
or make multiply_by[_const?]_cost take a double-int instead. Likewise below
for add.
+ case MODIFY_EXPR:
+ /* Be suspicious of assigning costs to copies that may well go away. */
+ return 0;
MODIFY_EXPR is never a gimple_assign_rhs_code. Simple copies have
a code of SSA_NAME, for example. But since you assert when you get to an
unhandled code, I wonder why you needed the above ...
+static slsr_cand_t
+base_cand_from_table (tree base_in)
+{
+ slsr_cand mapping_key;
+
+ gimple def = SSA_NAME_DEF_STMT (base_in);
+ if (!def)
+ return (slsr_cand_t) NULL;
+
+ mapping_key.cand_stmt = def;
+ return (slsr_cand_t) htab_find (stmt_cand_map, &mapping_key);
isn't that reachable via the base-name -> chain mapping for base_in?
+ if (TREE_CODE (rhs2) == SSA_NAME && operand_equal_p (rhs1, rhs2, 0))
+ return;
SSA_NAMEs can be compared by pointer equality, thus the above is
equivalent to
if (TREE_CODE (rhs2) == SSA_NAME && rhs1 == rhs2)
or even just
if (rhs1 == rhs2)
applies elsewhere as well.
+/* Return TRUE iff PRODUCT is an integral multiple of FACTOR, and return
+ the multiple in *MULTIPLE. Otherwise return FALSE and leave *MULTIPLE
+ unchanged. */
+/* ??? - Should this be moved to double-int.c? */
I think so.
+static bool
+double_int_multiple_of (double_int product, double_int factor,
+ bool unsigned_p, double_int *multiple)
+{
+ double_int remainder;
+ double_int quotient = double_int_divmod (product, factor, unsigned_p,
+ TRUNC_DIV_EXPR, &remainder);
+ if (double_int_zero_p (remainder))
+ {
+ *multiple = quotient;
+ return true;
+ }
+
+ return false;
+}
In general I find a lot of code of similar structure; factoring bits out may
make it easier to read. Obvious candidates include:
+ /* Add the interpretation to the statement-candidate mapping. */
+ slot = htab_find_slot (stmt_cand_map, c, INSERT);
+ gcc_assert (!*slot);
+ *slot = c;
into a function add_cand_for_stmt ()
+ if (TREE_CODE (rhs1) != SSA_NAME || TREE_CODE (rhs2) != INTEGER_CST)
+ return;
doing some simple checks of this kind in the function walking all statements,
and passing in the operands and operation code.
+/* Return TRUE if GS is a statement that defines an SSA name from
+ a NOP_EXPR and is legal for us to combine an add and multiply
+ across. This is legitimate for casts from a signed type to
+ a signed or unsigned type of the same or larger size. It is not
+ legitimate to convert any unsigned type to a signed type, or
+ to an unsigned type of a different size.
+
+ The reasoning here is that signed integer overflow is undefined,
+ so any program that was expecting overflow that no longer occurs
+ is not correct. Unsigned integers, however, have wrap semantics,
+ and it is reasonable for programs to assume an overflowing add
+ will wrap.
+
+ With -fwrapv, signed integers also have wrap semantics, so widening
+ casts are not allowed then either. */
+
+static bool
+legal_cast_p (gimple gs)
+{
+ tree lhs, rhs, lhs_type, rhs_type;
+ unsigned lhs_size, rhs_size, lhs_uns, rhs_uns;
+
+ if (!is_gimple_assign (gs)
+ || TREE_CODE (gimple_assign_lhs (gs)) != SSA_NAME
That's always true if the code is NOP_EXPR
+ || gimple_assign_rhs_code (gs) != NOP_EXPR)
CONVERT_EXPR_CODE_P ()
+ return false;
+
+ lhs = gimple_assign_lhs (gs);
+ rhs = gimple_assign_rhs1 (gs);
+
+ if (TREE_CODE (rhs) != SSA_NAME)
+ return false;
Likewise.
+ lhs_type = TREE_TYPE (SSA_NAME_VAR (lhs));
+ rhs_type = TREE_TYPE (SSA_NAME_VAR (rhs));
Get the type from the SSA_NAMEs themselves, no need to lookup
SSA_NAME_VAR. If it happens you ICE that way you are looking
at released SSA names ...
+ lhs_size = TYPE_PRECISION (lhs_type);
+ rhs_size = TYPE_PRECISION (rhs_type);
+ lhs_uns = TYPE_UNSIGNED (lhs_type);
+ rhs_uns = TYPE_UNSIGNED (rhs_type);
+
+ if (lhs_size < rhs_size
+ || (rhs_uns && !lhs_uns)
+ || (rhs_uns && lhs_uns && rhs_size != lhs_size)
+ || (!rhs_uns && flag_wrapv && rhs_size != lhs_size))
+ return false;
So we basically check whether the lhs can represent all values of the rhs?
So ...
if (lhs_size > rhs_size)
return true;
is missing? Because you'd reject a widening of an unsigned int to an
unsigned long.
As for your handling of flag_wrapv and the unsigned flag, please try
to use the TYPE_OVERFLOW_UNDEFINED/TYPE_OVERFLOW_WRAPS
type properties instead for the parts that care about overflow behavior
and not value-representation.
+ return true;
+}
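To make that point concrete, here is the quoted size/signedness condition
transcribed into a self-contained predicate (plain parameters stand in
for the GCC type queries, and `wrapv` models flag_wrapv), showing that it
rejects a widening unsigned-to-unsigned cast:

```c
#include <assert.h>
#include <stdbool.h>

/* Direct transcription of the quoted size/signedness test from
   legal_cast_p.  Returns false when the cast is rejected.  */
static bool
legal_cast_p_model (unsigned lhs_size, bool lhs_uns,
                    unsigned rhs_size, bool rhs_uns, bool wrapv)
{
  if (lhs_size < rhs_size
      || (rhs_uns && !lhs_uns)
      || (rhs_uns && lhs_uns && rhs_size != lhs_size)
      || (!rhs_uns && wrapv && rhs_size != lhs_size))
    return false;

  return true;
}
```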
I have a deja-vu of seeing similar code elsewhere (widening shift/multiply
support in the vector pattern recognizer, maybe). While I may understand
what the textual description says, including an example would explain
things better. For example:
"Return TRUE if GS is a statement that defines an SSA name from
a conversion and is legal for us to associate with an add or multiply.
Thus transform name = (T) name'; name * X; to name' * X"
But I suppose I got the semantics wrong (and thus the example). Can you
clarify?
Otherwise the pass looks quite good. It looks, though, as if it will optimize
cross-basic-block strength-reduction opportunities, eventually causing
cross-BB register lifetime increases? Or is this not an issue?
Thanks for working on this,
Richard.