IVOPT improvement patch

Zdenek Dvorak rakdver@kam.mff.cuni.cz
Sat Jun 5 09:01:00 GMT 2010


Hi,

> patch-3:
> 
> The new expression for the use expressed in terms of ubase, cbase,
> ratio, and iv_cand may contain loop invariant sub-expressions that may
> be hoisted out of  the loop later. The patch implements a mechanism to
> evaluate the additional register pressure caused by such expressions
> (that can not be constant folded).

what are the benchmark results of this part of the patch, separately from the
rest of the changes?

> The patch also makes sure that variant part of the use expression ( a
> sum of product) gets assigned to the index part of  the target_mem_ref
> first to expose loop invariant code motion.

This looks ok to me.

> +/* Returns true if AFF1 and AFF2 are identical.  */
> +
> +static bool
> +compare_aff_trees (aff_tree *aff1, aff_tree *aff2)
> +{
> +  unsigned i;
> +
> +  if (aff1->n != aff2->n)
> +    return false;
> +
> +  for (i = 0; i < aff1->n; i++)
> +    {
> +      if (double_int_cmp (aff1->elts[i].coef, aff2->elts[i].coef, 0) != 0)
> +        return false;
> +
> +      if (!operand_equal_p (aff1->elts[i].val, aff2->elts[i].val, 0))
> +        return false;
> +    }
> +  return true;
> +}

No particular order is guaranteed for the elements of the affine combination.
So this function will only work if the order is the same in AFF1 and AFF2 by chance.

> +/* Returns true if expression UBASE - RATIO * CBASE requires a new compiler
> +   generated temporary.  */

The comment should explain in more detail what is tested here; e.g., it is not
clear from the currend description why false is returned for SSA_NAME - INTEGER_CST.

> +static bool
> +create_loop_invariant_temp (tree ubase, tree cbase, HOST_WIDE_INT ratio)
> +{

...

> +      if (TREE_CODE (ubase) == ADDR_EXPR
> +        && TREE_CODE (cbase) == ADDR_EXPR)
> +        {
> +          tree usym, csym;
> +
> +          usym = TREE_OPERAND (ubase, 0);
> +          csym = TREE_OPERAND (cbase, 0);
> +          if (TREE_CODE (usym) == ARRAY_REF)
> +            {
> +              tree ind = TREE_OPERAND (usym, 1);
> +              if (TREE_CODE (ind) == INTEGER_CST
> +                  && host_integerp (ind, 0)
> +                  && TREE_INT_CST_LOW (ind) == 0)
> +                usym = TREE_OPERAND (usym, 0);
> +            }
> +          if (TREE_CODE (csym) == ARRAY_REF)
> +            {
> +              tree ind = TREE_OPERAND (csym, 1);
> +              if (TREE_CODE (ind) == INTEGER_CST
> +                  && host_integerp (ind, 0)
> +                  && TREE_INT_CST_LOW (ind) == 0)
> +                csym = TREE_OPERAND (csym, 0);
> +            }
> +          if (usym == csym)
> +            return false;

Trees should not be compared by ==
Anyway, you had some compile time problems here? Or what is the purpose of the above piece of code?

> +/* Moves the loop variant part V in linear address ADDR to be the index
> +   of PARTS.  */
> +
> +static void
> +move_variant_to_index (struct mem_address *parts, aff_tree *addr, tree v)
> +{
> +  unsigned i;
> +  tree val = NULL_TREE;
> +
> +  gcc_assert (!parts->index);
> +  for (i = 0; i < addr->n; i++)
> +    {
> +      val = addr->elts[i].val;
> +      if (val == v)

operand_equal_p

> +	break;
> +    }
> +
> +  if (i == addr->n)
> +    return;
> +
> +  parts->index = fold_convert (sizetype, val);
> +  parts->step = double_int_to_tree (sizetype, addr->elts[i].coef);
> +  aff_combination_remove_elt (addr, i);
> +}
> +
>  /* Adds ELT to PARTS.  */
>  
>  static void
> @@ -553,7 +578,8 @@ most_expensive_mult_to_index (tree type,
>  
>  /* Splits address ADDR for a memory access of type TYPE into PARTS.
>     If BASE_HINT is non-NULL, it specifies an SSA name to be used
> -   preferentially as base of the reference.
> +   preferentially as base of the reference, and IV_CAND is the selected
> +   iv candidate used in ADDR.

This comment should explain what is IV_CAND used for; the current comment is
meaningless if the function is considered separately.

>  /* Creates and returns a TARGET_MEM_REF for address ADDR.  If necessary
>     computations are emitted in front of GSI.  TYPE is the mode
> -   of created memory reference.  */
> +   of created memory reference. IV_CAND is the selected iv candidate in ADDR,
> +   and IS_CAND_BASE is a flag indidcats if IV_CAND comes from a base address
> +   object.  */
>  
>  tree
>  create_mem_ref (gimple_stmt_iterator *gsi, tree type, aff_tree *addr,
> -		tree base_hint, bool speed)
> +		tree iv_cand, tree base_hint, bool speed)

The mention of IS_CAND_BASE should be removed from the comment.

Zdenek



More information about the Gcc-patches mailing list