This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [0/7] Type promotion pass and elimination of zext/sext


On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Sat, Nov 14, 2015 at 2:15 AM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> Attached is the latest version of the patch. With the patches
>> 0001-Add-new-SEXT_EXPR-tree-code.patch,
>> 0002-Add-type-promotion-pass.patch and
>> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.
>>
>> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
>> x64-64-linux-gnu and regression testing on ppc64-linux-gnu,
>> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three
>> issues in ppc64-linux-gnu regression testing. There are some other test
>> cases which needs adjustment for scanning for some patterns that are not
>> valid now.
>>
>> 1. rtl fwprop was going into infinite loop. Works with the following patch:
>> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
>> index 16c7981..9cf4f43 100644
>> --- a/gcc/fwprop.c
>> +++ b/gcc/fwprop.c
>> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
>> new_rtx, rtx_insn *def_insn,
>>    int old_cost = 0;
>>    bool ok;
>>
>> +  /* Value to be substituted is the same, nothing to do.  */
>> +  if (rtx_equal_p (*loc, new_rtx))
>> +    return false;
>> +
>>    update_df_init (def_insn, insn);
>>
>>    /* forward_propagate_subreg may be operating on an instruction with
>
> Which testcase was this on?
>
>> 2. gcc.dg/torture/ftrapv-1.c fails
>> This is because we are checking for the  SImode trapping. With the
>> promotion of the operation to wider mode, this is i think expected. I
>> think the testcase needs updating.
>
> No, it is not expected.  As said earlier you need to refrain from promoting
> integer operations that trap.  You can use ! operation_no_trapping_overflow
> for this.
>
>> 3. gcc.dg/sms-3.c fails
>> It fails with  -fmodulo-sched-allow-regmoves  and OK when I remove it. I
>> am looking into it.
>>
>>
>> I also have the following issues based on the previous review (as posted
>> in the previous patch). Copying again for the review purpose.
>>
>> 1.
>>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
>>> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
>>> on DEFs and fixup_uses on USEs.
>>
>> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is
>> there anyway to iterate over this. I have added gcc_assert to make sure
>> that promote_ssa is called only once.
>
>   gcc_assert (!ssa_name_info_map->get_or_insert (def));
>
> with --disable-checking this will be compiled away so you need to do
> the assert in a separate statement.
>
>> 2.
>>> Instead of this you should, in promote_all_stmts, walk over all uses
>> doing what
>>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>>
>>> +    case GIMPLE_NOP:
>>> +       {
>>> +         if (SSA_NAME_VAR (def) == NULL)
>>> +           {
>>> +             /* Promote def by fixing its type for anonymous def.  */
>>> +             TREE_TYPE (def) = promoted_type;
>>> +           }
>>> +         else
>>> +           {
>>> +             /* Create a promoted copy of parameters.  */
>>> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>>
>>> I think the uninitialized vars are somewhat tricky and it would be best
>>> to create a new uninit anonymous SSA name for them.  You can
>>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>>> btw.
>>
>> I experimented with get_or_create_default_def. Here  we have to have a
>> SSA_NAME_VAR (def) of promoted type.
>>
>> In the attached patch I am doing the following and seems to work. Does
>> this looks OK?
>>
>> +         }
>> +       else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
>> +         {
>> +           tree var = copy_node (SSA_NAME_VAR (def));
>> +           TREE_TYPE (var) = promoted_type;
>> +           TREE_TYPE (def) = promoted_type;
>> +           SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>> +         }
>
> I believe this will wreck the SSA default-def map so you should do
>
>   set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
>   tree var = create_tmp_reg (promoted_type);
>   TREE_TYPE (def) = promoted_type;
>   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>   set_ssa_default_def (cfun, var, def);
>
> instead.
>
>> I prefer to promote def as otherwise iterating over the uses and
>> promoting can look complicated (have to look at all the different types
>> of stmts again and do the right thing as It was in the earlier version
>> of this before we move to this approach)
>>
>> 3)
>>> you can also transparently handle constants for the cases where promoting
>>> is required.  At the moment their handling is interwinded with the def
>> promotion
>>> code.  That makes the whole thing hard to follow.
>>
>>
>> I have updated the comments with:
>>
>> +/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
>> +   promote only the constants in conditions part of the COND_EXPR.
>> +
>> +   We promote the constants when the associated operands are promoted.
>> +   This usually means that we promote the constants when we promote the
>> +   defining stmnts (as part of promote_ssa). However for COND_EXPR, we
>> +   can promote only when we promote the other operand. Therefore, this
>> +   is done during fixup_use.  */
>>
>>
>> 4)
>> I am handling gimple_debug separately to avoid any code difference with
>> and without -g option. I have updated the comments for this.
>>
>> 5)
>> I also noticed that tree-ssa-uninit sometimes gives false positives due
>> to the assumptions
>> it makes. Is it OK to move this pass before type promotion? I can do the
>> testings and post a separate patch with this if this OK.
>
> Hmm, no, this needs more explanation (like a testcase).
>
>> 6)
>> I also removed the optimization that prevents some of the redundant
>> truncation/extensions from type promotion pass, as it dosent do much as
>> of now. I can send a proper follow up patch. Is that OK?
>
> Yeah, that sounds fine.
>
>> I also did a simple test with coremark for the latest patch. I compared
>> the code size for coremark for linux-gcc with -Os. Results are as
>> reported by the "size" utility. I know this doesn't mean much but can
>> give some indication.
>>         Base            with pass       Percentage improvement
>> ==============================================================
>> arm     10476           10372           0.9927453226
>> aarch64 9545            9521            0.2514405448
>> ppc64   12236           12052           1.5037593985
>>
>>
>> After resolving the above issues, I would like propose that we  commit
>> the pass as not enabled by default (even though the patch as it stands
>> enabled by default - I am doing it for testing purposes).
>
> Hmm, we don't like to have passes that are not enabled by default with any
> optimization level or for any target.  Those tend to bitrot quickly :(
>
> Did you do any performance measurements yet?
>
> Looking over the pass in detail now (again).

Ok, so still looking at the basic operation scheme.

      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
        {
          use = USE_FROM_PTR (op);
          if (TREE_CODE (use) == SSA_NAME
              && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
            promote_ssa (use, &gsi);
          fixup_use (stmt, &gsi, op, use);
        }

      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
        promote_ssa (def, &gsi);

the GIMPLE_NOP handling in promote_ssa but when processing uses looks
backwards.  As those are implicitely defined in the entry block you may
better just iterate over all default defs before the dominator walk like so

  unsigned n = num_ssa_names;
  for (i = 1; i < n; ++i)
    {
      tree name = ssa_name (i);
      if (name
          && SSA_NAME_IS_DEFAULT_DEF
          && ! has_zero_uses (name))
       promote_default_def (name);
    }

I see promote_cst_in_stmt in both promote_ssa and fixup_use.  Logically
it belongs to use processing, but on a stmt granularity.  Thus between
iterating over all uses and iteration over all defs call promote_cst_in_stmt
on all stmts.  It's a bit awkward as it expects to be called from context
that knows whether promotion is necessary or not.

/* Create an ssa with TYPE to copy ssa VAR.  */
static tree
make_promoted_copy (tree var, gimple *def_stmt, tree type)
{
  tree new_lhs = make_ssa_name (type, def_stmt);
  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
  return new_lhs;
}

as you are generating a copy statement I don't see why you need to copy
SSA_NAME_OCCURS_IN_ABNORMAL_PHI (in no case new_lhs will
be used in a PHI node directly AFAICS).  Merging make_promoted_copy
and the usually following extension stmt generation plus insertion into
a single helper would make that obvious.

static unsigned int
fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
           use_operand_p op, tree use)
{
  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
  /* If USE is not promoted, nothing to do.  */
  if (!info)
    return 0;

You should use ->get (), not ->get_or_insert here.

      gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR,
                                               use, NULL_TREE);

you can avoid the trailing NULL_TREE here.

        gimple *copy_stmt =
          zero_sign_extend_stmt (temp, use,
                                 TYPE_UNSIGNED (old_type),
                                 TYPE_PRECISION (old_type));

coding style says the '=' goes to the next line, thus

    gimple *copy_stmt
       = zero_sign_extend_stmt ...

/* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits.
   Assign the zero/sign extended value in NEW_VAR.  gimple statement
   that performs the zero/sign extension is returned.  */
static gimple *
zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width)
{

looks like instead of unsigned_p/width you can pass in a  type instead.

    /* Sign extend.  */
    stmt = gimple_build_assign (new_var,
                                SEXT_EXPR,
                                var, build_int_cst (TREE_TYPE (var), width));

use size_int (width) instead.

/* Convert constant CST to TYPE.  */
static tree
convert_int_cst (tree type, tree cst, signop sign = SIGNED)

no need for a default argument

{
  wide_int wi_cons = fold_convert (type, cst);
  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
  return wide_int_to_tree (type, wi_cons);
}

I wonder why this function is needed at all and you don't just call
fold_convert (type, cst)?

/* Return true if the tree CODE needs the propmoted operand to be
   truncated (when stray bits are set beyond the original type in
   promoted mode) to preserve the semantics.  */
static bool
truncate_use_p (enum tree_code code)
{

a conservatively correct predicate would implement the inversion,
not_truncated_use_p because if you miss any tree code the
result will be unnecessary rather than missed truncations.

static bool
type_precision_ok (tree type)
{
  return (TYPE_PRECISION (type)
          == GET_MODE_PRECISION (TYPE_MODE (type)));
}

/* Return the promoted type for TYPE.  */
static tree
get_promoted_type (tree type)
{
  tree promoted_type;
  enum machine_mode mode;
  int uns;

  if (POINTER_TYPE_P (type)
      || !INTEGRAL_TYPE_P (type)
      || !type_precision_ok (type))

the type_precision_ok check is because SEXT doesn't work
properly for bitfield types?  I think we want to promote those
to their mode precision anyway.  We just need to use
sth different than SEXT here (the bitwise-and works of course)
or expand SEXT from non-mode precision differently (see
expr.c REDUCE_BIT_FIELD which expands it as a
lshift/rshift combo).  Eventually this can be left for a followup
though it might get you some extra testing coverage on
non-promote-mode targets.

/* Return true if ssa NAME is already considered for promotion.  */
static bool
ssa_promoted_p (tree name)
{
  if (TREE_CODE (name) == SSA_NAME)
    {
      unsigned int index = SSA_NAME_VERSION (name);
      if (index < n_ssa_val)
        return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
    }
  return true;

better than this default assert you pass in an SSA name.

isn't the bitmap somewhat redundant with the hash-map?
And you could combine both by using a vec<ssa_name_info *> indexed
by SSA_NAME_VERSION ()?

         if ((TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
               == tcc_comparison)
              || truncate_use_p (gimple_assign_rhs_code (stmt)))

you always check for tcc_omparison when checking for truncate_use_p
so just handle it there (well, as said above, implement conservative
predicates).

  switch (gimple_code (stmt))
    {
    case GIMPLE_ASSIGN:
      if (promote_cond
          && gimple_assign_rhs_code (stmt) == COND_EXPR)
        {

looking at all callers this condition is never true.

          tree new_op = build2 (TREE_CODE (op), type, op0, op1);

as tcc_comparison class trees are not shareable you don't
need to build2 but can directly set TREE_OPERAND (op, ..) to the
promoted value.  Note that rhs1 may still just be an SSA name
and not a comparison.

    case GIMPLE_PHI:
        {
          /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
          gphi *phi = as_a <gphi *> (stmt);
          FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
            {
              op = USE_FROM_PTR (oprnd);
              index = PHI_ARG_INDEX_FROM_USE (oprnd);
              if (TREE_CODE (op) == INTEGER_CST)
                SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
            }

static unsigned int
fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
           use_operand_p op, tree use)
{
  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
  /* If USE is not promoted, nothing to do.  */
  if (!info)
    return 0;

  tree promoted_type = info->promoted_type;
  tree old_type = info->type;
  bool do_not_promote = false;

  switch (gimple_code (stmt))
    {
 ....
    default:
      break;
    }

do_not_promote = false is not conservative.  Please place a
gcc_unreachable () in the default case.

I see you handle debug stmts here but that case cannot be reached.

/* Promote use in GIMPLE_DEBUG stmts. Do this separately to avoid generating
   different sequence with and without -g.  This can  happen when promoting
   SSA that are defined with GIMPLE_NOP.  */

but that's only because you choose to unconditionally handle GIMPLE_NOP uses...

Richard.


> Thanks,
> Richard.
>
>> Thanks,
>> Kugan
>>
>>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]