Basic kill analysis for modref

Jeff Law jeffreyalaw@gmail.com
Mon Nov 15 19:00:13 GMT 2021



On 11/15/2021 11:51 AM, H.J. Lu via Gcc-patches wrote:
> On Sun, Nov 14, 2021 at 10:53 AM Jan Hubicka via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>>> I think you want get_addr_base_and_unit_offset here.  All
>>>> variable indexed addresses are in separate stmts.  That also means
>>>> you can eventually work with just byte sizes/offsets?
>>> Will do.  The access range in modref summary is bit based (since we want
>>> to disabiguate bitfields like we do in rest of alias oracle) but indeed
>>> this part cna be in bytes.
>> Actually after the unifiation I can just use get_ao_ref which will call
>> ao_ref_init_from_ptr_and_range that has all the logic I need in there.
>> I also noticed that I ended up duplicating the code matching bases
>> and ranges which is done already twice in the function - once for store
>> targets and once for MEMSET and friends.  The later copy lacked overflow
>> checks so I took the first copy and moved it to helper function.  This
>> makes the gimple part of patch really straighforward: just build ao_ref
>> if possible and then pass it to this function.
>>
>> I also added statistics.
>>
>> I have bootstrapped/regtsed on x86_64-linux the updated patch and
>> comitted it so I can break out the patches that depends on it.
>> I have patch improving the kill tracking at modref side and also the
>> kill oracle itself can use fnspec and does not need to special case
>> mem* functions.
>>
>> For cc1plus LTO link I now get:
>>
>> Alias oracle query stats:
>>    refs_may_alias_p: 76106130 disambiguations, 100928932 queries
>>    on_includes: 12539931 disambiguations, 39864841 queries
>>    ref_maybe_used_by_call_p: 625857 disambiguations, 77138089 queries
>>    call_may_clobber_ref_p: 366420 disambiguations, 369293 queries
>>    stmt_kills_ref_p: 107503 kills, 5699589 queries
>>    nonoverlapping_component_refs_p: 0 disambiguations, 26176 queries
>>    nonoverlapping_refs_since_match_p: 30339 disambiguations, 65400 must overlaps, 96698 queries
>>    aliasing_component_refs_p: 57500 disambiguations, 15464678 queries
>>    TBAA oracle: 28248334 disambiguations 104710521 queries
>>                 15220245 are in alias set 0
>>                 8905994 queries asked about the same object
>>                 98 queries asked about the same alias set
>>                 0 access volatile
>>                 50371110 are dependent in the DAG
>>                 1964740 are aritificially in conflict with void *
>>
>> Modref stats:
>>    modref kill: 52 kills, 6655 queries
>>    modref use: 25204 disambiguations, 692151 queries
>>    modref clobber: 2309709 disambiguations, 21877806 queries
>>    5320532 tbaa queries (0.243193 per modref query)
>>    761785 base compares (0.034820 per modref query)
>>
>> PTA query stats:
>>    pt_solution_includes: 12539931 disambiguations, 39864841 queries
>>    pt_solutions_intersect: 1713075 disambiguations, 14023484 queries
>>
>> Newly we get statis of kill oracle itself:
>>    stmt_kills_ref_p: 107503 kills, 5699589 queries
>> and the modref part:
>>    modref kill: 52 kills, 6655 queries
>> So an improvemnet over 1 kill using modref I had before. Still not
>> really great.
>>
>> Honza
>>
>> gcc/ChangeLog:
>>
>>              * ipa-modref-tree.c (modref_access_node::update_for_kills): New
>>              member function.
>>              (modref_access_node::merge_for_kills): Likewise.
>>              (modref_access_node::insert_kill): Likewise.
>>              * ipa-modref-tree.h (modref_access_node::update_for_kills,
>>              modref_access_node::merge_for_kills, modref_access_node::insert_kill):
>>              Declare.
>>              (modref_access_node::useful_for_kill): New member function.
>>              * ipa-modref.c (modref_summary::useful_p): Release useless kills.
>>              (lto_modref_summary): Add kills.
>>              (modref_summary::dump): Dump kills.
>>              (record_access): Add mdoref_access_node parameter.
>>              (record_access_lto): Likewise.
>>              (merge_call_side_effects): Merge kills.
>>              (analyze_call): Add ALWAYS_EXECUTED param and pass it around.
>>              (struct summary_ptrs): Add always_executed filed.
>>              (analyze_load): Update.
>>              (analyze_store): Update; record kills.
>>              (analyze_stmt): Add always_executed; record kills in clobbers.
>>              (analyze_function): Track always_executed.
>>              (modref_summaries::duplicate): Duplicate kills.
>>              (update_signature): Release kills.
>>              * ipa-modref.h (struct modref_summary): Add kills.
>>              * tree-ssa-alias.c (alias_stats): Add kill stats.
>>              (dump_alias_stats): Dump kill stats.
>>              (store_kills_ref_p): Break out from ...
>>              (stmt_kills_ref_p): Use it; handle modref info based kills.
>>
>>      gcc/testsuite/ChangeLog:
>>
>>      2021-11-14  Jan Hubicka  <hubicka@ucw.cz>
>>
>>              * gcc.dg/tree-ssa/modref-dse-3.c: New test.
>>
>>
>> diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c
>> index 6fc2b7298f4..bbe23a5a211 100644
>> --- a/gcc/ipa-modref-tree.c
>> +++ b/gcc/ipa-modref-tree.c
>> @@ -638,6 +638,185 @@ modref_access_node::get_ao_ref (const gcall *stmt, ao_ref *ref) const
>>     return true;
>>   }
>>
>> +/* Return true A is a subkill.  */
>> +bool
>> +modref_access_node::contains_for_kills (const modref_access_node &a) const
>> +{
>> +  poly_int64 aoffset_adj = 0;
>> +
>> +  gcc_checking_assert (parm_index != MODREF_UNKNOWN_PARM
>> +                      && a.parm_index != MODREF_UNKNOWN_PARM);
>> +  if (parm_index != a.parm_index)
>> +    return false;
>> +  gcc_checking_assert (parm_offset_known && a.parm_offset_known);
>> +  aoffset_adj = (a.parm_offset - parm_offset)
>> +               * BITS_PER_UNIT;
>> +  gcc_checking_assert (range_info_useful_p () && a.range_info_useful_p ());
>> +  return known_subrange_p (a.offset + aoffset_adj,
>> +                          a.max_size, offset, max_size);
>> +}
>> +
>> +/* Merge two ranges both starting at parm_offset1 and update THIS
>> +   with result.  */
>> +bool
>> +modref_access_node::update_for_kills (poly_int64 parm_offset1,
>> +                                     poly_int64 offset1,
>> +                                     poly_int64 max_size1,
>> +                                     poly_int64 offset2,
>> +                                     poly_int64 max_size2,
>> +                                     bool record_adjustments)
>> +{
>> +  if (known_le (offset1, offset2))
>> +    ;
>> +  else if (known_le (offset2, offset1))
>> +    {
>> +      std::swap (offset1, offset2);
>> +      std::swap (max_size1, max_size2);
>> +    }
>> +  else
>> +    gcc_unreachable ();
>> +
>> +  poly_int64 new_max_size = max_size2 + offset2 - offset1;
>> +  if (known_le (new_max_size, max_size1))
>> +    new_max_size = max_size1;
>> +  if (known_eq (parm_offset, parm_offset1)
>> +      && known_eq (offset, offset1)
>> +      && known_eq (size, new_max_size)
>> +      && known_eq (max_size, new_max_size))
>> +    return false;
>> +
>> +  if (!record_adjustments
>> +      || (++adjustments) < param_modref_max_adjustments)
>> +    {
>> +      parm_offset = parm_offset1;
>> +      offset = offset1;
>> +      max_size = new_max_size;
>> +      size = new_max_size;
>> +      gcc_checking_assert (useful_for_kill_p ());
>> +      return true;
>> +    }
>> +  return false;
>> +}
>> +
>> +/* Merge in access A if it is possible to do without losing
>> +   precision.  Return true if successful.
>> +   Unlike merge assume that both accesses are always executed
>> +   and merge size the same was as max_size.  */
>> +bool
>> +modref_access_node::merge_for_kills (const modref_access_node &a,
>> +                                    bool record_adjustments)
>> +{
>> +  poly_int64 offset1 = 0;
>> +  poly_int64 aoffset1 = 0;
>> +  poly_int64 new_parm_offset = 0;
>> +
>> +  /* We assume that containment was tested earlier.  */
>> +  gcc_checking_assert (!contains_for_kills (a) && !a.contains_for_kills (*this)
>> +                      && useful_for_kill_p () && a.useful_for_kill_p ());
>> +
>> +  if (parm_index != a.parm_index
>> +      || !combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
>> +    return false;
>> +
>> +  if (known_le (offset1, aoffset1))
>> +   {
>> +     if (!known_size_p (max_size)
>> +        || known_ge (offset1 + max_size, aoffset1))
>> +       return update_for_kills (new_parm_offset, offset1, max_size,
>> +                               aoffset1, a.max_size, record_adjustments);
>> +   }
>> +  else if (known_le (aoffset1, offset1))
>> +   {
>> +     if (!known_size_p (a.max_size)
>> +        || known_ge (aoffset1 + a.max_size, offset1))
>> +       return update_for_kills (new_parm_offset, offset1, max_size,
>> +                               aoffset1, a.max_size, record_adjustments);
>> +   }
>> +  return false;
>> +}
>> +
>> +/* Insert new kill A into KILLS.  If RECORD_ADJUSTMENTS is true limit number
>> +   of changes to each entry.  Return true if something changed.  */
>> +
>> +bool
>> +modref_access_node::insert_kill (vec<modref_access_node> &kills,
>> +                                modref_access_node &a, bool record_adjustments)
>> +{
>> +  size_t index;
>> +  modref_access_node *a2;
>> +  bool merge = false;
>> +
>> +  gcc_checking_assert (a.useful_for_kill_p ());
>> +
>> +  /* See if we have corresponding entry already or we can merge with
>> +     neighbouring entry.  */
>> +  FOR_EACH_VEC_ELT (kills, index, a2)
>> +    {
>> +      if (a2->contains_for_kills (a))
>> +       return false;
>> +      if (a.contains_for_kills (*a2))
>> +       {
>> +         a.adjustments = 0;
>> +         *a2 = a;
>> +         merge = true;
>> +         break;
>> +       }
>> +      if (a2->merge_for_kills (a, record_adjustments))
>> +       {
>> +         merge = true;
>> +         break;
>> +       }
>> +    }
>> +  /* If entry was not found, insert it.  */
>> +  if (!merge)
>> +    {
>> +      if ((int)kills.length () >= param_modref_max_accesses)
>> +       {
>> +         if (dump_file)
>> +           fprintf (dump_file,
>> +                    "--param param=modref-max-accesses limit reached:");
>> +         return false;
>> +       }
>> +      a.adjustments = 0;
>> +      kills.safe_push (a);
>> +      return true;
>> +    }
>> +  /* Extending range in an entry may make it possible to merge it with
>> +     other entries.  */
>> +  size_t i;
>> +
>> +  for (i = 0; i < kills.length ();)
>> +    if (i != index)
>> +      {
>> +       bool found = false, restart = false;
>> +       modref_access_node *a = &kills[i];
>> +       modref_access_node *n = &kills[index];
>> +
>> +       if (n->contains_for_kills (*a))
>> +         found = true;
>> +       if (!found && n->merge_for_kills (*a, false))
>> +         found = restart = true;
>> +       gcc_checking_assert (found || !a->merge_for_kills (*n, false));
>> +       if (found)
>> +         {
>> +           kills.unordered_remove (i);
>> +           if (index == kills.length ())
>> +             {
>> +               index = i;
>> +               i++;
>> +             }
>> +           if (restart)
>> +             i = 0;
>> +         }
>> +       else
>> +         i++;
>> +      }
>> +    else
>> +      i++;
>> +  return true;
>> +}
>> +
>> +
>>   #if CHECKING_P
>>
>>   namespace selftest {
>> diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
>> index 2fcabe480bd..1bf2aa8460e 100644
>> --- a/gcc/ipa-modref-tree.h
>> +++ b/gcc/ipa-modref-tree.h
>> @@ -82,6 +82,13 @@ struct GTY(()) modref_access_node
>>       {
>>         return parm_index != MODREF_UNKNOWN_PARM;
>>       }
>> +  /* Return true if access can be used to determine a kill.  */
>> +  bool useful_for_kill_p () const
>> +    {
>> +      return parm_offset_known && parm_index != MODREF_UNKNOWN_PARM
>> +            && parm_index != MODREF_RETSLOT_PARM && known_size_p (size)
>> +            && known_eq (max_size, size);
>> +    }
>>     /* Dump range to debug OUT.  */
>>     void dump (FILE *out);
>>     /* Return true if both accesses are the same.  */
>> @@ -98,10 +105,18 @@ struct GTY(()) modref_access_node
>>     static int insert (vec <modref_access_node, va_gc> *&accesses,
>>                       modref_access_node a, size_t max_accesses,
>>                       bool record_adjustments);
>> +  /* Same as insert but for kills where we are conservative the other way
>> +     around: if information is lost, the kill is lost.  */
>> +  static bool insert_kill (vec<modref_access_node> &kills,
>> +                          modref_access_node &a, bool record_adjustments);
>>   private:
>>     bool contains (const modref_access_node &) const;
>> +  bool contains_for_kills (const modref_access_node &) const;
>>     void update (poly_int64, poly_int64, poly_int64, poly_int64, bool);
>> +  bool update_for_kills (poly_int64, poly_int64, poly_int64,
>> +                        poly_int64, poly_int64, bool);
>>     bool merge (const modref_access_node &, bool);
>> +  bool merge_for_kills (const modref_access_node &, bool);
>>     static bool closer_pair_p (const modref_access_node &,
>>                               const modref_access_node &,
>>                               const modref_access_node &,
>> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
>> index b75ed84135b..df4612bbff9 100644
>> --- a/gcc/ipa-modref.c
>> +++ b/gcc/ipa-modref.c
>> @@ -335,6 +335,8 @@ modref_summary::useful_p (int ecf_flags, bool check_flags)
>>       return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE));
>>     if (loads && !loads->every_base)
>>       return true;
>> +  else
>> +    kills.release ();
>>     if (ecf_flags & ECF_PURE)
>>       return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE));
>>     return stores && !stores->every_base;
>> @@ -351,6 +353,7 @@ struct GTY(()) modref_summary_lto
>>        more verbose and thus more likely to hit the limits.  */
>>     modref_records_lto *loads;
>>     modref_records_lto *stores;
>> +  auto_vec<modref_access_node> GTY((skip)) kills;
>>     auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
>>     eaf_flags_t retslot_flags;
>>     eaf_flags_t static_chain_flags;
>> @@ -570,6 +573,15 @@ modref_summary::dump (FILE *out)
>>         fprintf (out, "  stores:\n");
>>         dump_records (stores, out);
>>       }
>> +  if (kills.length ())
>> +    {
>> +      fprintf (out, "  kills:\n");
>> +      for (auto kill : kills)
>> +       {
>> +         fprintf (out, "    ");
>> +         kill.dump (out);
>> +       }
>> +    }
>>     if (writes_errno)
>>       fprintf (out, "  Writes errno\n");
>>     if (side_effects)
>> @@ -762,13 +774,12 @@ get_access (ao_ref *ref)
>>   /* Record access into the modref_records data structure.  */
>>
>>   static void
>> -record_access (modref_records *tt, ao_ref *ref)
>> +record_access (modref_records *tt, ao_ref *ref, modref_access_node &a)
>>   {
>>     alias_set_type base_set = !flag_strict_aliasing ? 0
>>                              : ao_ref_base_alias_set (ref);
>>     alias_set_type ref_set = !flag_strict_aliasing ? 0
>>                              : (ao_ref_alias_set (ref));
>> -  modref_access_node a = get_access (ref);
>>     if (dump_file)
>>       {
>>          fprintf (dump_file, "   - Recording base_set=%i ref_set=%i ",
>> @@ -781,7 +792,7 @@ record_access (modref_records *tt, ao_ref *ref)
>>   /* IPA version of record_access_tree.  */
>>
>>   static void
>> -record_access_lto (modref_records_lto *tt, ao_ref *ref)
>> +record_access_lto (modref_records_lto *tt, ao_ref *ref, modref_access_node &a)
>>   {
>>     /* get_alias_set sometimes use different type to compute the alias set
>>        than TREE_TYPE (base).  Do same adjustments.  */
>> @@ -828,7 +839,6 @@ record_access_lto (modref_records_lto *tt, ao_ref *ref)
>>                         || variably_modified_type_p (ref_type, NULL_TREE)))
>>          ref_type = NULL_TREE;
>>       }
>> -  modref_access_node a = get_access (ref);
>>     if (dump_file)
>>       {
>>         fprintf (dump_file, "   - Recording base type:");
>> @@ -932,13 +942,17 @@ bool
>>   merge_call_side_effects (modref_summary *cur_summary,
>>                           gimple *stmt, modref_summary *callee_summary,
>>                           bool ignore_stores, cgraph_node *callee_node,
>> -                        bool record_adjustments)
>> +                        bool record_adjustments, bool always_executed)
>>   {
>>     auto_vec <modref_parm_map, 32> parm_map;
>>     modref_parm_map chain_map;
>>     bool changed = false;
>>     int flags = gimple_call_flags (stmt);
>>
>> +  if ((flags & (ECF_CONST | ECF_NOVOPS))
>> +      && !(flags & ECF_LOOPING_CONST_OR_PURE))
>> +    return changed;
>> +
>>     if (!cur_summary->side_effects && callee_summary->side_effects)
>>       {
>>         if (dump_file)
>> @@ -950,6 +964,38 @@ merge_call_side_effects (modref_summary *cur_summary,
>>     if (flags & (ECF_CONST | ECF_NOVOPS))
>>       return changed;
>>
>> +  if (always_executed
>> +      && callee_summary->kills.length ()
>> +      && (!cfun->can_throw_non_call_exceptions
>> +         || !stmt_could_throw_p (cfun, stmt)))
>> +    {
>> +      /* Watch for self recursive updates.  */
>> +      auto_vec<modref_access_node, 32> saved_kills;
>> +
>> +      saved_kills.reserve_exact (callee_summary->kills.length ());
>> +      saved_kills.splice (callee_summary->kills);
>> +      for (auto kill : saved_kills)
>> +       {
>> +         if (kill.parm_index >= (int)parm_map.length ())
>> +           continue;
>> +         modref_parm_map &m
>> +                 = kill.parm_index == MODREF_STATIC_CHAIN_PARM
>> +                   ? chain_map
>                       ^^^^^^^^^^^^^^^^  chain_map  isn't initialized.
>
> This caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262
Yup.  It's causing heartburn in various ways in the tester.  I was just 
tracking it down with valgrind...
jeff



More information about the Gcc-patches mailing list