This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC] propagate malloc attribute in ipa-pure-const pass


On 23 May 2017 at 19:10, Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
> On 19 May 2017 at 19:02, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>
>>> * LTO and memory management
>>> This is a general question about LTO and memory management.
>>> IIUC the following sequence takes place during normal LTO:
>>> LGEN: generate_summary, write_summary
>>> WPA: read_summary, execute ipa passes, write_opt_summary
>>>
>>> So I assumed it was OK in LGEN to allocate return_callees_map in
>>> generate_summary and free it in write_summary and during WPA, allocate
>>> return_callees_map in read_summary and free it after execute (since
>>> write_opt_summary does not require return_callees_map).
>>>
>>> However, with fat LTO the sequence seems to change for LGEN: the
>>> execute phase takes place after write_summary. Since
>>> return_callees_map is freed in pure_const_write_summary and
>>> propagate_malloc() accesses it in the execute stage, this results in a
>>> segmentation fault.
>>>
>>> To work around this, I am using the following hack in pure_const_write_summary:
>>> // FIXME: Do not free if -ffat-lto-objects is enabled.
>>> if (!global_options.x_flag_fat_lto_objects)
>>>   free_return_callees_map ();
>>> Is there a better approach for handling this ?
>>
>> I think most passes just do not free summaries with -flto.  We probably want
>> to fix it to make it possible to compile multiple units i.e. from plugin by
>> adding release_summaries method...
>> So I would say it is OK to do the same as others do and leak it with -flto.
>>> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
>>> index e457166ea39..724c26e03f6 100644
>>> --- a/gcc/ipa-pure-const.c
>>> +++ b/gcc/ipa-pure-const.c
>>> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "tree-scalar-evolution.h"
>>>  #include "intl.h"
>>>  #include "opts.h"
>>> +#include "ssa.h"
>>>
>>>  /* Lattice values for const and pure functions.  Everything starts out
>>>     being const, then may drop to pure and then neither depending on
>>> @@ -69,6 +70,15 @@ enum pure_const_state_e
>>>
>>>  const char *pure_const_names[3] = {"const", "pure", "neither"};
>>>
>>> +enum malloc_state_e
>>> +{
>>> +  PURE_CONST_MALLOC_TOP,
>>> +  PURE_CONST_MALLOC,
>>> +  PURE_CONST_MALLOC_BOTTOM
>>> +};
>>
>> It took me a while to work out what PURE_CONST means here :)
>> I would just call it something like STATE_MALLOC_TOP... or so.
>> ipa_pure_const is an outdated name from the time the pass was doing only
>> those two.
>>> @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;
>>>
>>>  static vec<funct_state> funct_state_vec;
>>>
>>> +/* A map from node to subset of callees. The subset contains those callees
>>> + * whose return-value is returned by the node. */
>>> +static hash_map< cgraph_node *, vec<cgraph_node *>* > *return_callees_map;
>>> +
>>
>> Hehe, a special case of return jump function.  We ought to support those more generally.
>> How do you keep it up to date over callgraph changes?
>>> @@ -921,6 +1055,23 @@ end:
>>>    if (TREE_NOTHROW (decl))
>>>      l->can_throw = false;
>>>
>>> +  if (ipa)
>>> +    {
>>> +      vec<cgraph_node *> v = vNULL;
>>> +      l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
>>> +      if (DECL_IS_MALLOC (decl))
>>> +     l->malloc_state = PURE_CONST_MALLOC;
>>> +      else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
>>> +     {
>>> +       l->malloc_state = PURE_CONST_MALLOC_TOP;
>>> +       vec<cgraph_node *> *callees_p = new vec<cgraph_node *> (vNULL);
>>> +       for (unsigned i = 0; i < v.length (); ++i)
>>> +         callees_p->safe_push (v[i]);
>>> +       return_callees_map->put (fn, callees_p);
>>> +     }
>>> +      v.release ();
>>> +    }
>>> +
>>
>> I would do non-ipa variant, too.  I think most attributes can be detected that way
>> as well.
>>
>> The patch generally makes sense to me.  It would be nice to make it easier to write such
>> basic propagators across the callgraph (perhaps by adding a template doing the basic
>> propagation logic). Also I think you need to solve the problem of keeping your
>> summaries up to date across callgraph node removal and duplication.
> Thanks for the suggestions, I will try to address them in a follow-up patch.
> IIUC, I would need to modify ipa-pure-const cgraph hooks -
> add_new_function, remove_node_data, duplicate_node_data
> to keep return_callees_map up-to-date across callgraph node insertions
> and removal ?
>
> Also, instead of having a separate data structure like return_callees_map,
> should we rather have a flag within cgraph_edge, which marks that the
> caller may return the value of the callee ?
Hi,
Sorry for the very late response. I have attached an updated version of
the prototype patch, which adds a non-ipa variant and keeps
return_callees_map up to date across callgraph node insertions and
removals. For the non-ipa variant, malloc_candidate_p() additionally
checks that all the "return callees" have DECL_IS_MALLOC set to true.
Bootstrapped+tested and LTO bootstrapped+tested on x86_64-unknown-linux-gnu.
Does it look OK so far?
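
For readers following the thread, here is a minimal, self-contained sketch
of the kind of propagation being discussed. It is a toy model, not the code
from the attached patch: ToyNode, propagate_malloc_toy and the string-keyed
std::map are hypothetical stand-ins for cgraph_node and return_callees_map,
and the state names follow the STATE_MALLOC_* naming suggested in the review.
A node that starts at TOP drops to BOTTOM if any callee whose value it
returns is BOTTOM (or unknown), and becomes MALLOC once all of them are.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy lattice: TOP = undecided candidate, MALLOC = known malloc-like,
// BOTTOM = known not to be malloc-like.
enum MallocState { STATE_MALLOC_TOP, STATE_MALLOC, STATE_MALLOC_BOTTOM };

// Hypothetical stand-in for a callgraph node plus its "return callees".
struct ToyNode {
  MallocState state;
  std::vector<std::string> return_callees;
};

// Iterate to a fixed point over all nodes that are still at TOP.
static void propagate_malloc_toy(std::map<std::string, ToyNode> &graph) {
  bool changed = true;
  while (changed) {
    changed = false;
    for (auto &entry : graph) {
      ToyNode &node = entry.second;
      if (node.state != STATE_MALLOC_TOP)
        continue;
      bool any_bottom = false;
      bool all_malloc = true;
      for (const std::string &name : node.return_callees) {
        auto it = graph.find(name);
        // A callee missing from the map is treated conservatively.
        MallocState callee =
            it == graph.end() ? STATE_MALLOC_BOTTOM : it->second.state;
        if (callee == STATE_MALLOC_BOTTOM)
          any_bottom = true;
        if (callee != STATE_MALLOC)
          all_malloc = false;
      }
      MallocState next = any_bottom   ? STATE_MALLOC_BOTTOM
                         : all_malloc ? STATE_MALLOC
                                      : STATE_MALLOC_TOP;
      if (next != node.state) {
        node.state = next;
        changed = true;
      }
    }
  }
}
```

Nodes that are still at TOP when the loop stops are those that only depend
on each other (cycles); this sketch leaves them undecided rather than
choosing a policy for them.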

Sorry for this silly question, but I don't really understand how
indirect call propagation works in ipa-pure-const. For example,
consider propagation of the nothrow attribute in the following
test case:

__attribute__((noinline, noclone, nothrow))
int f1(int k) { return k; }

__attribute__((noinline, noclone))
static int foo(int (*p)(int))
{
  return p(10);
}

__attribute__((noinline, noclone))
int bar(void)
{
  return foo(f1);
}

Shouldn't foo and bar also be marked as nothrow, since foo indirectly
calls f1, which is nothrow, and bar only calls foo?
The local-pure-const2 dump reports that the function is locally throwing
for both "foo" and "bar".

I was also wondering how to get "points-to" analysis for function pointers,
i.e. the list of callees that may be called indirectly through a given
function pointer.
In the patch I just set a node to bottom if it contains indirect calls,
which is far from ideal :(
I would be very grateful for suggestions on how to handle indirect calls.
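
As a rough illustration of that conservative choice, the sketch below picks
a starting lattice value from a per-function summary. Everything here is
hypothetical: ToyLocalSummary, its fields and toy_initial_state are made-up
names for illustration, and the ordering of the checks is an assumption,
not taken from the attached patch.

```cpp
// Toy lattice values; the names are illustrative only.
enum ToyMallocState { TOY_MALLOC_TOP, TOY_MALLOC, TOY_MALLOC_BOTTOM };

// Hypothetical per-function facts gathered by a local analysis.
struct ToyLocalSummary {
  bool declared_malloc;     // function already carries the malloc attribute
  bool malloc_candidate;    // return value looks like a fresh allocation
  bool has_indirect_calls;  // body calls through a function pointer
};

// Choose the initial state before any interprocedural propagation.
static ToyMallocState toy_initial_state(const ToyLocalSummary &s) {
  if (s.declared_malloc)
    return TOY_MALLOC;         // trust the existing attribute
  if (s.has_indirect_calls)
    return TOY_MALLOC_BOTTOM;  // unknown callees: give up (conservative)
  if (s.malloc_candidate)
    return TOY_MALLOC_TOP;     // undecided, to be resolved by propagation
  return TOY_MALLOC_BOTTOM;
}
```

Giving up on any indirect call is safe but loses cases where the possible
targets are in fact all known, which is the motivation for the question
about points-to information above.
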
Thanks!

Regards,
Prathamesh
>
> Thanks,
> Prathamesh
>>
>> Honza

Attachment: malloc-prop-0_16.diff
Description: Text document

