This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] propagate malloc attribute in ipa-pure-const pass
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: Prathamesh Kulkarni <prathamesh dot kulkarni at linaro dot org>
- Cc: gcc Patches <gcc-patches at gcc dot gnu dot org>, Richard Biener <rguenther at suse dot de>
- Date: Fri, 19 May 2017 15:32:12 +0200
- Subject: Re: [RFC] propagate malloc attribute in ipa-pure-const pass
- Authentication-results: sourceware.org; auth=none
- References: <CAAgBjMkaTCAvwO6AVKu5C6_-TqpuQtTcs0EqYHqogDhru3juYg@mail.gmail.com>
>
> * LTO and memory management
> This is a general question about LTO and memory management.
> IIUC the following sequence takes place during normal LTO:
> LGEN: generate_summary, write_summary
> WPA: read_summary, execute ipa passes, write_opt_summary
>
> So I assumed it was OK in LGEN to allocate return_callees_map in
> generate_summary and free it in write_summary and during WPA, allocate
> return_callees_map in read_summary and free it after execute (since
> write_opt_summary does not require return_callees_map).
>
> However with fat LTO, it seems the sequence changes for LGEN with
> execute phase takes place after write_summary. However since
> return_callees_map is freed in pure_const_write_summary and
> propagate_malloc() accesses it in execute stage, it results in
> segmentation fault.
>
> To work around this, I am using the following hack in pure_const_write_summary:
> // FIXME: Do not free if -ffat-lto-objects is enabled.
> if (!global_options.x_flag_fat_lto_objects)
> free_return_callees_map ();
> Is there a better approach for handling this ?
I think most passes just do not free summaries with -flto. We probably want
to fix it to make it possible to compile multiple units i.e. from plugin by
adding release_summaries method...
So I would say it is OK to do the same as others do and leak it with -flto.
> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
> index e457166ea39..724c26e03f6 100644
> --- a/gcc/ipa-pure-const.c
> +++ b/gcc/ipa-pure-const.c
> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3. If not see
> #include "tree-scalar-evolution.h"
> #include "intl.h"
> #include "opts.h"
> +#include "ssa.h"
>
> /* Lattice values for const and pure functions. Everything starts out
> being const, then may drop to pure and then neither depending on
> @@ -69,6 +70,15 @@ enum pure_const_state_e
>
> const char *pure_const_names[3] = {"const", "pure", "neither"};
>
> +enum malloc_state_e
> +{
> + PURE_CONST_MALLOC_TOP,
> + PURE_CONST_MALLOC,
> + PURE_CONST_MALLOC_BOTTOM
> +};
It took me a while to work out what PURE_CONST means here :)
I would just call it something like STATE_MALLOC_TOP... or so.
ipa_pure_const is outdated name from the time pass was doing only
those two.
> @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;
>
> static vec<funct_state> funct_state_vec;
>
> +/* A map from node to subset of callees. The subset contains those callees
> + * whose return-value is returned by the node. */
> +static hash_map< cgraph_node *, vec<cgraph_node *>* > *return_callees_map;
> +
Hehe, a special case of return jump function. We ought to support those more generally.
How do you keep it up to date over callgraph changes?
> @@ -921,6 +1055,23 @@ end:
> if (TREE_NOTHROW (decl))
> l->can_throw = false;
>
> + if (ipa)
> + {
> + vec<cgraph_node *> v = vNULL;
> + l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
> + if (DECL_IS_MALLOC (decl))
> + l->malloc_state = PURE_CONST_MALLOC;
> + else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
> + {
> + l->malloc_state = PURE_CONST_MALLOC_TOP;
> + vec<cgraph_node *> *callees_p = new vec<cgraph_node *> (vNULL);
> + for (unsigned i = 0; i < v.length (); ++i)
> + callees_p->safe_push (v[i]);
> + return_callees_map->put (fn, callees_p);
> + }
> + v.release ();
> + }
> +
I would do non-ipa variant, too. I think most attributes can be detected that way
as well.
The patch generally makes sense to me. It would be nice to make it easier to write such
a basic propagators across callgraph (perhaps adding a template doing the basic
propagation logic). Also I think you need to solve the problem with keeping your
summaries up to date across callgraph node removal and duplications.
Honza