This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Fix PR c++/70452 (regression in C++ parsing performance)
- From: Patrick Palka <patrick at parcs dot ath dot cx>
- To: Trevor Saunders <tbsaunde at tbsaunde dot org>
- Cc: Jason Merrill <jason at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Sun, 3 Apr 2016 18:06:05 -0400
- Subject: Re: [PATCH] Fix PR c++/70452 (regression in C++ parsing performance)
- Authentication-results: sourceware.org; auth=none
- References: <1459538004-13812-1-git-send-email-patrick at parcs dot ath dot cx> <56FF202C dot 70103 at redhat dot com> <alpine dot DEB dot 2 dot 20 dot 11 dot 1604021715510 dot 1308 at idea> <20160403035729 dot GF10837 at ball>
On Sat, Apr 2, 2016 at 11:57 PM, Trevor Saunders <tbsaunde@tbsaunde.org> wrote:
> On Sat, Apr 02, 2016 at 05:18:31PM -0400, Patrick Palka wrote:
>> On Fri, 1 Apr 2016, Jason Merrill wrote:
>>
>> > I like this approach a lot. One thing, though:
>> >
>> > On 04/01/2016 03:13 PM, Patrick Palka wrote:
>> > > +struct GTY((chain_next ("%h.prev"))) bpr_entry
>> > > +{
>> > > + tree body;
>> > > + tree parms;
>> > > + tree res;
>> > > + struct bpr_entry *prev;
>> > > +};
>> > > +
>> > > /* Representation of entries in the constexpr function definition table.
>> > > */
>> > >
>> > > struct GTY((for_user)) constexpr_fundef {
>> > > tree decl;
>> > > tree body;
>> > > + /* A chain of unused copies of this function's body awaiting reuse for
>> > > + CALL_EXPR evaluation. */
>> > > + struct bpr_entry *bpr_freelist;
>> > > };
>> >
>> > The freelist should be discarded on GC. I think the way to achieve that is to
>> > put all the freelists into a separate deletable hash table rather than hang
>> > them off the fundefs.
>> >
>> > Jason
>> >
>> >
>>
>> Here's a version that uses a separate deletable table to cache the
>> function copies. For simplicity I used a hash_map instead of a
>> hash_table. Does this look OK to commit after bootstrap + regtest?
>>
>> On a related note, I noticed that the constexpr_call_table is not marked
>> deletable. Marking it deletable speeds up the test case in the PR by
>> about 10% and saves about 10MB. Do you think doing so is a good idea?
>>
>> On another related note, I noticed that marking something as both
>> GTY((deletable, cache)) doesn't work as intended, because the
>> gt_cleare_cache functions run _after_ all deletable roots get
>> zeroed out. So during GC the gt_cleare_cache function of a root
>> marked "deletable, cache" would always be a no-op. Concretely I think
>> this means that our cv_cache and fold_cache leak memory during GC
>> because their underlying hash_map (allocated by operator new) is zeroed
>> before gc_cleare_cache could clear it.
>
> yeah, I think that's true. I'm not really sure what the expected point
> of making something both cache and deletable is maybe we should just ban
> it? The way this ties in with the gc is... wierd ;) but I'm not really
> sure how I'd fix it.
>
> Trev
The intended effect can be achieved I think by marking the cache_map
as GTY((cache)) (so that the table can be cleared during GC) and then
marking the cache_map::map field as GTY((skip)) (so that PCH ignores
it).
>
>>
>> gcc/cp/ChangeLog:
>>
>> PR c++/70452
>> * constexpr.c (struct fun_copy): New struct.
>> (struct fundef_copies_t): New struct.
>> (fundef_copies_table): New static variable.
>> (maybe_initialize_fundef_copies_table): New static function.
>> (get_fun_copy): New static function.
>> (save_fun_copy): New static function.
>> (cxx_eval_call_expression): Use
>> maybe_initialize_fundef_copies_table, get_fun_copy, and
>> save_fun_copy.
>>
>> gcc/testsuite/ChangeLog:
>>
>> PR c++/70452
>> * g++.dg/ext/constexpr-vla4.C: New test.
>> ---
>> gcc/cp/constexpr.c | 96 +++++++++++++++++++++++++++++--
>> gcc/testsuite/g++.dg/ext/constexpr-vla4.C | 17 ++++++
>> 2 files changed, 109 insertions(+), 4 deletions(-)
>> create mode 100644 gcc/testsuite/g++.dg/ext/constexpr-vla4.C
>>
>> diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
>> index ea605dc..e1a5a4e 100644
>> --- a/gcc/cp/constexpr.c
>> +++ b/gcc/cp/constexpr.c
>> @@ -965,6 +965,76 @@ maybe_initialize_constexpr_call_table (void)
>> constexpr_call_table = hash_table<constexpr_call_hasher>::create_ggc (101);
>> }
>>
>> +/* The representation of a single node in the per-function freelist maintained
>> + by FUNDEF_COPIES_TABLE. */
>> +
>> +struct fun_copy
>> +{
>> + tree body;
>> + tree parms;
>> + tree res;
>> + fun_copy *prev;
>> +};
>> +
>> +/* During constexpr CALL_EXPR evaluation, to avoid issues with sharing when
>> + a function happens to get called recursively, we unshare the callee
>> + function's body and evaluate this unshared copy instead of evaluating the
>> + original body.
>> +
>> + FUNDEF_COPIES_TABLE is a per-function freelist of these unshared function
>> + copies. The underlying data structure of FUNDEF_COPIES_TABLE is a hash_map
>> + that's keyed off of the original FUNCTION_DECL and whose value is the chain
>> + of this function's unused copies awaiting reuse. */
>> +
>> +struct fundef_copies_table_t
>> +{
>> + hash_map<tree, fun_copy *> *map;
>> +};
>> +
>> +static GTY((deletable)) fundef_copies_table_t fundef_copies_table;
>> +
>> +/* Initialize FUNDEF_COPIES_TABLE if it's not initialized. */
>> +
>> +static void
>> +maybe_initialize_fundef_copies_table ()
>> +{
>> + if (fundef_copies_table.map == NULL)
>> + fundef_copies_table.map = hash_map<tree, fun_copy *>::create_ggc (101);
>> +}
>> +
>> +/* Reuse or create a new unshared copy of the function FUN.
>> + Return this copy. */
>> +
>> +static fun_copy *
>> +get_fun_copy (tree fun)
>> +{
>> + fun_copy *copy;
>> + fun_copy **slot = &fundef_copies_table.map->get_or_insert (fun, NULL);
>> + if (*slot == NULL)
>> + {
>> + copy = ggc_alloc<fun_copy> ();
>> + copy->body = copy_fn (fun, copy->parms, copy->res);
>> + copy->prev = NULL;
>> + }
>> + else
>> + {
>> + copy = *slot;
>> + *slot = (*slot)->prev;
>> + }
>> +
>> + return copy;
>> +}
>> +
>> +/* Save the copy COPY of function FUN for later reuse by get_fun_copy(). */
>> +
>> +static void
>> +save_fun_copy (tree fun, fun_copy *copy)
>> +{
>> + fun_copy **slot = &fundef_copies_table.map->get_or_insert (fun, NULL);
>> + copy->prev = *slot;
>> + *slot = copy;
>> +}
>> +
>> /* We have an expression tree T that represents a call, either CALL_EXPR
>> or AGGR_INIT_EXPR. If the call is lexically to a named function,
>> retrun the _DECL for that function. */
>> @@ -1377,10 +1447,14 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
>> if (!result || result == error_mark_node)
>> {
>> gcc_assert (DECL_SAVED_TREE (fun));
>> - tree parms, res;
>> + tree body, parms, res;
>>
>> - /* Unshare the whole function body. */
>> - tree body = copy_fn (fun, parms, res);
>> + /* Reuse or create a new unshared copy of this function's body. */
>> + maybe_initialize_fundef_copies_table ();
>> + fun_copy *copy = get_fun_copy (fun);
>> + body = copy->body;
>> + parms = copy->parms;
>> + res = copy->res;
>>
>> /* Associate the bindings with the remapped parms. */
>> tree bound = new_call.bindings;
>> @@ -1409,8 +1483,14 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
>> else
>> ctx->values->put (res, NULL_TREE);
>>
>> + /* Track the callee's evaluated SAVE_EXPRs so that we can forget
>> + their values after the call. */
>> + constexpr_ctx ctx_with_save_exprs = *ctx;
>> + hash_set<tree> save_exprs;
>> + ctx_with_save_exprs.save_exprs = &save_exprs;
>> +
>> tree jump_target = NULL_TREE;
>> - cxx_eval_constant_expression (ctx, body,
>> + cxx_eval_constant_expression (&ctx_with_save_exprs, body,
>> lval, non_constant_p, overflow_p,
>> &jump_target);
>>
>> @@ -1435,6 +1515,11 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
>> }
>> }
>>
>> + /* Forget the saved values of the callee's SAVE_EXPRs. */
>> + for (hash_set<tree>::iterator iter = save_exprs.begin();
>> + iter != save_exprs.end(); ++iter)
>> + ctx_with_save_exprs.values->remove (*iter);
>> +
>> /* Remove the parms/result from the values map. Is it worth
>> bothering to do this when the map itself is only live for
>> one constexpr evaluation? If so, maybe also clear out
>> @@ -1444,6 +1529,9 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
>> ctx->values->remove (slot);
>> for (tree parm = parms; parm; parm = TREE_CHAIN (parm))
>> ctx->values->remove (parm);
>> +
>> + /* Make the unshared function copy we used available for re-use. */
>> + save_fun_copy (fun, copy);
>> }
>>
>> if (result == error_mark_node)
>> diff --git a/gcc/testsuite/g++.dg/ext/constexpr-vla4.C b/gcc/testsuite/g++.dg/ext/constexpr-vla4.C
>> new file mode 100644
>> index 0000000..428a8fd
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/ext/constexpr-vla4.C
>> @@ -0,0 +1,17 @@
>> +// PR c++/70452
>> +// { dg-do compile { target c++14 } }
>> +
>> +constexpr int
>> +foo (int n, bool p)
>> +{
>> + __extension__ int a [n] = { 0 };
>> + if (n == 3)
>> + foo (n - 2, false);
>> + if (n == 3)
>> + foo(n - 1, true);
>> + if (p)
>> + return a[1];
>> + return 0;
>> +}
>> +
>> +constexpr int i2 = foo (3, false); // { dg-bogus "array subscript out of bound" }
>> --
>> 2.8.0.rc3.27.gade0865
>>