[patch] PR38584, inline heuristics run quadratic stack var packing
Richard Guenther
richard.guenther@gmail.com
Sat Jan 3 14:58:00 GMT 2009
On Sat, Jan 3, 2009 at 3:10 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> Hi,
>
> The attached patch simplifies the stack size estimate for the inline heuristics.
>
> The current estimate runs rth's stack var binpacking algorithm, which
> is quadratic in the number of stack vars. And it's not run at least
> twice per function, but more often if inlining happens (I don't know
> why, but I hope Honza's pretty-ipa stuff cleans this up...).
>
> The new calculation is simply the sum of all stack vars ignoring
> packing opportunities. This may sound pessimistic, but actually it
> makes almost no difference, because stack vars packing doesn't happen
> very often, and because a large stack frame will still be large even
> when some stack vars can be packed.
>
> I measured the difference between the old and new estimate for the
> 1782 functions with stack variables in my cc1-i files (out of >10000
> functions). The old and new heuristic differ only for 83 functions,
> and in all cases the new heuristic is less than twice the predicted
> packed stack frame size. In short, don't think this change will have
> any negative effect on the accuracy of this heuristic.
>
> Bootstrapped&tested on ia64-linux.
> I've compared the cc1-i assembly files for ia64 and for x86-64
> with/without patch. They are all the same.
>
> OK for trunk?
Ok.
Thanks.
Richard.
> Gr.
> Steven
>
> PR middle-end/38584
> * cfgexpand.c (estimated_stack_frame_size): Simplify the estimate:
> Calculate the size of all stack vars assuming no packing of stack
> vars will happen, replacing a quadratic algorithm with a linear one.
>
> Index: cfgexpand.c
> ===================================================================
> --- cfgexpand.c (revision 142994)
> +++ cfgexpand.c (working copy)
> @@ -1392,16 +1392,23 @@ fini_vars_expansion (void)
> stack_vars_conflict_alloc = 0;
> }
>
> +/* Make a fair guess for the size of the stack frame of the current
> + function. This doesn't have to be exact, the result is only used
> + in the inline heuristics. So we don't want to run the full stack
> + var packing algorithm (which is quadratic in the number of stack
> + vars). Instead, we calculate the total size of all stack vars.
> + This turns out to be a pretty fair estimate -- packing of stack
> + vars doesn't happen very often. */
> +
> HOST_WIDE_INT
> estimated_stack_frame_size (void)
> {
> HOST_WIDE_INT size = 0;
> + size_t i;
> tree t, outer_block = DECL_INITIAL (current_function_decl);
>
> init_vars_expansion ();
>
> - /* At this point all variables on the local_decls with TREE_USED
> - set are not associated with any block scope. Lay them out. */
> for (t = cfun->local_decls; t; t = TREE_CHAIN (t))
> {
> tree var = TREE_VALUE (t);
> @@ -1411,27 +1418,14 @@ estimated_stack_frame_size (void)
> TREE_USED (var) = 1;
> }
> size += account_used_vars_for_block (outer_block, true);
> +
> if (stack_vars_num > 0)
> {
> - /* Due to the way alias sets work, no variables with non-conflicting
> - alias sets may be assigned the same address. Add conflicts to
> - reflect this. */
> - add_alias_set_conflicts ();
> -
> - /* If stack protection is enabled, we don't share space between
> - vulnerable data and non-vulnerable data. */
> - if (flag_stack_protect)
> - add_stack_protection_conflicts ();
> -
> - /* Now that we have collected all stack variables, and have computed a
> - minimal interference graph, attempt to save some stack space. */
> - partition_stack_vars ();
> - if (dump_file)
> - dump_stack_var_partition ();
> -
> - size += account_stack_vars ();
> + for (i = 0; i < stack_vars_num; ++i)
> + size += stack_vars[i].size;
> fini_vars_expansion ();
> }
> +
> return size;
> }
>
More information about the Gcc-patches
mailing list