[patch] PR38584, inline heuristics run quadratic stack var packing

Richard Guenther richard.guenther@gmail.com
Sat Jan 3 14:58:00 GMT 2009


On Sat, Jan 3, 2009 at 3:10 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> Hi,
>
> The attached patch simplifies the stack size estimate for the inline heuristics.
>
> The current estimate runs rth's stack var binpacking algorithm, which
> is quadratic in the number of stack vars.  And it's not run at least
> twice per function, but more often if inlining happens (I don't know
> why, but I hope Honza's pretty-ipa stuff cleans this up...).
>
> The new calculation is simply the sum of all stack vars ignoring
> packing opportunities. This may sound pessimistic, but actually it
> makes almost no difference, because stack vars packing doesn't happen
> very often, and because a large stack frame will still be large even
> when some stack vars can be packed.
>
> I measured the difference between the old and new estimate for the
> 1782 functions with stack variables in my cc1-i files (out of >10000
> functions). The old and new heuristic differ only for 83 functions,
> and in all cases the new heuristic is less than twice the predicted
> packed stack frame size.  In short, don't think this change will have
> any negative effect on the accuracy of this heuristic.
>
> Bootstrapped&tested on ia64-linux.
> I've compared the cc1-i assembly files for ia64 and for x86-64
> with/without patch.  They are all the same.
>
> OK for trunk?

Ok.

Thanks.
Richard.

> Gr.
> Steven
>
>        PR middle-end/38584
>        * cfgexpand.c (estimated_stack_frame_size): Simplify the estimate:
>        Calculate the size of all stack vars assuming no packing of stack
>        vars will happen, replacing a quadratic algorithm with a linear one.
>
> Index: cfgexpand.c
> ===================================================================
> --- cfgexpand.c (revision 142994)
> +++ cfgexpand.c (working copy)
> @@ -1392,16 +1392,23 @@ fini_vars_expansion (void)
>   stack_vars_conflict_alloc = 0;
>  }
>
> +/* Make a fair guess for the size of the stack frame of the current
> +   function.  This doesn't have to be exact, the result is only used
> +   in the inline heuristics.  So we don't want to run the full stack
> +   var packing algorithm (which is quadratic in the number of stack
> +   vars).  Instead, we calculate the total size of all stack vars.
> +   This turns out to be a pretty fair estimate -- packing of stack
> +   vars doesn't happen very often.  */
> +
>  HOST_WIDE_INT
>  estimated_stack_frame_size (void)
>  {
>   HOST_WIDE_INT size = 0;
> +  size_t i;
>   tree t, outer_block = DECL_INITIAL (current_function_decl);
>
>   init_vars_expansion ();
>
> -  /* At this point all variables on the local_decls with TREE_USED
> -     set are not associated with any block scope.  Lay them out.  */
>   for (t = cfun->local_decls; t; t = TREE_CHAIN (t))
>     {
>       tree var = TREE_VALUE (t);
> @@ -1411,27 +1418,14 @@ estimated_stack_frame_size (void)
>       TREE_USED (var) = 1;
>     }
>   size += account_used_vars_for_block (outer_block, true);
> +
>   if (stack_vars_num > 0)
>     {
> -      /* Due to the way alias sets work, no variables with non-conflicting
> -        alias sets may be assigned the same address.  Add conflicts to
> -        reflect this.  */
> -      add_alias_set_conflicts ();
> -
> -      /* If stack protection is enabled, we don't share space between
> -        vulnerable data and non-vulnerable data.  */
> -      if (flag_stack_protect)
> -       add_stack_protection_conflicts ();
> -
> -      /* Now that we have collected all stack variables, and have computed a
> -        minimal interference graph, attempt to save some stack space.  */
> -      partition_stack_vars ();
> -      if (dump_file)
> -       dump_stack_var_partition ();
> -
> -      size += account_stack_vars ();
> +      for (i = 0; i < stack_vars_num; ++i)
> +       size += stack_vars[i].size;
>       fini_vars_expansion ();
>     }
> +
>   return size;
>  }
>



More information about the Gcc-patches mailing list