[google] pessimize stack accounting during inlining

Richard Guenther richard.guenther@gmail.com
Wed Jun 8 09:11:00 GMT 2011


On Wed, Jun 8, 2011 at 1:36 AM, Xinliang David Li <davidxl@google.com> wrote:
> Ok for google/main.   A good candidate patch for trunk too.

Well, it's still not a hard limit as we can't tell how many spill slots
or extra call argument or return value slots we need.

Richard.

> Thanks,
>
> David
>
> On Tue, Jun 7, 2011 at 4:29 PM, Mark Heffernan <meheff@google.com> wrote:
>> This patch pessimizes stack accounting during inlining.  This enables
>> setting a firm stack size limit (via parameters "large-stack-frame" and
>> "large-stack-frame-growth").  Without this patch the inliner is overly
>> optimistic about potential stack reuse resulting in actual stack frames much
>> larger than the parameterized limits.
>> Internal benchmarks show minor performance differences with non-fdo and
>> lipo, but overall neutral.  Tested/bootstrapped on x86-64.
>> Ok for google-main?
>> Mark
>>
>> 2011-06-07  Mark Heffernan  <meheff@google.com>
>>         * cgraph.h (cgraph_global_info): Remove field.
>>         * ipa-inline.c (cgraph_clone_inlined_nodes): Change
>>         stack frame computation.
>>         (cgraph_check_inline_limits): Ditto.
>>         (compute_inline_parameters): Remove dead initialization.
>>
>> Index: gcc/cgraph.h
>> ===================================================================
>> --- gcc/cgraph.h        (revision 174512)
>> +++ gcc/cgraph.h        (working copy)
>> @@ -136,8 +136,6 @@ struct GTY(()) cgraph_local_info {
>>  struct GTY(()) cgraph_global_info {
>>    /* Estimated stack frame consumption by the function.  */
>>    HOST_WIDE_INT estimated_stack_size;
>> -  /* Expected offset of the stack frame of inlined function.  */
>> -  HOST_WIDE_INT stack_frame_offset;
>>
>>    /* For inline clones this points to the function they will be
>>       inlined into.  */
>> Index: gcc/ipa-inline.c
>> ===================================================================
>> --- gcc/ipa-inline.c    (revision 174512)
>> +++ gcc/ipa-inline.c    (working copy)
>> @@ -229,8 +229,6 @@ void
>>  cgraph_clone_inlined_nodes (struct cgraph_edge *e, bool duplicate,
>>                             bool update_original)
>>  {
>> -  HOST_WIDE_INT peak;
>> -
>>    if (duplicate)
>>      {
>>        /* We may eliminate the need for out-of-line copy to be output.
>> @@ -279,13 +277,13 @@ cgraph_clone_inlined_nodes (struct cgrap
>>      e->callee->global.inlined_to = e->caller->global.inlined_to;
>>    else
>>      e->callee->global.inlined_to = e->caller;
>> -  e->callee->global.stack_frame_offset
>> -    = e->caller->global.stack_frame_offset
>> -      + inline_summary (e->caller)->estimated_self_stack_size;
>> -  peak = e->callee->global.stack_frame_offset
>> -      + inline_summary (e->callee)->estimated_self_stack_size;
>> -  if (e->callee->global.inlined_to->global.estimated_stack_size < peak)
>> -    e->callee->global.inlined_to->global.estimated_stack_size = peak;
>> +
>> +  /* Pessimistically assume no sharing of stack space.  That is, the
>> +     frame size of a function is estimated as the original frame size
>> +     plus the sum of the frame sizes of all inlined callees.  */
>> +  e->callee->global.inlined_to->global.estimated_stack_size +=
>> +    inline_summary (e->callee)->estimated_self_stack_size;
>> +
>>    cgraph_propagate_frequency (e->callee);
>>
>>    /* Recursively clone all bodies.  */
>> @@ -430,8 +428,7 @@ cgraph_check_inline_limits (struct cgrap
>>
>>    stack_size_limit += stack_size_limit * PARAM_VALUE
>> (PARAM_STACK_FRAME_GROWTH) / 100;
>>
>> -  inlined_stack = (to->global.stack_frame_offset
>> -                  + inline_summary (to)->estimated_self_stack_size
>> +  inlined_stack = (to->global.estimated_stack_size
>>                    + what->global.estimated_stack_size);
>>    if (inlined_stack  > stack_size_limit
>>        && inlined_stack > PARAM_VALUE (PARAM_LARGE_STACK_FRAME))
>> @@ -2064,7 +2061,6 @@ compute_inline_parameters (struct cgraph
>>    self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
>>    inline_summary (node)->estimated_self_stack_size = self_stack_size;
>>    node->global.estimated_stack_size = self_stack_size;
>> -  node->global.stack_frame_offset = 0;
>>
>>    /* Can this function be inlined at all?  */
>>    node->local.inlinable = tree_inlinable_function_p (node->decl);
>>
>



More information about the Gcc-patches mailing list