This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR37448 (and dups?)


On Fri, 19 Feb 2016, Jan Hubicka wrote:

> > > 
> > > Are you sure?  I thought we compute that up-front ... at least
> > > we make sure we can_inline_edge_p all calls before even trying
> > > to inline all calls - that looks somewhat redundant then if it
> > > can happen that we give up anyway.  Ah, so can_inline_edge_p has
> > > 
> > >   /* Check if caller growth allows the inlining.  */
> > >   else if (!DECL_DISREGARD_INLINE_LIMITS (callee->decl)
> > >            && !disregard_limits
> > >            && !lookup_attribute ("flatten",
> > >                                  DECL_ATTRIBUTES (caller->decl))
> > >            && !caller_growth_limits (e))
> > >     inlinable = false;
> > > 
> > > and we check that from when we do the inlining and upfront from
> 
> > > want_inline_function_to_all_callers_p ... we also give up
> > > if we inline a recursive function it seems.
> Yep, there is early check for recursion and sizes. But you can still hit the
> size limits later. If a function A calls 1000 times function B, each of the
> calls individually may be inlinable, but not all together.
> > > 
> > > > For next stage1 I plan to finish the overahaul to sreals that will let
> > > > me to make update_overall_summary incremental (i.e. O(size_of_inlined_function)
> > > > and not O(size_of_inlined_function+size_of_function_inlined_to)) that will
> > > > also solve these issues.
> > > 
> > > Hopefully.
> > > 
> > > > One can probably do a pesimistic estimate on code size of the function 
> > > > (i.e. add the size of inlined function) and only if that hits the large 
> > > > function growth do the exact calculation. Or we can just decide that it 
> > > > is OK to ignore large function growht in this partiuclar case.
> > > 
> > > Ignoring it is probably not a good idea and will just lead to a different
> > > degenerate case.  As we update the summary afterwards it's probably
> > > ok to do incremental (pessimistic) updates on the size anyway?  In
> > > inline_merge_summary I mean?  Or should I open-code that into
> > > inline_call if ! update_overall_summary?
> > 
> > So like below - I update self-size by the growth estimate which is
> > supposed to match, this should keep the overall function growth
> > limits working, not sure about the stack stuff but that doesn't seem
> > to be updated by update_overall_summaries either.
> 
> Not really, the sizes are also computed in fixed point math, because we 
> estimate probabilities when insn is going to be removed. There are some 
> cases wher estimate_growth is not as precise as the real thing, you can 
> see the #if 0 out asstert in ipa-inline-transform abou tthat. Well, for 
> inline functions called once, I suppose precision of these details is 
> not as important, so we may just go with the patch. I will finish the 
> sreal conversion next stage1 (I got stuck on this patch series on 
> gengtype change to allow sreal inline_summary which I do not think ever 
> got reviewed) and do the incremental update. So if you fell this PR is 
> important, go ahead with the patch.

I installed it on trunk.  It's a goal to have no quadratic algorithms
in -O1 (yeah, out-of-SSA and RA are some known left-overs here).

Richard.

> Honza
> > 
> > This also fixes a similar issue in early inlining where we can then
> > trivially delay update_overall_summaries.  I remember seeing early
> > inline analysis behave similarly quadratic.
> > 
> > It leaves alone recursive and IPA small function inlining as those
> > update overall unit size as well - I first thought about somehow
> > making overall summary update lazy, but that looks quite complicated
> > and IIRC the round-off errors issue was when re-computing the
> > key for the fibheap after doing one inlining, right?
> > 
> > I hope you can get to use sreals early in next stage1 though I wonder
> > how that will avoid all corner-cases.
> > 
> > Bootstrap & regtest running on x86_64-unknown-linux-gnu.  Ok for trunk
> > and branch?
> > 
> > Thanks,
> > Richard.
> > 
> > 2016-02-18  Richard Biener  <rguenther@suse.de>
> > 
> > 	PR ipa/37448
> > 	* ipa-inline-transform.c (inline_call): When not updating
> > 	overall summaries adjust self size by the growth estimate.
> > 	* ipa-inline.c (inline_to_all_callers_1): Add to the callers
> > 	hash-set, do not update overall summaries here.  Renamed from ...
> > 	(inline_to_all_callers): ... this which is now wrapping the
> > 	above and performing delayed overall summary update.
> > 	(early_inline_small_functions): Delay updating of the overall
> > 	summary.
> > 
> > Index: gcc/ipa-inline-transform.c
> > ===================================================================
> > --- gcc/ipa-inline-transform.c	(revision 233498)
> > +++ gcc/ipa-inline-transform.c	(working copy)
> > @@ -300,10 +300,12 @@ inline_call (struct cgraph_edge *e, bool
> >    struct cgraph_edge *curr = e;
> >    struct cgraph_node *callee = e->callee->ultimate_alias_target ();
> >    bool new_edges_found = false;
> > +  int estimated_growth;
> >  
> > +  if (! update_overall_summary)
> > +    estimated_growth = estimate_edge_growth (e);
> >    /* This is used only for assert bellow.  */
> >  #if 0
> > -  int estimated_growth = estimate_edge_growth (e);
> >    bool predicated = inline_edge_summary (e)->predicate != NULL;
> >  #endif
> >  
> > @@ -373,7 +375,13 @@ inline_call (struct cgraph_edge *e, bool
> >      new_edges_found = ipa_propagate_indirect_call_infos (curr, new_edges);
> >    check_speculations (e->callee);
> >    if (update_overall_summary)
> > -   inline_update_overall_summary (to);
> > +    inline_update_overall_summary (to);
> > +  else
> > +    /* Update self size by the estimate so overall function growth limits
> > +       work for further inlining into this function.  Before inlining
> > +       the function we inlined to again we expect the caller to update
> > +       the overall summary.  */
> > +    inline_summaries->get (to)->size += estimated_growth;
> >    new_size = inline_summaries->get (to)->size;
> >  
> >    if (callee->calls_comdat_local)
> > Index: gcc/ipa-inline.c
> > ===================================================================
> > --- gcc/ipa-inline.c	(revision 233498)
> > +++ gcc/ipa-inline.c	(working copy)
> > @@ -2163,7 +2163,8 @@ flatten_function (struct cgraph_node *no
> >     recursion.  */
> >  
> >  static bool
> > -inline_to_all_callers (struct cgraph_node *node, void *data)
> > +inline_to_all_callers_1 (struct cgraph_node *node, void *data,
> > +			 hash_set<cgraph_node *> *callers)
> >  {
> >    int *num_calls = (int *)data;
> >    bool callee_removed = false;
> > @@ -2193,7 +2194,10 @@ inline_to_all_callers (struct cgraph_nod
> >  		   inline_summaries->get (node->callers->caller)->size);
> >  	}
> >  
> > -      inline_call (node->callers, true, NULL, NULL, true, &callee_removed);
> > +      /* Remember which callers we inlined to, delaying updating the
> > +	 overall summary.  */
> > +      callers->add (node->callers->caller);
> > +      inline_call (node->callers, true, NULL, NULL, false, &callee_removed);
> >        if (dump_file)
> >  	fprintf (dump_file,
> >  		 " Inlined into %s which now has %i size\n",
> > @@ -2211,6 +2215,23 @@ inline_to_all_callers (struct cgraph_nod
> >    return false;
> >  }
> >  
> > +/* Wrapper around inline_to_all_callers_1 doing delayed overall summary
> > +   update.  */
> > +
> > +static bool
> > +inline_to_all_callers (struct cgraph_node *node, void *data)
> > +{
> > +  hash_set<cgraph_node *> callers;
> > +  bool res = inline_to_all_callers_1 (node, data, &callers);
> > +  /* Perform the delayed update of the overall summary of all callers
> > +     processed.  This avoids quadratic behavior in the cases where
> > +     we have a lot of calls to the same function.  */
> > +  for (hash_set<cgraph_node *>::iterator i = callers.begin ();
> > +       i != callers.end (); ++i)
> > +    inline_update_overall_summary (*i);
> > +  return res;
> > +}
> > +
> >  /* Output overall time estimate.  */
> >  static void
> >  dump_overall_stats (void)
> > @@ -2590,10 +2611,13 @@ early_inline_small_functions (struct cgr
> >  	fprintf (dump_file, " Inlining %s into %s.\n",
> >  		 xstrdup_for_dump (callee->name ()),
> >  		 xstrdup_for_dump (e->caller->name ()));
> > -      inline_call (e, true, NULL, NULL, true);
> > +      inline_call (e, true, NULL, NULL, false);
> >        inlined = true;
> >      }
> >  
> > +  if (inlined)
> > +    inline_update_overall_summary (node);
> > +
> >    return inlined;
> >  }
> >  
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]