[middle-end, patch 0/9] Ipa-prop overhaul and indirect inlining (PR 9079)

Martin Jambor mjambor@suse.cz
Fri Jul 4 21:21:00 GMT 2008


Hi,

I am sending a patch set that reorganizes the structures and
functionality in ipa-prop, which is currently used by ipa-cp.  The set
also implements inlining of indirect calls (PR 9079) while sharing
most of the intra-procedural analysis (summary building) with ipa-cp.
There are nine patches.  Each has a brief summary in its individual
mail, but they can be roughly divided into the following groups:

  1 - Add cgraph hooks for important events
  2 - Move ipa-prop structures to on-the-side arrays

These two reorganize how the data structures used in ipa-prop are
stored.  Specifically, they move them into on-the-side arrays and
manage them through a number of call graph action hooks.  I believe
Kenneth Zadeck is very much interested in getting this in, as this
sort of thing is basically required for ipa-cp LTO.
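
To make the "on-the-side array plus hooks" idea concrete, here is a
minimal sketch in C++.  It is not the cgraph API from the patches: the
names Node, CallGraph, ParamsTable, on_removal and on_duplication are
all hypothetical and exist only for illustration.  The point it shows
is that per-node data lives in a vector indexed by node uid, and stays
consistent because the table registers callbacks for node events.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these names come from GCC.
struct Node { std::size_t uid; };
struct NodeParams { int param_count = 0; };

class CallGraph {
public:
  using RemovalHook = std::function<void(const Node &)>;
  using DuplicationHook = std::function<void(const Node &, const Node &)>;

  void on_removal(RemovalHook hook) { removal_hooks_.push_back(std::move(hook)); }
  void on_duplication(DuplicationHook hook) { duplication_hooks_.push_back(std::move(hook)); }

  Node create() { return Node{next_uid_++}; }

  // Cloning a node notifies every registered duplication hook.
  Node clone(const Node &src) {
    Node dst = create();
    for (const auto &hook : duplication_hooks_) hook(src, dst);
    return dst;
  }

  // Removing a node notifies every registered removal hook.
  void remove(const Node &node) {
    for (const auto &hook : removal_hooks_) hook(node);
  }

private:
  std::size_t next_uid_ = 0;
  std::vector<RemovalHook> removal_hooks_;
  std::vector<DuplicationHook> duplication_hooks_;
};

// Side table: data is stored outside the nodes, indexed by uid.
// The hooks capture 'this', so a table must not be copied or moved.
class ParamsTable {
public:
  explicit ParamsTable(CallGraph &graph) {
    graph.on_removal([this](const Node &n) {
      if (n.uid < data_.size()) data_[n.uid] = NodeParams{};
    });
    graph.on_duplication([this](const Node &src, const Node &dst) {
      NodeParams copy = get(src);  // copy first: get(dst) may resize
      get(dst) = copy;
    });
  }

  NodeParams &get(const Node &n) {
    if (n.uid >= data_.size()) data_.resize(n.uid + 1);
    return data_[n.uid];
  }

private:
  std::vector<NodeParams> data_;
};
```

With this arrangement the graph nodes themselves carry no analysis
data, so a pass can attach and drop its summaries without touching the
node structure.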

  3 - Make ipa-cp analyze only required nodes
  4 - Overhaul of modification analysis
  5 - Rewritten jump functions computation

These three patches change how various intra-procedural parts of ipa-cp
work.   The  latter two  are  more-or-less  complete  rewrites of  the
respective   parts  and  are   now  cleaner   and  also   provide  new
functionality required for inlining calls by member pointers.
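
As an illustration of what "calls by member pointers" means here, the
following is a small made-up C++ example (Counter, invoke and
use_counter are hypothetical names, not taken from any test case in
the patches).  The call through `method` is indirect inside invoke,
but its target is evident at the call site in use_counter.

```cpp
// Hypothetical example; not part of the patch set or its test suite.
struct Counter {
  int value;
  int bump(int by) { value += by; return value; }
};

// The callee of (c.*method)(arg) is unknown when invoke is viewed alone.
static int invoke(Counter &c, int (Counter::*method)(int), int arg) {
  return (c.*method)(arg);
}

// Here the member pointer is the constant &Counter::bump, so once
// invoke is inlined the call target becomes known.
int use_counter() {
  Counter c{10};
  return invoke(c, &Counter::bump, 5);
}
```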

  6 - Add a compiler option to turn on indirect inlining
  7 - Create ipa-prop structures when performing indirect inlining
  8 - Formal parameter use analysis
  9 - Inlining of indirect calls

The  last four  patches implement  the new  functionality  of inlining
indirect calls  if the  target becomes known  as a result  of previous
inlining.  The division into four patches is not necessary but should
help in understanding the changes.
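
A minimal sketch of the situation, with made-up function names
(add_one, apply and caller are hypothetical and not from the patches):
the call through `fn` has no known target inside apply, but it gets
one as soon as apply is inlined into caller.

```cpp
// Hypothetical example of a call that indirect inlining targets.
static int add_one(int x) { return x + 1; }

// Viewed in isolation, the callee of fn(x) is unknown.
static int apply(int (*fn)(int), int x) { return fn(x); }

// After apply is inlined here, fn is known to be add_one, so the
// indirect call can be represented as a direct call graph edge and
// considered by the usual inlining heuristics.
int caller(int x) { return apply(add_one, x); }
```

Whether the newly direct call actually gets inlined is still up to the
existing inlining decisions; the sketch only shows where the new edge
comes from.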


  Testing results
  ---------------

Recently, I have not been testing the patches individually but only as
a whole.  I'm running the last  bootstrap and testing as I write but I
do not expect  the result to be any different  from the previous ones.
That  means that  this patch  introduces one  new  mudflap regression.
However, I  am confident that  is a problem  in mudflap and not  in my
code.  The mudflap violation is triggered when accessing the table of
virtual methods of cout (the C++ standard output stream) and it goes
away when I add -heur-proc-map to its heuristics.

The test that fails is pass41-frag.cxx when compiled with -O3.  I have
not  yet found  out how  to  xfail it  (and I  do welcome  suggestions
regarding this matter) but I guess that is the right thing to do.


  Effects of indirect inlining
  ----------------------------

All this change does is add new call graph edges, which can later be
picked up by the same inlining decision making that we use now.  In
theory, the inliner can thus only make better decisions than it does
now.  I have run a few benchmarks, but so far not very thoroughly, so
I don't feel like posting all of them here yet.

The patch set had the biggest effect on the libstdc++ performance
tests, where 122 indirect calls with known targets were discovered and
122 were inlined (these are not necessarily the same calls, because
inlining can duplicate these edges; during regression testing there
are more inlined indirect edges than discovered ones).  The run-time
data I got for libstdc++ were a bit noisy.  Three of the benchmarks
showed speedups of 13-22% that could not be attributed to noise, and
there were a few others with performance increases just below 10%.  On
the other hand, there is also one 12% slowdown that does not appear to
be caused by noise either.  However, all these numbers are
preliminary; I will have to redo the measurements.

When compiling DLV, 299 indirect edges are discovered and 299 are
inlined, but there is no apparent effect on runtime.  On Tramp3D, 13
are discovered and 13 are inlined, but that has no effect on runtime
either.

I have  not measured  effects on compile  time and  memory consumption
yet.

Perhaps I should reiterate that this feature was requested in PR 9079.

  Missing parts
  -------------

Test cases and documentation are not bundled in this patch set because
they are not ready yet.  Obviously  I promise to provide them before I
eventually commit this.  However, for  various reasons I would like to
trigger some review now.

I have also just found out that I still cannot inline the examples
from PR 3713 (because the constant is not passed as a parameter), but
extending this to cover these cases should not be very difficult.

I will start working on tuplifying the whole thing early next week.

Thank you very much for any comments,

Martin


