This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: API for callgraph and IPA passes for whole program optimization


On 2/17/08 11:31 AM, Jan Hubicka wrote:

The plan would be to update the pass manager first and transition the main IPA
passes (inliner, constant propagation, alias analysis) on mainline.  The more
advanced IPA stuff, such as struct reorg, can go later since it is not a
showstopper for the first incarnation of the whole program optimizer anyway.
On the LTO branch we can do changes related to memory management and overall
pass queue organization.

Sounds reasonable.


 /* For IPA pass only. Analyze the given function body and produce summary
    information.  */
 void analyze_function (struct cgraph_node *);
 /* For IPA pass only. Analyze the given variable initializer and produce
    summary information.  */
 void analyze_variable (struct varpool_node *);
 /* For IPA pass only. Read summary from disk.  */
 void read_function_local_info (struct cgraph_node * /* , whatever parameters
 				needed by LTO implementation */);
 void read_variable_local_info (struct varpool_node * /* , whatever parameters
 				needed by LTO implementation */);
 /* For IPA pass only. Apply changes to given function body.  */
 void modify_function (struct cgraph_node *);
 /* For IPA pass only. Apply changes to given variable body.  */
 void modify_variable (struct varpool_node *);
 /* For IPA pass only. Write summary to disk.  */
 void write_function_local_info (struct cgraph_node * /* , whatever parameters
 				 needed by LTO implementation */);
 void write_variable_local_info (struct varpool_node * /* , whatever parameters
 				 needed by LTO implementation */);

I find 'analyze' for the first stage confusing. We do no analysis there, we just produce summary info. The analysis is actually done by what you call 'read'. How about some variant of:

generate_summary_{function/variable}
analyze_{function/variable}
transform_{function/variable}

?
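
To make the proposed naming concrete, here is a rough sketch of how the three stages could look as a per-function hook table. This is only an illustration: the fields of the toy struct cgraph_node, the struct ipa_function_hooks type, the toy_* hooks and run_toy_pass are all invented for the example and are not existing GCC interfaces. The variable hooks would follow the same shape.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for GCC's struct cgraph_node; the fields are invented
   for this sketch.  */
struct cgraph_node
{
  const char *name;
  int n_insns;   /* Size of the function body.  */
  int summary;   /* Filled in by generate_summary.  */
  int inline_p;  /* Decision made by analyze.  */
};

/* Hypothetical hook table using the three-stage naming.  */
struct ipa_function_hooks
{
  void (*generate_summary) (struct cgraph_node *);
  void (*analyze) (struct cgraph_node *);
  void (*transform) (struct cgraph_node *);
};

/* Stage 1: look at the body and record a summary; decide nothing.  */
void toy_generate_summary (struct cgraph_node *n) { n->summary = n->n_insns; }

/* Stage 2: decide using the summary only, never the body.  */
void toy_analyze (struct cgraph_node *n) { n->inline_p = n->summary < 10; }

/* Stage 3: apply the decision to the body (here we just pretend the
   body disappeared after inlining).  */
void toy_transform (struct cgraph_node *n) { if (n->inline_p) n->n_insns = 0; }

const struct ipa_function_hooks toy_hooks
  = { toy_generate_summary, toy_analyze, toy_transform };

/* Run each stage over every node before starting the next stage.
   Returns the number of nodes the pass decided to change.  */
int
run_toy_pass (const struct ipa_function_hooks *h,
	      struct cgraph_node *nodes, size_t count)
{
  size_t i;
  int changed = 0;

  assert (h != NULL && (nodes != NULL || count == 0));
  for (i = 0; i < count; i++)
    h->generate_summary (&nodes[i]);
  for (i = 0; i < count; i++)
    h->analyze (&nodes[i]);
  for (i = 0; i < count; i++)
    {
      h->transform (&nodes[i]);
      changed += nodes[i].inline_p;
    }
  return changed;
}
```

The point of the split is that only the first and last stages touch function bodies; the middle stage sees summaries only, which is what allows it to run without the bodies in memory.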

For implementing stage C of the whopr document (i.e. being able to produce
.o files with decisions from global optimization in them), we would also need
two extra hooks for reading and writing ipa_optimization_info, but I would
leave this out.

Note that besides these hooks we will also need the central driver for whole-program analysis. My thinking is that this driver will be part of the IPA manager itself. We may also want to write the optimization plan to a central file, instead of replicating it on every .o file.



I would propose doing this change along with killing the RTL dump letter fields, since the most annoying part of it is actually updating the initializers of all GCC passes by hand.

   BTW, instead of adding 8 NULL fields to each initializer, what about adding
   simple macros like
     IPA_PASS (analyze_fun, analyze_var, write_fun, write_var, read_fun,
               read_var, execute, modify_fun, modify_var)
     LOCAL_PASS (execute_fun)
     RTL_PASS (execute_fun)
   so we don't have to go over it again?

Sounds good.
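
For illustration, the macro idea could look roughly like the following. The struct pass_desc layout, the pass_is_ipa helper and the toy passes are assumptions made for this sketch; the real pass structure has more fields and a different layout. RTL_PASS is omitted because in this toy layout it would expand the same way as LOCAL_PASS.

```c
#include <assert.h>
#include <stddef.h>

struct cgraph_node;
struct varpool_node;

/* Hypothetical pass descriptor holding only the hooks under discussion.  */
struct pass_desc
{
  const char *name;
  void (*analyze_fun) (struct cgraph_node *);
  void (*analyze_var) (struct varpool_node *);
  void (*write_fun) (struct cgraph_node *);
  void (*write_var) (struct varpool_node *);
  void (*read_fun) (struct cgraph_node *);
  void (*read_var) (struct varpool_node *);
  unsigned int (*execute) (void);
  void (*modify_fun) (struct cgraph_node *);
  void (*modify_var) (struct varpool_node *);
};

/* The macros expand to the hook part of an initializer, so that a new
   hook only requires touching the macros, not every pass.  */
#define IPA_PASS(afun, avar, wfun, wvar, rfun, rvar, exec, mfun, mvar) \
  afun, avar, wfun, wvar, rfun, rvar, exec, mfun, mvar
#define LOCAL_PASS(exec) \
  NULL, NULL, NULL, NULL, NULL, NULL, exec, NULL, NULL

/* Toy hooks that do nothing.  */
void toy_inline_analyze (struct cgraph_node *n) { (void) n; }
void toy_inline_modify (struct cgraph_node *n) { (void) n; }
unsigned int toy_inline_execute (void) { return 0; }
unsigned int toy_dce_execute (void) { return 0; }

const struct pass_desc pass_toy_inline
  = { "toy_inline",
      IPA_PASS (toy_inline_analyze, NULL, NULL, NULL, NULL, NULL,
		toy_inline_execute, toy_inline_modify, NULL) };

const struct pass_desc pass_toy_dce
  = { "toy_dce", LOCAL_PASS (toy_dce_execute) };

/* A pass counts as an IPA pass here if it has any analyze or modify hook.  */
int
pass_is_ipa (const struct pass_desc *p)
{
  assert (p != NULL);
  return (p->analyze_fun != NULL || p->analyze_var != NULL
	  || p->modify_fun != NULL || p->modify_var != NULL);
}
```

With this shape, local and RTL passes name only their execute function, and the eight extra NULL entries live in one place.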


I would be happy to do the non-macroized change however.

With these extra hooks, the pass manager queue could be organized as follows:
   all_lowering_passes: executed per function as done now.
   all_early_ipa_passes: queue consisting of IPA passes with only the execute
     function set.  Here we will do things like early inlining, early
     optimization and similar passes.
   all_interunit_ipa_passes: IPA passes with the analyze/execute/modify hooks.
     The pass manager will execute them after early_ipa_passes and will call
     all analyze hooks first.  Depending on whether we do LTO or not, it will
     then either call the write hooks and exit, or call execute next and the
     modify hooks last.
   all_late_ipa_passes: If we opt for having a late small IPA optimizer, we can
     put passes here.  Probably not in the initial implementation.
   all_passes: Local optimization passes as we do now, executed in topological
     order.  This can be a subpass of the last pass of
     all_interunit_ipa_passes too.

Yes.
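
As a sketch of the hook ordering described for all_interunit_ipa_passes, something along these lines is what I understand the proposal to mean. The struct ipa_pass type, the trace buffer and the toy hooks are invented for the example, and running each kind of hook over the whole queue before moving to the next kind is my reading of the description, not a settled design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Records the order in which hooks ran, one letter per call.  */
static char trace[64];

static void
note (const char *step)
{
  strncat (trace, step, sizeof trace - strlen (trace) - 1);
}

/* Hypothetical inter-unit IPA pass with the four hooks that matter at
   compile time.  */
struct ipa_pass
{
  void (*analyze) (void);
  void (*write) (void);
  void (*execute) (void);
  void (*modify) (void);
};

static void toy_analyze (void) { note ("A"); }
static void toy_write (void) { note ("W"); }
static void toy_execute (void) { note ("E"); }
static void toy_modify (void) { note ("M"); }

/* A queue with two identical toy passes.  */
const struct ipa_pass toy_queue[2]
  = { { toy_analyze, toy_write, toy_execute, toy_modify },
      { toy_analyze, toy_write, toy_execute, toy_modify } };

/* Call all analyze hooks first.  When compiling for LTO, call the write
   hooks and stop; otherwise call execute next and the modify hooks last.
   Returns the trace of hook calls.  */
const char *
run_interunit_ipa_passes (const struct ipa_pass *queue, size_t n, bool lto_p)
{
  size_t i;

  assert (queue != NULL || n == 0);
  trace[0] = '\0';
  for (i = 0; i < n; i++)
    queue[i].analyze ();
  if (lto_p)
    {
      for (i = 0; i < n; i++)
	queue[i].write ();
      return trace;
    }
  for (i = 0; i < n; i++)
    queue[i].execute ();
  for (i = 0; i < n; i++)
    queue[i].modify ();
  return trace;
}
```

For the two-pass toy queue this yields the trace "AAEEMM" without LTO and "AAWW" with it.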


With LTO, at link time the queue will start with all_interunit_ipa_passes,
calling the read hooks followed by the execute and modify hooks.

Hmm, well. This could even be on two or three separate compilation passes. The first pass calls all the 'generate' hooks (this can be done via make -jN with all the initial .c files), a second pass calls all the analysis hooks (this is done by a single GCC invocation) and the third pass (also done via make -jN) calls all the modify hooks.


We could structure things so that:

$ gcc -flto -O2 *.c

does everything in one invocation. But I would also like to support the model where we operate in separate phases.
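
A toy model of the two ways of driving this might look as follows. Everything here is hypothetical: struct unit, the phase_* functions and the "inline f only if it is called exactly once in the whole program" rule are made up to show where the per-file and whole-program boundaries fall, nothing more.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-file state.  In the separate-phase model the summary
   and the decision would be stored in the .o file or in a central plan
   file between invocations.  */
struct unit
{
  const char *file;
  int calls_to_f;   /* Local fact, visible when compiling this file.  */
  int summary;      /* Written by phase 1.  */
  int inline_f;     /* Decision made in phase 2.  */
  int transformed;  /* Set by phase 3.  */
};

/* Phase 1: per file, independent of all other files, so it could run
   under make -jN.  */
void
phase_generate (struct unit *u)
{
  u->summary = u->calls_to_f;
}

/* Phase 2: a single invocation that sees every summary.  */
void
phase_analyze (struct unit *units, size_t n)
{
  size_t i;
  int total = 0;

  for (i = 0; i < n; i++)
    total += units[i].summary;
  for (i = 0; i < n; i++)
    units[i].inline_f = (total == 1);
}

/* Phase 3: per file again, applying the global decision locally.  */
void
phase_transform (struct unit *u)
{
  u->transformed = u->inline_f && u->summary > 0;
}

/* The one-invocation model: run the three phases back to back.
   Returns the number of files that were changed.  */
int
compile_all_in_one (struct unit *units, size_t n)
{
  size_t i;
  int changed = 0;

  assert (units != NULL || n == 0);
  for (i = 0; i < n; i++)
    phase_generate (&units[i]);
  phase_analyze (units, n);
  for (i = 0; i < n; i++)
    {
      phase_transform (&units[i]);
      changed += units[i].transformed;
    }
  return changed;
}
```

In the separate-phase model the same three functions would be called from three different invocations, with only phase_analyze needing to see all the files at once.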


So the plan is to turn IPCP and other passes from doing real cloning into
same virtual cloning.

Sounds good.


That's about it.  I would welcome all comments, and if it seems useful I can
turn it into a wiki page adding details to the current implementation plan on
the wiki.

Thanks for the detailed plan. Yes, please add it to the whopr wiki. The only aspect that is not too clear to me is what exactly you plan to do in mainline.


One idea would be to do all the basic framework during stage 1 and leave it in mainline. I would suggest doing as much as possible in mainline, so that it's then pulled in by the LTO branch.

Kenny, what do you expect we could pull out from the LTO branch for stage 1? Does it make sense to open a new branch inheriting from LTO for this work?


Thanks. Diego.


