This is the mail archive of the
mailing list for the GCC project.
RE: Function outlining and partial Inlining
- From: Ajit Kumar Agarwal <ajit dot kumar dot agarwal at xilinx dot com>
- To: Jan Hubicka <hubicka at ucw dot cz>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>, Vinod Kathail <vinodk at xilinx dot com>, Shail Aditya Gupta <shailadi at xilinx dot com>, Vidhumouli Hunsigida <vidhum at xilinx dot com>, Nagaraju Mekala <nmekala at xilinx dot com>
- Date: Mon, 16 Feb 2015 13:17:22 +0000
- Subject: RE: Function outlining and partial Inlining
- References: <624d68963e4f4768bbba8bcb2c3f8928 at BL2FFO11FD009 dot protection dot gbl> <20150212170419 dot GB3301 at kam dot mff dot cuni dot cz>
From: Jan Hubicka [mailto:firstname.lastname@example.org]
Sent: Thursday, February 12, 2015 10:34 PM
To: Ajit Kumar Agarwal
Cc: email@example.com; firstname.lastname@example.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Function outlining and partial Inlining
> Hello All:
> Large functions are an important part of high-performance
> applications, and they contribute to performance bottlenecks in many
> respects. Some large hot functions are executed frequently, yet many regions inside them are cold. A large function also blocks inlining because of code size constraints.
> Such cold regions inside large hot functions can be extracted
> into separate functions (function outlining). This breaks the large functions into smaller segments, which allows them to be inlined at the call site or helps partial inlining.
> The LLVM compiler has function
> outlining optimizations based on regions such as basic blocks, superblocks and
> hyperblocks, which are extracted into smaller function segments, enabling partial inlining and function inlining at the call site.
> This optimization is a good case for profile-guided optimization, driven by the profile feedback data available to the compiler.
> Without profile information, the function outlining described above will not be useful.
> We are doing a lot of optimization for polymorphism and also
> indirect call promotion based on profile feedback from the call graph profile.
> Are we doing function outlining in GCC with respect
> to function inlining and partial inlining based on profile feedback data?
> If not, this optimization can be implemented. If it is already implemented in GCC, can I get a pointer to the code and the scope of this function outlining optimization?
>>The outlining pass is called ipa-split. The heuristic used is, however, quite simplistic: it looks for a very specific case where you have a small header of a function containing a conditional and splits after that. It does use profile.
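The shape described above (a small header containing a conditional, followed by the rest of the function) can be sketched by hand in C. The function names are hypothetical and the split is written manually for illustration; this is not the output of ipa-split, and the `noinline` attribute is only there to keep the body separate in the sketch.

```c
#include <stddef.h>

/* Original shape: a cheap guard followed by a larger body. */
int sum_checked(const int *v, size_t n)
{
    if (v == NULL || n == 0)   /* small header: a single conditional */
        return 0;              /* early exit */

    int total = 0;             /* stands in for a large function body */
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Hand-split version: the body moves into its own function. */
static __attribute__((noinline)) int sum_checked_body(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* The remaining header is small, so inlining it into callers is cheap. */
static inline int sum_checked_split(const int *v, size_t n)
{
    if (v == NULL || n == 0)
        return 0;
    return sum_checked_body(v, n);
}
```

Both versions compute the same result; the difference is only in how much code a caller has to absorb when the header is inlined.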
Thanks! I have gone through the code in the file ipa-split.c. It has the infrastructure for function outlining with respect to early-exit routines.
Early exits from a function lead to more cold regions according to the profile data. When the profile data shows that an early return statement and
the region it belongs to are taken, the rest of the function's regions are cold. Early exits lead to single-entry, multiple-exit regions,
where the regions that are not on the early-exit paths are likely to be cold.
>>Any work on improving the heuristics or providing interesting testcases to consider would be welcome.
There are many large applications that form single-entry, multiple-exit regions. The h264ref benchmark in SPEC2006 has many
early-exit routines, such as SubPelBlockMotionSearch, which are candidates for single-entry, multiple-exit regions with break statements instead of returns.
I don't have other examples of early-exit returns, but there could be many examples that have early returns. These are mainly with respect to IF-THEN-ELSE
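A single-entry, multiple-exit region that leaves through `break` rather than `return` can be illustrated with a small search loop. This is a made-up sketch, not code from h264ref; the function `best_candidate` and its parameters are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical search loop: one entry at the top, several exits. */
int best_candidate(const int *cost, size_t n, int good_enough, int budget)
{
    int best = INT_MAX;
    int spent = 0;

    for (size_t i = 0; i < n; i++) {
        if (cost[i] < 0)          /* exit: invalid entry, stop searching */
            break;
        if (cost[i] < best)
            best = cost[i];
        if (best <= good_enough)  /* exit: early success */
            break;
        if (++spent >= budget)    /* exit: search budget exhausted */
            break;
    }
    return best;                  /* also reached when the loop runs to completion */
}
```

With profile data, whichever exit is taken most often determines which parts of the loop body are rarely reached and could be treated as cold.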
>>I think the LLVM pass is doing pretty much the same analysis minus the profile feedback considerations. After splitting, LLVM will inline the header into all callers, while GCC leaves this to the decision of the inliner heuristics, which may just merge the function back into one block.
>>The actual outlining logic is contained in tree-inline.c and also used by OpenMP.
Thanks & Regards
> If not implemented, can I propose to have an optimization like function outlining in GCC?
> Thoughts, please?
> Thanks & Regards