[google] Patch to support calling multi-versioned functions via new GCC builtin. (issue4440078)
Sriraman Tallam
tmsriram@google.com
Mon May 2 19:12:00 GMT 2011
Hi,
I want to submit this patch to google/main to make sure it is
available for our internal use at Google in order to materialize some
optimization opportunities. Let us continue this dicussion as I make
changes and submit this for review for trunk.
Thanks,
-Sri.
On Mon, May 2, 2011 at 9:41 AM, Xinliang David Li <davidxl@google.com> wrote:
> On Mon, May 2, 2011 at 2:11 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Fri, Apr 29, 2011 at 6:23 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> Here is the background for this feature:
>>>
>>> 1) People relies on function multi-version to explore hw features and
>>> squeeze performance, but there is no standard ways of doing so, either
>>> a) using indirect function calls with function pointers set at program
>>> initialization; b) using manual dispatch at each callsite; b) using
>>> features like IFUNC. The dispatch mechanism needs to be promoted to
>>> the language level and becomes the first class citizen;
>>
>> You are not doing that, you are inventing a new (crude) GCC extension.
>
> To capture the high level semantics and prevent user from lowering the
> dispatch calls into forms compiler can not recognize, language
> extension is the way to go.
>
>>
>
>>> See above. Multi-way dispatch can be added similarly.
>>
>> Not with the specified syntax. So you'd end up with _two_
>> language extensions. That's bad (and unacceptable, IMHO).
>
> This part can be improved.
>
>>
>>>
>>>>
>>>> That's a nice idea, but why not simply process all functions with
>>>> a const attribute and no arguments this way? IMHO
>>>>
>>>> int testsse (void) __attribute__((const));
>>>>
>>>> is as good and avoids the new attribute completely. The pass would
>>>> also benefit other uses of this idiom (giving a way to have global
>>>> dynamic initializers in C). The functions may not be strictly 'const'
>>>> in a way the compiler autodetects this attribute but it presents
>>>> exactly the promises to the compiler that allow this optimization.
>>>>
>>>> Thus, please split this piece out into a separate patch and use
>>>> the const attribute.
>>>
>>>
>>> Sounds good.
>>>
>
>>>
>>> 1) the function selection may happen in a different function;
>>
>> Well, sure. I propose to have the above as lowered form. If people
>> deliberately obfuscate code it's their fault. Your scheme simply
>> makes that obfuscation impossible (or less likely possible). So
>> 1) is not a valid argument.
>
> Lowering too early may defeat the purpose because
>
> 1) the desired optimization may not happen subject to many compiler
> heuristic changes;
> 2) it has other side effects such as wrong estimation of function size
> which may impact inlining
> 3) it limits the lowering into one form which may not be ideal --
> with builtin_dispatch, after hoisting optimization, the lowering can
> use more efficient IFUNC scheme, for instance.
>
>>
>>
>> My point is that such optimization is completely independent of
>> that dispatch thing. The above could as well be a selection to
>> use different input arrays, one causing alias analysis issues
>> and one not.
>>
>> Thus, a __builtin_dispatch centric optimization pass is the wrong
>> way to go.
>
> I agree that many things can implemented in different ways, but a high
> level standard builtin_dispatch mechanism doing hoisting
> interprocedcurally is cleaner and simpler and targeted for function
> multi-versioning -- it does not depend on/rely on later phase's
> heuristic tunings to make the right things to happen. Function MV
> deserves this level of treatment as it will become more and more
> important for some users (e.g., Google).
>
>>
>> Now, with FDO I'd expect the foo is inlined into bar (because foo
>> is deemed hot),
>
> That is a myth -- the truth is that there are other heuristics which
> can prevent this from happening.
>
>> then you only need to deal with loop unswitching,
>> which should be easy to drive from FDO.
>
> Same here -- the loop body may not be well formed/recognized. The loop
> nests may not be perfectly nested, or other unswitching heuristics may
> block it from happening. This is the common problem form many other
> things that get lowered too early. It is cleaner to make the high
> level transformation first in IPA, and let unswitching dealing with
> intra-procedural optimization.
>
> Thanks,
>
> David
>
>>
>> Richard.
>>
>>>
>>>
>>> Thanks,
>>>
>>> David
>>>
>>>
>>>
>>>>
>>>> Note that we already have special FDO support for indirect to direct
>>>> call promotion, so that might work already.
>>>>
>>>> Thus, with all this, __builtin_dispatch would be more like syntactic
>>>> sugar to avoid writing above switch statement yourself. If you restrict
>>>> that sugar to a constant number of candidates it can be replaced by
>>>> a macro (or several ones, specialized for the number of candidates).
>>>>
>>>> Richard.
>>>>
>>>>> For the call graph that looks like this before cloning :
>>>>>
>>>>> func_cold ----> func_hot ----> findOnes ----> IsPopCntSupported ----> popcnt
>>>>> |
>>>>> -----> no_popcnt
>>>>>
>>>>> where popcnt and no_popcnt are the multi-versioned functions, then after cloning :
>>>>>
>>>>> func_cold ---->IsPopCntSupported ----> func_hot_clone0 ----> findOnes_clone0 ----> popcnt
>>>>> |
>>>>> ----> func_hot_clone1 ----> findOnes_clone1 ----> no_popcnt
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Flags : -fclone-hot-version-paths does function unswitching via cloning.
>>>>> --param=num-mversn-clones=<num> allows to specify the number of
>>>>> functions that should be cloned.
>>>>> --param=mversn-clone-depth=<num> allows to specify the length of
>>>>> the call graph path that should be cloned. num = 0 implies only
>>>>> leaf node that contains the __builtin_dispatch statement must be
>>>>> cloned.
>>>>> ============================================================================================
>>>>>
>>>>> Google ref 45947, 45986
>>>>>
>>>>> 2011-04-28 Sriraman Tallam <tmsriram@google.com>
>>>>>
>>>>> * c-family/c-common.c (revision 173122) (handle_version_selector_attribute): New function.
>>>>> (c_common_attribute_table): New attribute "version_selector".
>>>>> * tree-pass.h (revision 173122) (pass_tree_convert_builtin_dispatch): New pass.
>>>>> (pass_ipa_multiversion_dispatch): New pass.
>>>>> * testsuite/gcc.dg/mversn7.c (revision 0): New test case.
>>>>> * testsuite/gcc.dg/mversn4.c (revision 0): New test case.
>>>>> * testsuite/gcc.dg/mversn4.h (revision 0): New test case.
>>>>> * testsuite/gcc.dg/mversn4a.c (revision 0): New test case.
>>>>> * testsuite/gcc.dg/torture/mversn1.c (revision 0): New test case.
>>>>> * testsuite/gcc.dg/mversn2.c (revision 0): New test case.
>>>>> * testsuite/gcc.dg/mversn6.c (revision 0): New test case.
>>>>> * testsuite/gcc.dg/mversn3.c (revision 0): New test case.
>>>>> * testsuite/g++.dg/mversn8.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/mversn10a.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/mversn14a.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/tree-prof/mversn13.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/tree-prof/mversn15.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/tree-prof/mversn15a.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/mversn9.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/mversn10.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/mversn12.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/mversn14.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/mversn16.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/torture/mversn11.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/torture/mversn5.C (revision 0): New test case.
>>>>> * testsuite/g++.dg/torture/mversn5.h (revision 0): New test case.
>>>>> * testsuite/g++.dg/torture/mversn5a.C (revision 0): New test case.
>>>>> * builtin-types.def (revision 173122) (BT_PTR_FN_INT): New pointer type.
>>>>> (BT_FN_INT_PTR_FN_INT_PTR_PTR_VAR): New function type for __builtin_dispatch.
>>>>> * builtins.def (revision 173122) (BUILT_IN_DISPATCH): New builtin to
>>>>> support multi-version calls.
>>>>> * mversn-dispatch.c (revision 0): New file.
>>>>> * timevar.def (revision 173122) (TV_MVERSN_DISPATCH): New time var.
>>>>> * common.opt (revision 173122) (fclone-hot-version-paths): New flag.
>>>>> * Makefile.in (revision 173122) (mversn-dispatch.o): New rule.
>>>>> * passes.c (revision 173122) (init_optimization_passes): Add the new
>>>>> multi-version and dispatch passes to the pass list.
>>>>> * params.def (revision 173122) (PARAM_NUMBER_OF_MVERSN_CLONES): Define.
>>>>> (PARAM_MVERSN_CLONE_CGRAPH_DEPTH): Define.
>>>>> * doc/invoke.texi (revision 173122) (mversn-clone-depth): Document.
>>>>> (num-mversn-clones): Document.
>>>>> (fclone-hot-version-paths): Document.
>>>>
>>>
>>
>
More information about the Gcc-patches
mailing list