This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Tree SSA If-combine optimization pass in GCC
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Ajit Kumar Agarwal <ajit dot kumar dot agarwal at xilinx dot com>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>, Vinod Kathail <vinodk at xilinx dot com>, Shail Aditya Gupta <shailadi at xilinx dot com>, Vidhumouli Hunsigida <vidhum at xilinx dot com>, Nagaraju Mekala <nmekala at xilinx dot com>
- Date: Wed, 18 Feb 2015 10:58:10 +0100
- Subject: Re: Tree SSA If-combine optimization pass in GCC
- Authentication-results: sourceware.org; auth=none
- References: <020b7d9d890e42ddafb2a22de8291f45 at BN1BFFO11FD045 dot protection dot gbl> <CAFiYyc0qD=8cbaVLOnQ=YoVo6fK1ZwBV8tApuG0xK19f3fpxTw at mail dot gmail dot com> <44456938198d427abd8906455a514e07 at BN1BFFO11FD047 dot protection dot gbl> <CAFiYyc3v++TLRxMTJQjq7-R_8LuVp0NHLBYHbdJ9MC54EQGAKQ at mail dot gmail dot com> <c309342e8a2449e08378f790721cb843 at BN1AFFO11FD037 dot protection dot gbl>
On Tue, Feb 17, 2015 at 5:24 PM, Ajit Kumar Agarwal
<ajit.kumar.agarwal@xilinx.com> wrote:
>
>
> -----Original Message-----
> From: Richard Biener [mailto:richard.guenther@gmail.com]
> Sent: Tuesday, February 17, 2015 5:49 PM
> To: Ajit Kumar Agarwal
> Cc: gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: Re: Tree SSA If-combine optimization pass in GCC
>
> On Tue, Feb 17, 2015 at 11:26 AM, Ajit Kumar Agarwal <ajit.kumar.agarwal@xilinx.com> wrote:
>>
>>
>> -----Original Message-----
>> From: Richard Biener [mailto:richard.guenther@gmail.com]
>> Sent: Tuesday, February 17, 2015 3:42 PM
>> To: Ajit Kumar Agarwal
>> Cc: gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli
>> Hunsigida; Nagaraju Mekala
>> Subject: Re: Tree SSA If-combine optimization pass in GCC
>>
>> On Tue, Feb 17, 2015 at 9:22 AM, Ajit Kumar Agarwal <ajit.kumar.agarwal@xilinx.com> wrote:
>>> Hello All:
>>>
>>> I see there is an if-combining (if-merging) optimization pass that runs on the tree-SSA form of the intermediate representation.
>>> The if-combine pass takes care of merging IF-THEN-ELSE constructs when their
>>> condition expressions are found to be congruent or similar.
>>>
>>> If-combine happens when the two IF-THEN-ELSE constructs are contiguous.
>>> Suppose the IF-THEN-ELSE constructs are not contiguous but far apart, with code in between.
>>> Does if-combine take care of this? It would require
>>> head duplication and tail duplication of the code in between to make the IF-THEN-ELSE constructs contiguous.
>>>
>>> After head and tail duplication of the code in between, the
>>> IF-THEN-ELSE sequence becomes contiguous. Apart from
>>> this, does the tree-ssa-ifcombine pass consider the control flow of the body of the IF-THEN-ELSE? Is there any limitation on the control flow of the body?
>>>
>>> What is the scope of the tree-ssa-ifcombine optimization pass with respect to the above points?
>>>
>>> Thoughts Please?
>>
>>>>if-combine is a simple CFG + condition pattern matcher. It does not perform head/tail duplication. Also, there is no "control flow" in the bodies; control flow is part of the CFG that is matched, so I'm not quite getting your last question.
>>
>> Thanks! My last question was: if there is control flow such as loops
>> inside the IF-THEN-ELSE, which is possible when loop
>> unswitching is performed and the loop body is placed inside the IF-THEN-ELSE, can the two IF-THEN-ELSE constructs be merged when the condition expressions match and the control flow of their bodies matches?
>>
>> There are many cases in SPEC 2006 benchmarks where the IF-combine
>> could be enabled if the if-then-else sequence is made contiguous by performing the head/tail duplication.
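
A minimal, hand-written C sketch of the transformation being asked about: two tests of the same condition separated by unrelated code, and the shape obtained by duplicating that code into both arms so the tests become contiguous and can be merged. The function names and the arithmetic are hypothetical, and this illustrates the proposal only; it is not something the if-combine pass does.

```c
#include <assert.h>

/* Before: the same condition is tested twice, with code in between
   that does not modify the condition.  */
static int
separated (int c, int x)
{
  if (c)
    x += 1;
  else
    x -= 1;
  x *= 3;                       /* intervening code */
  if (c)
    x += 10;
  else
    x -= 10;
  return x;
}

/* After: the intervening statement is duplicated into both arms,
   which leaves a single test of the condition.  */
static int
duplicated (int c, int x)
{
  if (c)
    {
      x += 1;
      x *= 3;
      x += 10;
    }
  else
    {
      x -= 1;
      x *= 3;
      x -= 10;
    }
  return x;
}
```

The rewrite is only valid because the intervening code leaves the tested condition unchanged; the price is one extra copy of that code.
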
>
>>>I'd be curious what those cases look like. Care to file some bugreports with testcases?
>
> This is not a bug; it is a performance optimization opportunity in the h264ref SPEC 2006 benchmark. Here is an example.
>
>
> Var1 = funcptr();
> For(...)
> {
> Code here ....
> For(...)
> {
> Code here ...
> For(...)
> ... code here..
>
> If(*funcptr() == FastPely())
> FastPely(....)
> Else
> (*funcptr)();
>
> There are 16 such IF statements.
>
>
> .... code here
>
> } end for
> Code here
> }//end for
> Code here
> }//end for.
>
> The funcptr has two targets, FastPely() and UMVPely(). After indirect call promotion the target is known to be either FastPely() or UMVPely().
>
> The transformed code after indirect call promotion looks as follows.
>
> Var1 = funcptr();
> For(...)
> {
> Code here ....
> For(...)
> {
> Code here ...
> For(...)
> ... code here..
>
> If(var1 == FastPely())
> FastPely(....)
> Else
> UMVpely();
>
> There are 16 such IF statements.
So literally adjacent

  if (var1 == FastPely)
    FastPely();
  else
    UMVpely();
  if (var1 == FastPely)
    FastPely();
  else
    UMVpely();
  ...

? Like from a manually unrolled loop? It would be interesting to
get a re-rolling facility in GCC. So you'd transform the above to

  if (var1 == FastPely)
    {
      FastPely();
      FastPely();
      ...
    }
  else
    {
      UMVpely();
      UMVpely();
      ....
    }

? Note that jump threading should perform this kind of optimization
already. It would be interesting to have a real testcase that can be
compiled - please open an enhancement bugreport.
Richard.
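
To make the before/after shapes concrete, here is a minimal, hand-written C sketch with two adjacent tests instead of sixteen. The call counters and the function-pointer typedef pel_fn are hypothetical stand-ins for illustration, not code from h264ref, and the merged form is only equivalent because nothing between the tests changes var1.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two call targets; each one just
   counts how often it was called.  */
static int fast_calls, umv_calls;
static void FastPely (void) { fast_calls++; }
static void UMVpely (void) { umv_calls++; }

typedef void (*pel_fn) (void);

/* Before: two adjacent tests of the same condition.  */
static void
adjacent_ifs (pel_fn var1)
{
  if (var1 == FastPely)
    FastPely ();
  else
    UMVpely ();
  if (var1 == FastPely)
    FastPely ();
  else
    UMVpely ();
}

/* After: a single test, with the bodies concatenated in each arm.  */
static void
merged_ifs (pel_fn var1)
{
  if (var1 == FastPely)
    {
      FastPely ();
      FastPely ();
    }
  else
    {
      UMVpely ();
      UMVpely ();
    }
}
```

Both functions make the same sequence of calls for a given var1; the merged form evaluates the comparison once and leaves two direct calls back to back in each arm.
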
>
>
> .... code here
>
> } end for
> Code here
> }//end for
> Code here
> }//end for.
>
> After indirect call promotion, FastPely() or UMVPely() can be inlined, since the target is known to be one of the two and the call becomes a candidate for the inlining heuristics.
> As the transformed code shows, the 16 IF-THEN-ELSE statements can be if-combined (merged) and the calls then inlined.
>
> Also, the code above and below the IF statements can be head duplicated or tail duplicated, which then makes this a candidate for 3-level loop unswitching. It becomes a loop unswitching candidate after the if-combine (merging).
>
> I am planning to implement the above optimizations in GCC, targeting the h264ref SPEC 2006 benchmark. This gives significant gains.
> I have implemented the above optimization in the Open64 compiler, where it gave significant gains.
>
> Thanks & Regards
> Ajit
>
>>>>if-combine was designed to accompany IL-only patterns that get partly
>>>>translated into control flow. Like
>>
>>>>  tem1 = name & bit1;
>>>>  tem2 = name & bit2;
>>>>  tem3 = tem1 | tem2;
>>>>  if (tem3)
>>>>    ...
>>
>>>>vs.
>>
>>>>  tem1 = name & bit1;
>>>>  if (tem1)
>>>>    goto x;
>>>>  else
>>>>    {
>>>>      tem2 = name & bit2;
>>>>      if (tem2)
>>>>        goto x;
>>>>    }
>>
>>>>x:
>>>>  ...
>> Thanks for the examples. This explains the scope of the if-combine optimization pass.
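
The two quoted shapes of the bit test can be written out as compilable C to check that they agree. This is a minimal sketch: the function names are hypothetical, and "goto x" is modelled as returning 1 so that the outcome of each form can be compared.

```c
#include <assert.h>

/* Control-flow form: two separate tests that reach the same
   destination (modelled here as "return 1").  */
static int
test_branchy (unsigned name, unsigned bit1, unsigned bit2)
{
  unsigned tem1 = name & bit1;
  if (tem1)
    return 1;
  else
    {
      unsigned tem2 = name & bit2;
      if (tem2)
        return 1;
    }
  return 0;
}

/* IL-only form: both tests folded into one condition.  */
static int
test_combined (unsigned name, unsigned bit1, unsigned bit2)
{
  unsigned tem1 = name & bit1;
  unsigned tem2 = name & bit2;
  unsigned tem3 = tem1 | tem2;
  return tem3 != 0;
}
```

Since (name & bit1) | (name & bit2) is nonzero exactly when at least one of the two masked values is nonzero, both functions return the same result for every input.
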
>>
>> Thanks & Regards
>> Ajit
>>
>> Richard.
>>
>>> Thanks & Regards
>>> Ajit