User account creation filtered due to spam.

Bug 35603 - branch reordering with FDO
Summary: branch reordering with FDO
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.4.0
: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2008-03-16 05:03 UTC by davidxl
Modified: 2008-03-17 06:36 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description davidxl 2008-03-16 05:03:28 UTC
Given code like:

if (C1)
{
B1:
   ..
}
else if (C2)
{
B2:
   ...
}

If edge profile indicates that B2 (Predicate: ^C1 AND C2 ) is hot but B1 (predicate: C1) is rarely executed, and if C1 and C2 are mutually exclusive so that the following holds:

C1 AND C2 = FALSE

==> ^C1 OR ^C2 = TRUE 
==> ^C1 AND C2 = C2

Then evaluation of C2 can be hoisted so that B2's control dependence height is reduced.  After branch sinking and dead branch elimination, we can get:

if (C2)
{
B2:
   ....
}
else if (C1)
{
B1:
   ...
}


//Example:

Compare the runtime with -DORIG and -DREORDER. Then build -DORIG version with profile data, it should have similar runtime ad -DREORDER.


int compute(int ) __attribute__((noinline));
#ifdef ORIG
int compute(int i)
{
     int result = 0;

     if (i == 10)
         result = 20;
     else if (i == 9)
         result = 30;
     else if ( i== 8)
         result = 40;
     else if (i ==  1)
          result = 200;

     else if (i == 2)
           result = 300;
     else if (i == 3)
           result = 400;


     return result;
}
#elif defined (REORDER)
int compute(int i)
{
    int result = 0;

     if (__builtin_expect(i,3) ==  3)
          result = 400;
     else if (i == 2)
           result = 300;
     else if (i == 1)
           result = 200;
     else if (i == 10)
         result = 20;
     else if (i == 9)
         result = 30;
     else if ( i== 8)
         result = 40;

     return result;
}

#endif
int main()
{

   int i;
   long long s = 0;
   for (i = 0; i < 1000000000; i++)
   {

        if (__builtin_expect((i%7777)==0,0))
             s+= compute(1);
        else if (__builtin_expect((i%6666) == 0,0))
             s += compute(2);
        else s += compute(3);
   }


   return (int)s;

}
Comment 1 davidxl 2008-03-16 05:06:47 UTC
(In reply to comment #0)
> Given code like:
> 
> if (C1)
> {
> B1:
>    ..
> }
> else if (C2)
> {
> B2:
>    ...
> }
> 
> If edge profile indicates that B2 (Predicate: ^C1 AND C2 ) is hot but B1
> (predicate: C1) is rarely executed, and if C1 and C2 are mutually exclusive so
> that the following holds:
> 
> C1 AND C2 = FALSE
> 
> ==> ^C1 OR ^C2 = TRUE 
> ==> ^C1 AND C2 = C2
> 
> Then evaluation of C2 can be hoisted so that B2's control dependence height is
> reduced.  After branch sinking and dead branch elimination, we can get:
> 
> if (C2)
> {
> B2:
>    ....
> }
> else if (C1)
> {
> B1:
>    ...
> }
> 
> 
> //Example:
> 
> Compare the runtime with -DORIG and -DREORDER. Then build -DORIG version with
> profile data, it should have similar runtime ad -DREORDER.
> 
> 
> int compute(int ) __attribute__((noinline));
> #ifdef ORIG
> int compute(int i)
> {
>      int result = 0;
> 
>      if (i == 10)
>          result = 20;
>      else if (i == 9)
>          result = 30;
>      else if ( i== 8)
>          result = 40;
>      else if (i ==  1)
>           result = 200;
> 
>      else if (i == 2)
>            result = 300;
>      else if (i == 3)
>            result = 400;
> 
> 
>      return result;
> }
> #elif defined (REORDER)
> int compute(int i)
> {
>     int result = 0;
> 
>      if (__builtin_expect(i,3) ==  3)
>           result = 400;
>      else if (i == 2)
>            result = 300;
>      else if (i == 1)
>            result = 200;
>      else if (i == 10)
>          result = 20;
>      else if (i == 9)
>          result = 30;
>      else if ( i== 8)
>          result = 40;
> 
>      return result;
> }
> 
> #endif
> int main()
> {
> 
>    int i;
>    long long s = 0;
>    for (i = 0; i < 1000000000; i++)
>    {
> 
>         if (__builtin_expect((i%7777)==0,0))
>              s+= compute(1);
>         else if (__builtin_expect((i%6666) == 0,0))
>              s += compute(2);
>         else s += compute(3);
>    }
> 
> 
>    return (int)s;
> 
> }
> 




This optimization should also be extended to reorder CAND, COR expressions

if (C1 || C2 || C3 || C3) <-- the condtion that is most likely true should be first for COR
{

}
Comment 2 Andrew Pinski 2008-03-16 05:08:05 UTC
Hmm, reorder basic blocks already takes into the edge probability.
Comment 3 davidxl 2008-03-17 06:36:58 UTC
(In reply to comment #2)
> Hmm, reorder basic blocks already takes into the edge probability.
> 

Yes, I checked that gcc is doing good job in code layout with FDO for better icache utilization.