This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Jump Bypass Optimization


> 
> I apologise for being so cryptic to Jan in my recent posting to
> gcc-patches about a new GCSE-based optimization that I'm currently
> working on.  Given that the projects/contributing pages recommend
> informing other MAINTAINERS of projects under development, I thought
> I'd describe my work here and clarify some of my earlier comments.
> 
> The issue that I'm trying to address is the suggestion from the
> old PROJECTS file "Optimize a sequence of if statements whose
> conditions are exclusive".
> 
> To quote the GCC projects page, it is possible to optimize
> 
> if (x == 1) ...;
> if (x == 2) ...;
> if (x == 3) ...;
> 
> into
> 
> if (x == 1) ...;
> else if (x == 2) ...;
> else if (x == 3) ...;
> 
> provide that x is not altered by the contents of the if statements.

Yes.  Partly this can be viewed as slightly generalized jump threading
(cfg-branch already contains patch to jump threading to handle this special
case properly), but using global optimizer to do would be cleaner and cheaper.
It can be also seen as supproblem of value range propagation, but I have
no idea in what stage the patch is.

cfg-branch is also able to handle this partially using superblock formation.
> 
> 
> The strategy I'm trying to use to acccomplish this has two parts:
> "jump bypassing" and "GCSE pseudo assignment", each described below.
> 
> 
> [1] JUMP BYPASSING
> The current constrant propagation pass of GCC is able to convert
> conditional jumps into unconditional jumps using the available
> expressions calculated using standard PRE.  With this methodology
> a conditional jump is optimized if it depends upon a value that
> has a known value at the point the jump is evaluated.
> 
> Consider the common case of a basic block that starts with a
> conditional jump, that has multiple incoming edges.  The current
> cprop transformation only applies if the value is known and has
> the same value on all incoming branches.  The "jump bypass"
> optimization considers the situation where the value is known
> over some subset of the incoming edges.  If the condition is
> known to be satisfied the edge is redirected to the appropriate
> target.

Hmm, how does this compare to spartial constant propagation?
Would this be able to handle other cases like
if (x>5)
  code
if (x<5)
  other code
> 
> Example:
> 
> 	if (x == 1)
> 	  {
> 	    a ();
> 	    y = 2;
> 	  }
> 	if (y == 2)
> 	  b ();
> 
> In this case, following the assignment "y = 2", GCC generates a
> branch or fallthrough (edge) to the y == 2 test.  This can't be
> eliminated by classic PRE cprop as y is not known to be 2 with
> the edge coming from "x != 1".
> 
> This optimization also applies to things like
> 
> 	while (!done)
> 	  {
> 	    if (cond)
> 	      foo ();
> 	    else
> 	      done = true;
> 	  }
> 
> where immediately after "done = true" the edge can be redirected out
> of the loop.  Hopefully, if "done" isn't used elsewhere live analysis
> will clean up the assignment, effectively turning it into a break.
> 
> A benchmark obviously made for this optimization is the following code
> taken from the classic Whetstone benchmark.
> 
> 	if (j == 1) j = 2;
> 	else        j = 3;
> 	if (j > 2)  j = 0;
> 	else        j = 1;
> 	if (j < 1)  j = 1;
> 	else	    j = 0;
> 
> which jump bypassing (and dead code elimination) can transform into
> 
> 	if (j == 1) j = 0;
> 	else        j = 1;
> 
> [Jan, this is the well-known benchmark that I was refering to in
> my previous email.  Unfortunately, my patch can only apply this
> transformation when "-fno-if-conversion" is specified, as ifcvt
> applies before GCSE turning the whetstone loop into inpenetrable
> straight line code.]

Hmm, why we can't CSE that code than into equivalent sequence?
I will check.
> 
> 
> [2] GCSE PSEUDO-ASSIGNMENT
> Common subexpression elimination is handled by two separate passes
> within GCC, CSE and GCSE.  In theory, the constant propagation
> transformations available to both should be the same, but with
> different scopes.  CSE should clean-up basic blocks and extended
> basic-blocks, whilst GCSE cprop should apply interblock.
> 
> One notable difference is the "pseudo-assignments" created by
> conditional branches.  In the code "if (x == 2) A; else B;"
> CSE is able to assert that x has the value 2 when A is executed, the
> condition effectively behaving as an assignment down one edge.
> 
> Unfortunately, these "pseudo-assignments" aren't currently handled
> by PRE.
> 
> Consider the following example:
> 
> extern int z1, z2;
> 
> void foo(int x, int y)
> {
>   if (x == 2)
>     {
>       z1 = x*384;
>       if (y == 1)
> 	a ();
>       else
> 	b ();
>       z2 = x*913;
>     }
> }
> 
> In this example, the assignment to z1 is optimized by CSE as x is
> known to be 2 at this point.  However GCSE doesn't have this information
> so the assignment to z2 isn't currently optimized.
> 
> It should be simple enough to modify the PRE data flow equations to
> allow a virtual "gen" on each edge.  Where I'd like the advice of GCC
> optimization experts is the best way to make the "assignment" visible
> in gcse.c.  Either a large part of the code in CSE could be shared or
> a cleaner but more tricky solution is for CSE to insert suitable
> REG_EQUIV notes into the code stream, that could be used by GCSE.

I believe teaching gcse analyzing code to derrive this itself would be easy
enought.  But what GCSE desperately needs is to make it easier to understand.
I would love to see the gcse.c split into multiple files for all 3
optimizations it does - GCSE/PRE/LM/ code hoisting/store motion and having some
common set of routines, like globalopt for  handling the hash tables and
friends.

I've always attributed my inability to change something in GCSE to the
fact that I understand little to global optimizers, but I read some papers
and books and still the code looks little bit too complex too me.
I was trying to modify it for my code hoisting infrastructure and even that
was quite dificlt.
I will try to push this patch out as one of next patches, even when it is
not quite perfect at the moment.

Honza
> 
> 
> 
> Hopefully, the attentive reader should see that the combination of
> these two optimizations would effectively implement if-then-else
> conversion as motivated at the beginning of the e-mail.  The added
> bonus is that these transformations also optimize numerous other
> pieces of code.
> 
> I only began to appreciate the potential of these transformations
> when I tried an earlier version of my patch on the following code.
> 
> 	if (len == 3 && !strncmp(ptr,"foo",3)) a ();
>         else if(len == 4 && !strncmp(ptr,"barf",4)) b ();
> 	else if(len == 3 && !strncmp(ptr,"bar",3)) c ();
> 
> To my great surprise and delight the failure edge from the comparison
> against "foo" jumped directly to the strncmp call for "bar".  For
> those who've spent too much time reading GCC source code, you'll
> appreciate the potential impact on the huge if-then-else chains with
> numerous repeated conditionals that GCC contains (c.f. fold()).
> 
> 
> With any luck the above exposition should help expedite the review
> process of the sequence of patches I hope to submit over the next
> few months.  I'd also appreciate the advice law, rth, jan, dan and
> others may have to offer.
> 
> Roger
> --
> Roger Sayle,                         E-mail: roger@eyesopen.com
> OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
> Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
> Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]