This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Jump Bypass Optimization

I apologise for being so cryptic to Jan in my recent posting to
gcc-patches about a new GCSE-based optimization that I'm currently
working on.  Given that the projects/contributing pages recommend
informing other MAINTAINERS of projects under development, I thought
I'd describe my work here and clarify some of my earlier comments.

The issue that I'm trying to address is the suggestion from the
old PROJECTS file "Optimize a sequence of if statements whose
conditions are exclusive".

To quote the GCC projects page, it is possible to optimize

if (x == 1) ...;
if (x == 2) ...;
if (x == 3) ...;


if (x == 1) ...;
else if (x == 2) ...;
else if (x == 3) ...;

provide that x is not altered by the contents of the if statements.

The strategy I'm trying to use to acccomplish this has two parts:
"jump bypassing" and "GCSE pseudo assignment", each described below.

The current constrant propagation pass of GCC is able to convert
conditional jumps into unconditional jumps using the available
expressions calculated using standard PRE.  With this methodology
a conditional jump is optimized if it depends upon a value that
has a known value at the point the jump is evaluated.

Consider the common case of a basic block that starts with a
conditional jump, that has multiple incoming edges.  The current
cprop transformation only applies if the value is known and has
the same value on all incoming branches.  The "jump bypass"
optimization considers the situation where the value is known
over some subset of the incoming edges.  If the condition is
known to be satisfied the edge is redirected to the appropriate


	if (x == 1)
	    a ();
	    y = 2;
	if (y == 2)
	  b ();

In this case, following the assignment "y = 2", GCC generates a
branch or fallthrough (edge) to the y == 2 test.  This can't be
eliminated by classic PRE cprop as y is not known to be 2 with
the edge coming from "x != 1".

This optimization also applies to things like

	while (!done)
	    if (cond)
	      foo ();
	      done = true;

where immediately after "done = true" the edge can be redirected out
of the loop.  Hopefully, if "done" isn't used elsewhere live analysis
will clean up the assignment, effectively turning it into a break.

A benchmark obviously made for this optimization is the following code
taken from the classic Whetstone benchmark.

	if (j == 1) j = 2;
	else        j = 3;
	if (j > 2)  j = 0;
	else        j = 1;
	if (j < 1)  j = 1;
	else	    j = 0;

which jump bypassing (and dead code elimination) can transform into

	if (j == 1) j = 0;
	else        j = 1;

[Jan, this is the well-known benchmark that I was refering to in
my previous email.  Unfortunately, my patch can only apply this
transformation when "-fno-if-conversion" is specified, as ifcvt
applies before GCSE turning the whetstone loop into inpenetrable
straight line code.]

Common subexpression elimination is handled by two separate passes
within GCC, CSE and GCSE.  In theory, the constant propagation
transformations available to both should be the same, but with
different scopes.  CSE should clean-up basic blocks and extended
basic-blocks, whilst GCSE cprop should apply interblock.

One notable difference is the "pseudo-assignments" created by
conditional branches.  In the code "if (x == 2) A; else B;"
CSE is able to assert that x has the value 2 when A is executed, the
condition effectively behaving as an assignment down one edge.

Unfortunately, these "pseudo-assignments" aren't currently handled
by PRE.

Consider the following example:

extern int z1, z2;

void foo(int x, int y)
  if (x == 2)
      z1 = x*384;
      if (y == 1)
	a ();
	b ();
      z2 = x*913;

In this example, the assignment to z1 is optimized by CSE as x is
known to be 2 at this point.  However GCSE doesn't have this information
so the assignment to z2 isn't currently optimized.

It should be simple enough to modify the PRE data flow equations to
allow a virtual "gen" on each edge.  Where I'd like the advice of GCC
optimization experts is the best way to make the "assignment" visible
in gcse.c.  Either a large part of the code in CSE could be shared or
a cleaner but more tricky solution is for CSE to insert suitable
REG_EQUIV notes into the code stream, that could be used by GCSE.

Hopefully, the attentive reader should see that the combination of
these two optimizations would effectively implement if-then-else
conversion as motivated at the beginning of the e-mail.  The added
bonus is that these transformations also optimize numerous other
pieces of code.

I only began to appreciate the potential of these transformations
when I tried an earlier version of my patch on the following code.

	if (len == 3 && !strncmp(ptr,"foo",3)) a ();
        else if(len == 4 && !strncmp(ptr,"barf",4)) b ();
	else if(len == 3 && !strncmp(ptr,"bar",3)) c ();

To my great surprise and delight the failure edge from the comparison
against "foo" jumped directly to the strncmp call for "bar".  For
those who've spent too much time reading GCC source code, you'll
appreciate the potential impact on the huge if-then-else chains with
numerous repeated conditionals that GCC contains (c.f. fold()).

With any luck the above exposition should help expedite the review
process of the sequence of patches I hope to submit over the next
few months.  I'd also appreciate the advice law, rth, jan, dan and
others may have to offer.

Roger Sayle,                         E-mail:
OpenEye Scientific Software,         WWW:
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]