[ast-optimizer-branch] Goto/Break/Continue/Switch elimination

Zack Weinberg zack@codesourcery.com
Fri May 3 10:21:00 GMT 2002


On Thu, May 02, 2002 at 08:59:43PM -0600, Roger Sayle wrote:
> 
> One possible solution to the potential performance pitfalls is to
> implement a "Break/Continue/Switch induction" pass.
> 
> This pass should hopefully transform SIMPLE back into a higher-level
> structured control flow.  These techniques are normally used to improve
> code readability in reverse engineering projects, for example making
> the AST pretty printer output much easier to comprehend.  For GCC, it
> would allow simplification of control flow for some optimizations,
> whilst reducing the size of a program's representation for others.

Yes, this sounds like a good plan to me.

> From a code generation perspective there are many reasons for keeping
> a switch statement intact all the way to the backend, including to make
> use of the hardware's "casei" and "tablejump" instructions.  The current
> scheme already performs switch statement decomposition too early, i.e.
> before the bounds have been modified by VRP.  Indeed, rather than
> decompose switches to perfectly balanced binary trees, a backend should
> skew the branches based upon the ratio of branch-taken/branch-not-taken
> timings including prediction hints.  [The current scheme of using
> sequential tests for switch statements with very few cases is really
> just a special case of tree skewing!]

And there are some other situations where special optimization of case
statements might be interesting.  For instance, when all of the
branches of a switch set the same variable to different values, and
the switch is dense, we could optimize that to a lookup table on the
values, or even a calculation.  For a trivial, somewhat contrived
example:

   switch (c) {
     case 'a': 
     case 'A': return 10;
     case 'b': 
     case 'B': return 11;
     case 'c': 
     case 'C': return 12;
     case 'd':
     case 'D': return 13;
     case 'e':
     case 'E': return 14;
     case 'f':
     case 'F': return 15;
  }
  return -1;

could be replaced by

  if (c < 0x41 || c > 0x66 || (c > 0x46 && c < 0x61))
    return -1;
  return (c & 0x07) + 9;

when the target character set is (a superset of) ASCII.  

This sort of thing comes up in machine-generated code.  Flex scanners,
for instance, often have nothing to do in their actions other than
return token numbers.

zw



More information about the Gcc-patches mailing list