[ast-optimizer-branch] Goto/Break/Continue/Switch elimination
Zack Weinberg
zack@codesourcery.com
Fri May 3 10:21:00 GMT 2002
On Thu, May 02, 2002 at 08:59:43PM -0600, Roger Sayle wrote:
>
> One possible solution to the potential performance pitfalls is to
> implement a "Break/Continue/Switch induction" pass.
>
> This pass should hopefully transform SIMPLE back into a higher-level
> structured control flow. These techniques are normally used to improve
> code readability in reverse engineering projects, for example making
> the AST pretty printer output much easier to comprehend. For GCC, it
> would allow simplification of control flow for some optimizations,
> whilst reducing the size of a program's representation for others.
Yes, this sounds like a good plan to me.
> From a code generation perspective there are many reasons for keeping
> a switch statement intact all the way to the backend, including to make
> use of the hardware's "casei" and "tablejump" instructions. The current
> scheme already performs switch statement decomposition too early, i.e.
> before the bounds have been modified by VRP. Indeed, rather than
> decompose switches to perfectly balanced binary trees, a backend should
> skew the branches based upon the ratio of branch-taken/branch-not-taken
> timings including prediction hints. [The current scheme of using
> sequential tests for switch statements with very few cases is really
> just a special case of tree skewing!]
And there are some other situations where special optimization of case
statements might be interesting. For instance, when all of the
branches of a switch set the same variable to different values, and
the switch is dense, we could optimize that to a lookup table on the
values, or even a calculation. For a trivial, somewhat contrived
example:
switch (c) {
case 'a':
case 'A': return 10;
case 'b':
case 'B': return 11;
case 'c':
case 'C': return 12;
case 'd':
case 'D': return 13;
case 'e':
case 'E': return 14;
case 'f':
case 'F': return 15;
}
return -1;
could be replaced by
if (c < 0x41 || c > 0x66 || (c > 0x46 && c < 0x61))
return -1;
return (c & 0x07) + 9;
when the target character set is (a superset of) ASCII.
This sort of thing comes up in machine-generated code. Flex scanners,
for instance, often have nothing to do in their actions other than
return token numbers.
zw
More information about the Gcc-patches
mailing list