RTL in cfglayout mode
In RTL there are two representations of the CFG that use the same data structures and, for the most part, the same CFG manipulation functions. The first, and oldest, mode is usually just referred to as cfgrtl mode, and the second, newer representation is _cfglayout_ mode. The semantics of the CFG in these two modes are very different.
Some passes in GCC work in cfgrtl mode, and only a few work in cfglayout mode. There are advantages to working in cfglayout mode, so it is desirable that more passes will be converted from cfgrtl mode to work in cfglayout mode. Unfortunately, it is often not obvious how an existing, usually quite old pass can be modified so that it still works in cfglayout mode.
This page gives a historical back ground of the two RTL cfg modes, and describes some of the issues involved when converting a pass to cfglayout mode.
A historical perspective on cfglayout mode
It may be hard to believe for people who have enjoyed a compiler course somewhere in the last two decades, since the publication of the dragon book, but there was a time not so long ago when GCC did not have a representation for a control flow graph of a function. GCC's representation of a function was a simple list of RTL instructions (the insns stream in GCC jargon). GCC had JUMP_INSN instructions with LABEL_REF operands linking to all referenced CODE_LABEL pseudo-instructions in the insns stream. There were functions like follow_jumps in [gccsource:jump.c] to follow jumps to their eventual target label. This worked somewhat like forward edges in a normal control flow graph, but there was no obvious way to go back up in the function to find all places jumping to some CODE_LABEL The insns stream was a list of all instructions, crossing basic block boundaries and including non insns such as constant pools and jump tables between the basic blocks. Many passes in GCC were written when this was the state-of-the-art compiler technology in GCC, and many passes still assume that the instruction stream is a continuous list with explicit control flow information and no conceptual separation of basic blocks containing instructions and a control flow graph to connect the basic blocks. (The best examples of passes like this is [gccsource:regmove.c], others are gone by now.)
GCC first got a real control flow graph when a new scheduler was contributed by IBM Haifa. This so-called Haifa scheduler implemented the CFG as a web on top of the list of instructions (the insns chain). This implementation was later generalized to be used in the rest of the compiler as well, and the representation of the web got changed to have its own data structures, basic_block and edge in a new file [gccsource:basic-block.h]. The insns stream is still a list of all instructions, but a web of basic blocks and edges is built on top of it and the insns stream itself duplicates all control flow information by explicitly encoding all jumps as instructions. For example, in cfgrtl mode, one of the branches of a conditional jump at the end of a basic block Bmustfall through to the next basic block in the list (i.e. B->next_bb must be a target of the conditional jump). This is what we popularly call cfgrtl mode now.
The "web on top of list" model works quite well for most things, as long as you do not have to shuffle basic blocks around too much. Also, since most of GCC was not at all aware of such a thing as a control flow graph, having the CFG on top of the insns-stream made it easier to gradually transition passes to use the CFG. But for compiler algorithms for many modern code transformations ("optimizations") like loop optimizations, tail duplication, and basic block reordering, cfgrtl mode turned out to be inconvenient. Moving basic blocks around meant updating the underlying insns-stream as well, which is sometimes very hard in RTL. Also, operations like cleaning up the CFG and conversion of IF-THEN-ELSE diamonds to straight-line code is often hard when the insns stream must remain valid (for example when the ELSE branch of an IF-THEN-ELSE diamond is "far" from the IF and THEN basic blocks, then the current implementation of RTL if-conversion in [gccsource:ifcvt.c] can not do its job).
To simplify reordering of basic blocks in [gccsource:bb-reorder.c], a new conceptual model for the CFG was developed:
- basic blocks are connected with edges, and only the edges tell where control flow
- leads to. For example, it is no longer necessary that one of the branches of a conditional jump at the end of a basic block must fall through to the next basic
block in the list. Instead, only BRANCH_EDGE and FALLTHRU_EDGE say which of the edges is the branch and which of the edges is the fallthru-edge.
- leads to. For example, it is no longer necessary that one of the branches of a conditional jump at the end of a basic block must fall through to the next basic
unconditional jumps are not represented at all in the insns stream. The edges
- in the CFG tell you where control flow leads you to at the end of a basic block, so it is unnecessary to have unconditional jumps represented.
non insns such as jump tables and BARRIER insns are no longer present in the
insns stream. The function header and footer, i.e. the bits of the insns stream before and after the actual code, are stored separately, too. There still is a list of all insns to make some of the old bits of the compiler happy that have not yet been converted to use the CFG instead. But the insns stream is incomplete and, frankly, should not be relied on.
In this mode, it is very easy to change the order of the basic blocks, because there effectively is no order at all. The only connection between the basic blocks is through the edges of the CFG. This is the RTL cfg mode that we call cfglayout mode, and it is implemented in [gccsource:cfglayout.h] and [gccsource:cfglayout.c].
Advantages of cfglayout mode
The main advantage of cfglayout mode is that it makes CFG manipulations less complicated. The reason for this is that in cfglayout mode it is OK to have more than one block falling through to another block. In other words, a basic block can have more than one fall-through predecessor edge. This is a change from the usual semantics of a fall-through edge in cfgrtl mode:
- In cfgrtl mode, having more than one fall-through predecessor edge per basic block would
- not make sense. The only sensible way of falling through from one basic block to another
is when the target block contains the next insns in the insns stream, because the CFG in cfgrtl mode is a web on top of the insns stream.
- not make sense. The only sensible way of falling through from one basic block to another
- In cfglayout mode, control flow is described by the CFG only, and you can have as many
- fall-through edges as you like up to the point where you decide on the final layout of the CFG (usually in the basic block reordering pass).
So why is this difference significant:
The biggest problem with CFG manipulations in cfgrtl mode is redirect_edge_edge_and_branch Any pass that operates on the CFG and wants to redirect an edge must:
- Check whether the edge is abnormal; if it is, give up.
- Check whether the edge is a fall-through edge; if it is, make the redirected edge
- not a fall-through edge, using a new basic block as a forwarder block if necessary.
- Check if a new basic block had to be created to avoid double fall-through edges,
- and update data structures that need to be aware of such a new basic block (e.g.
profile information, pass-specific data in a queue or attached to bb->aux etc.). This updating is often not trivial and can usually not be hidden in an API.
- and update data structures that need to be aware of such a new basic block (e.g.
Rechain the insns stream to make it and the order of the basic blocks match.
Redirecting an edge is just one situation where you could create basic blocks with more than one fall-through predecessor edge. It also happens when copying basic blocks. To understand how that works, consider the following situation:
BB1
call X (BB1 --fallthru-> BB2, BB1 --abnormal,eh-> BB3)
BB2
<..>
BB3
exception...Assume you want to duplicate BB1. If you make the edge types the same, you end up with a situation where BB2 has two fall-through predecessors edges:
BB1
call X (BB1 --fallthru-> BB2, BB1 --abnormal,eh-> BB3)
BB4
call X (BB4 --fallthru-> BB2, BB4 --abnormal,eh-> BB3)
BB2
<...>
BB3
exception...This situation is obviously wrong, so at least one new block has to be created to fix up this double fall-through edge problem:
BB1
call X (BB1 --fallthru-> BB5, BB1 --abnormal,eh-> BB3)
BB5
goto label_for_bb2 (BB5 ---> BB2)
BB4
call X (BB4 --fallthru-> BB6, BB4 --abnormal,eh-> BB3)
BB6
goto label_for_bb2 (BB6 ---> BB2)
BB2
label_for_bb2:
<...>
BB3
exception...With cfglayout mode, the problem described above disappears, because it is OK to have multiple fall-through edges coming into a basic block. When going out of cfglayout mode, all issues like this are resolved in a single step when the final layout of the CFG is determined.
Another small advantage of cfglayout mode over cfgrtl mode, is that some duplication of information is removed. From an optimization point of view, ordinary (i.e. local, not computed) unconditional jumps do not add any useful information to the intermediate language. In fact, theyduplicateinformation, because the only effect of a jump is to transfer control to another point in the program, and this information is already encoded as edges in the control flow graph. In cfglayout mode, you don't need to have a JUMP_INSN when you can have a fall-through edge.
Typical problems encountered when converting old RTL passes to cfglayout mode
TODO