This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
CFG documentation
- From: Jan Hubicka <jh at suse dot cz>
- To: gcc-patches at gcc dot gnu dot org, rth at cygnus dot com,gcc-pdo at atrey dot karlin dot mff dot cuni dot cz
- Date: Fri, 31 May 2002 12:43:29 +0200
- Subject: CFG documentation
Hi,
this patch adds the documentation we wrote as part of our project into
gccint manual. Hope it is usefull, I am open to all suggestions about
improving it and I am also open to kill parts of it if it appears to
be not needed.
Fri May 31 12:40:57 CEST 2002 Jan Hubicka <jh@suse.cz>
Josef Zlomek <zlomek@matfyz.cz>
* Makefile.in (gccint.info): Add cfg.texi dependency.
* cfg.texi: New file.
* gccint.texi: Include it.
Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/egcs/gcc/Makefile.in,v
retrieving revision 1.882
diff -c -3 -p -r1.882 Makefile.in
*** Makefile.in 28 May 2002 17:32:04 -0000 1.882
--- Makefile.in 31 May 2002 10:40:45 -0000
*************** $(docdir)/gccint.info: $(docdir)/gccint.
*** 2330,2336 ****
$(docdir)/headerdirs.texi $(docdir)/include/funding.texi \
$(docdir)/gnu.texi $(docdir)/include/gpl.texi \
$(docdir)/include/fdl.texi $(docdir)/contrib.texi \
! $(docdir)/languages.texi $(docdir)/sourcebuild.texi
cd $(srcdir) && $(MAKEINFO) $(MAKEINFOFLAGS) -I doc -I doc/include -o doc/gccint.info doc/gccint.texi
$(docdir)/cppinternals.info: $(docdir)/cppinternals.texi
--- 2330,2337 ----
$(docdir)/headerdirs.texi $(docdir)/include/funding.texi \
$(docdir)/gnu.texi $(docdir)/include/gpl.texi \
$(docdir)/include/fdl.texi $(docdir)/contrib.texi \
! $(docdir)/languages.texi $(docdir)/sourcebuild.texi \
! $(docdir)/cfg.texi
cd $(srcdir) && $(MAKEINFO) $(MAKEINFOFLAGS) -I doc -I doc/include -o doc/gccint.info doc/gccint.texi
$(docdir)/cppinternals.info: $(docdir)/cppinternals.texi
Index: doc/cfg.texi
===================================================================
RCS file: doc/cfg.texi
diff -N doc/cfg.texi
*** /dev/null 1 Jan 1970 00:00:00 -0000
--- doc/cfg.texi 31 May 2002 10:40:46 -0000
***************
*** 0 ****
--- 1,356 ----
+ @c Copyright (C) 2001 Free Software Foundation, Inc.
+ @c This is part of the GCC manual.
+ @c For copying conditions, see the file gcc.texi.
+
+ @c ---------------------------------------------------------------------
+ @c Control Flow Graph
+ @c ---------------------------------------------------------------------
+
+ @node CFG
+
+ @chapter Control Flow Graph
+ @cindex CFG, Control Flow Graph
+
+ A control flow graph is a data structure built on top of the intermediate code
+ representation (RTL instruction chain or trees) abstracting the control flow
+ behavior of compiled function. It is an oriented graph where nodes are basic
+ blocks and edges represent possible control flows from one basic block to
+ another.
+
+ @menu
+ * Basic Blocks:: The basic block definition
+ * Edges:: Types of edges and their representation
+ * The Profile:: Profile information maintained
+ * Maintaining CFG up to Date:: How to properly update CFG datastructure
+ * Maintaining Profile up to Date:: How to properly update profile
+ * Liveness Information:: Using and maitaining register liveness
+ @end menu
+
+ @node Basic Blocks
+ @section Basic Blocks
+
+ @cindex basic block
+ @findex basic_block
+ Basic block is a straight-line sequence of code that can be entered only at
+ the beginning and leaved only at the end. Basic blocks are represented
+ using @code{basic_block} data type that contains pointers to the head (first
+ instruction of the basic block) and the end (last instruction) as well as other
+ information maintained about each block.
+
+ @findex NOTE_INSN_BASIC_BLOCK, CODE_LABEL
+ In the RTL function representation, the head pointer always points either to
+ @code{NOTE_INSN_BASIC_BLOCK} or to @code{CODE_LABEL}, if present. Basic block
+ ends by control flow instruction or last instruction before following
+ @code{CODE_LABEL}.
+
+ @findex ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR
+ Special basic blocks @code{ENTRY_BLOCK_PTR} and @code{EXIT_BLOCK_PTR} represent
+ possible entry points and exits from the compiled function.
+
+ @findex next_bb, prev_bb, FOR_EACH_BB
+ Pointers @code{next_bb} and @code{prev_bb} are used to keep doubly linked chain
+ of basic blocks in the same order as the underlying RTL instruction chain has
+ and are updated transparenly by the functions manupulating the CFG. The macro
+ @code{FOR_EACH_BB} can be used to visit all this order.
+
+ @findex BASIC_BLOCK
+ The @code{BASIC_BLOCK} array contains all basic blocks in the unspecified
+ order. Last basic block has index @code{last_basic_block_index} in the insn
+ stream.
+
+ The RTL instruction stream contains not only the ``real'' instructions, but
+ also notes. Notes may or may not appear inside basic block. Any function that
+ moves or duplicates the basic blocks needs to take care of updating of these
+ notes. Many of notes expect that code consists of linear regions making such
+ updates difficult.
+
+ @findex can_fallthru
+ Additionally the jump table vectors are represented as ``instructions'' inside
+ the insn chain. These vectors never appear in the basic block and should be
+ always placed just after table jump instructions referencing them. After
+ removing the table-jump it is difficult to eliminate the code computing address
+ and referencing the vector, so cleaning up the vectors is postponed to liveness
+ analysis and thus the vectors may appear in the insn chain without any purpose.
+ Before any edge is made fall-thru, the existence of such construct in the way
+ needs to be checked by calling @code{can_fallthru} function.
+
+ @node Edges
+ @section Edges
+
+ @cindex edge in the flow graph
+ @findex edge
+ Edges represent possible control flow transfers from the end of basic block to
+ the head of another. Single linked lists of edges to predecessors and successors
+ are maintained for each basic block.
+
+ In common case edges correspond to branches or ``fall-thru'' edges to the following
+ basic block, but there are several other reasons why edge may be created. The
+ type of edge can be obtained via @code{flags} field.
+
+ There are following types:
+
+ @table @emph
+ @item jumps
+
+ Edges corresponding to destinations of jump instructions have no flags
+ defined. These edges are used for unconditional or conditional jumps
+ and table jumps and are most convenient to manipulate with as they
+ may be freely redirected.
+
+ @item fall-thru
+
+ @findex EDGE_FALLTHRU
+ Fall-thru edges are present in case the basic block may continue execution to
+ the following one without branching and do have @code{EDGE_FALLTHRU} flag set.
+
+ @findex force_nonfallthru
+ Unlike other types of edges, these edges must come into following basic block
+ in the insn stream and thus function @code{force_nonfallthru} is available to
+ insert jump in the case that redirection is needed. This may require creation
+ of a new basic block.
+
+ @item exception handling
+
+ @cindex exception handling
+ Exception handling edges represent possible control transfers from trap to the
+ exception handler. The definition of trap varies. In C++ only function calls
+ can throw, but for Java exceptions like division by zero or segmentation fault
+ are defined and thus each instruction possibly throwing this kind of exception
+ needs to be handled as control flow instruction.
+
+ @findex REG_EH_REGION
+ The destination of edge is specified by @code{REG_EH_REGION} note attached to
+ the insn.
+
+ @findex EDGE_EH, EDGE_ABNORMAL, EDGE_ABNORMAL_CALL
+ The flags of such notes set @code{EDGE_EH} and @code{EDGE_ABNORMAL}. In case
+ of the call @code{EDGE_ABNORMAL_CALL} flag is set too.
+
+ When updating the insn stream it is easy to change possibly trapping
+ instruction to non-trapping. Opposite conversion is difficult and should not
+ happen. Predicate @code{may_trap_p} may be used to check whether
+ instruction still may trap or not. The edges can be eliminated via
+ @code{purge_dead_edges} call.
+
+ @item sibling calls
+
+ @cindex sibling call
+ @findex EDGE_ABNORMAL, EDGE_ABNORMAL_CALL
+ Sibling calls terminate the function in a non-standard way and thus an edge
+ to the exit must be present. @code{EDGE_ABNORMAL_CALL} and @code{EDGE_ABNORMAL}
+ are set in such case.
+
+ @item computed jumps
+
+ @cindex computed jump
+ @findex EDGE_ABNORMAL
+ Computed jumps contain edges to all labels in the function referenced from
+ the code. All those edges have @code{EDGE_ABNORMAL} flag set. The edges
+ used to represent computed jumps often cause compile time performance
+ problems, since functions consisting of many taken labels and many computed
+ jumps may have very dense flow graphs, so these edges need to be handled
+ with special care.
+
+ @item nonlocal goto handlers
+
+ @cindex nonlocal goto handler
+ @findex EDGE_ABNORMAL, EDGE_ABNORMAL_CALL
+
+ GCC allows nested functions to return into caller using @code{goto} statement
+ referring to label passed to as an argument. The labels passed to nested
+ functions contain special code to cleanup after function call. Such section of
+ code is referred as ``nonlocal goto receivers''. In the case function
+ contains such nonlocal goto receivers, the edge from the call to label is
+ present having @code{EDGE_ABNORMAL} and @code{EDGE_ABNORMAL_CALL} flags set.
+
+ @item function entry points
+
+ @cindex function entry point, alternate function entry point
+ @findex LABEL_ALTERNATE_NAME
+ By definition, execution of function starts by basic block 0, so there is
+ always an edge from entry block to the first real basic block. The alternate
+ entry points are specified by @code{CODE_LABEL} with
+ @code{LABEL_ALTERNATE_NAME} defined. This feature is currently used for
+ multiple entry point prologues and is limited to post-reload passes only. This
+ can be used by back-ends to emit alternate prologues for functions called from
+ different contexts. In future full support for multiple entry functions
+ defined by Fortran 90 needs to be implemented.
+
+ @item function exits
+
+ In the pre-reload representation function terminates by the last instruction in
+ the insn chain and no explicit return instructions are used. This corresponds
+ to the fall-thru edge into exit block. After reload optimal RTL epilogues
+ are used, that use explicit (conditional) return instructions that are
+ represented by edges with no flags set.
+
+ @end table
+
+ @node The Profile
+ @section The Profile
+
+ @findex Profile representation
+ In many cases compiler must make choice whether to trade speed in one part of
+ code for speed in another, or trade code size for code speed. In such cases it
+ is useful to know information about how often given block executes and
+ that is the purpose for maintaining profile within flow graph.
+
+ GCC allows the profile to be either feedback based or statically estimated.
+
+ The feedback based profile is produced by compiling the program with
+ instrumentation, executing it on the train run and reading the numbers of
+ executions of basic block edges back to the compiler while
+ re-compiling the program to produce final executable. This method provides
+ very accurate information about where program spends most of time on the train
+ run. Whether it matches the average run depends, of course, on the choice of
+ train data set, but several studies has shown, that the behavior of program
+ usually changes just marginally over different data sets.
+
+ @findex predict.def
+ @cindex Static profile estimation
+ When profile feedback is not available, compiler attempts to predict the
+ behavior of each branch in the program using a set of heuristics (see
+ @code{predict.def} for details) and compute estimated frequencies of
+ each basic block by propagating the probabilities over the graph.
+
+ @findex frequency, count, BB_FREQ_BASE
+ Each basic block contains two fields --- @code{frequency} and @code{count}.
+ Frequency is an estimation how often is basic block executed within a
+ function and is represented as integer scaled in the range
+ 0--@code{BB_FREQ_BASE}. Most frequent basic block in function is initially set
+ to @code{BB_FREQ_BASE} and rest of frequencies are scaled according to that.
+ During optimization, the frequency of most frequent basic block can both
+ decrease (for instance by loop unrolling) or grow (for instance by
+ cross-jumping optimization).
+
+ The @code{count} contains number of execution measured during training run and
+ is nonzero only when profile feedback is available. This value is represented
+ as 64bit integer. Most optimization passes can use only the frequencies of
+ basic block, while few passes may want to know exact counts. The frequencies
+ should always match the counts after scaling, however during updating of the
+ profile information numerical error may accumulate into quite large errors.
+
+ @findex REG_BR_PROB_BASE, EDGE_FREQUENCY
+ Similarly each edge contains probability field---an integer in range
+ from 0 to @code{REG_BR_PROB_BASE}. It represents probability of passing control from
+ the end of source basic block to the destination. The probability that control
+ flow arrives via given edge to the destination basic block is called reverse
+ probability and is not directly represented, but it may be easily computed
+ from frequencies of basic blocks. @code{EDGE_FREQUENCY} macro is available to
+ compute how frequently is given edge taken. @code{count} field is present for
+ each edge as well, representing same information as for basic block.
+
+ The basic block frequencies are not represented in the RTL instruction stream,
+ the edge frequencies are represented only for conditional jump via
+ @code{REG_BR_PROB}, since they are used when instructions are output to
+ the assembly file and flow graph is no longer maintained.
+
+ @node Maintaining CFG up to Date
+ @section Maintaining CFG up to Date
+
+ Important task is to keep both control flow graph and profile up-to-date with
+ the instruction stream during optimization passes. Reconstruction of control
+ flow graph after each pass is not an option, as it is too expensive and we
+ lose profile information.
+
+ @findex BLOCK_FOR_INSN
+ At the moment, the basic block boundaries are maintained transparently during
+ emitting instruction, so rarely there is need to move them manually (such as in
+ case someone wants to output instruction outside basic block explicitly). Each
+ instruction has defined @code{BLOCK_FOR_INSN} value that represents pointer to the
+ basic block owning it, so the basic block list may be better viewed as integral
+ part of instruction chain, than structure built on the top of it.
+
+ @findex purge_dead_edges
+ Updating of edges is not transparent and optimization pass is required to do
+ that manually. However only few cases occur in practice. Commonly the
+ optimization pass simplifies the instruction possibly eliminating some edge.
+ This may happen by simplifying the conditional jump into unconditional, but
+ also by simplifying possibly trapping instruction to non-trapping while
+ compiling Java. The pass may call @code{purge_dead_edges} on given basic block
+ to remove unneeded edges, if any.
+
+ @findex redirect_edge_and_branch, redirect_jump
+ Other common scenario is redirection of branch instructions, but this is best
+ modeled as redirection of edges in the control flow graph and thus use of
+ @code{redirect_edge_and_branch} is preferred over more low level functions,
+ such as @code{redirect_jump} that operate on RTL chain only.
+
+ @findex find_sub_basic_blocks, split_block
+ Last common case is inserting control flow instruction into middle of basic
+ block. The @code{find_sub_basic_blocks} may be used to split existing basic
+ block and add necessary edges, or @code{split_block} may be used when
+ instruction in middle of basic block wants to become target of branch
+ instruction.
+
+ @findex commit_insn_to_edge, commit_edge_insertions
+ For global optimizer, a common operation is to split edges in the flow graph
+ and insert instruction to them. This can be easily done via function
+ @code{commit_insn_to_edge} that emits instruction ``to the edge'' caching it
+ for later @code{commit_edge_insertions} call that will care creation of new
+ basic block where needed and flushing the instruction to actual instruction
+ stream.
+
+ While debugging the optimization pass, an @code{verify_flow_info} function may
+ be useful to find bugs in the control flow graph updating code.
+
+ @node Maintaining Profile up to Date
+ @section Maintaining Profile up to Date
+
+ @findex redirect_edge_and_branch
+ More delicate task than updating control flow graph is to update profile. Many
+ of the function to modify flow graph, like @code{redirect_edge_and_branch} do
+ not have enough information to easily update profile, so updating profile is
+ in majority cases left on the caller. Since it is difficult to discover bugs in
+ the profile updating code, as they manifest themselves only by producing worse
+ code and checking profile consistency is not possible, because of numeric
+ error accumulation, special care needs to be taken into this issue.
+
+ @findex REG_BR_PROB_BASE, BB_FREQ_BASE, count
+ It is important to point out, that @code{REG_BR_PROB_BASE} and
+ @code{BB_FREQ_BASE} are both set low enough to be possible to compute second
+ power of any frequency or probability in the flow graph, it is not possible to
+ even square the @code{count} field, as modern CPUs are fast enough to execute
+ $2^32$ operations quickly.
+
+ @node Liveness Information
+ @section Liveness Information
+ @cindex Liveness representation
+
+ Liveness information is useful to determine whether register X is ``live'' at
+ given point of program, that means that it contains important value. This
+ information is used, for instance, during register allocation pass, as the
+ pseudo registers need to be assigned to unique hard register or stack slot only
+ when they are live. The hard registers and stack slots may be freely reused
+ for other values when they are dead.
+
+ @findex REG_DEAD, REG_UNUSED
+ The liveness information is stored partly in the RTL instruction chain and
+ partly in the flow graph. RTL chain stores local information: each instruction
+ may contain @code{REG_DEAD} note representing that value of given register is
+ no longer needed or @code{REG_UNUSED} note representing that the value computed
+ by instruction is never used. The second is useful for instructions computing
+ multiple values at once.
+
+ @findex global_live_at_start, global_live_at_end
+ Each basic block contains bitmaps representing liveness of each register at
+ entry and exit of basic block (@code{global_live_at_start} and
+ @code{global_live_at_end}). @code{flow.c} contains function to compute
+ liveness of each register at any given place in the instruction stream using
+ this information.
+
+ @findex BB_DIRTY, clear_bb_flags, update_life_info_in_dirty_blocks
+ Liveness is expensive to compute and thus it is desirable to keep it up to date
+ during optimization passes. This can be easily accomplished using @code{flags}
+ field of basic block. The functions modifying instruction stream automatically
+ set @code{BB_DIRTY} flag of basic block, so the pass may simply use
+ @code{clear_bb_flags} before doing any modifications and then ask dataflow
+ modulule via function @code{update_life_info_in_dirty_blocks} to get liveness
+ updated.
+
+ This scheme works reliably as long as no control flow graph transformations are
+ done. The task of updating liveness after control flow graph changes is more
+ difficult as normal iterative data flow may produce invalid results or get into
+ cycle when the initial solution is not bellow the desired one. Only simple
+ transformations, like splitting basic blocks or emitting to the edge are safe,
+ as functions to implement them already know how to update liveness locally.
Index: doc/gccint.texi
===================================================================
RCS file: /cvs/gcc/egcs/gcc/doc/gccint.texi,v
retrieving revision 1.4
diff -c -3 -p -r1.4 gccint.texi
*** doc/gccint.texi 23 Jan 2002 17:30:28 -0000 1.4
--- doc/gccint.texi 31 May 2002 10:40:46 -0000
*************** Additional tutorial information is linke
*** 169,174 ****
--- 169,175 ----
* Fragments:: Writing the @file{t-@var{target}} and @file{x-@var{host}} files.
* Collect2:: How @code{collect2} works; how it finds @code{ld}.
* Header Dirs:: Understanding the standard header file directories.
+ * CFG:: The control flow graph implementation.
* Funding:: How to help assure funding for free software.
* GNU Project:: The GNU Project and GNU/Linux.
*************** Additional tutorial information is linke
*** 196,201 ****
--- 197,203 ----
@include fragments.texi
@include collect2.texi
@include headerdirs.texi
+ @include cfg.texi
@include funding.texi
@include gnu.texi