CC0 transition
There are two ways of representing the condition codes in the back-end for an architecture: the old-fashioned, deprecated CC0 representation and the modern MODE_CC representation, which is mandatory for new back-ends. Most back-ends have now been converted to the MODE_CC representation, but a few still use the CC0 representation, as summarized in Status of supported architectures.
CC0 vs MODE_CC representation
The following table summarizes the main differences between the two models:
| | CC0 representation | MODE_CC representation |
|---|---|---|
| Condition Code object | single special cc0_rtx | one or several hard registers |
| Implicit clobbering of Condition Code | Yes | No |
| Separation of CC setters and users | No | Yes |
| Condition Code vs Reload issue | No | Yes |
| Optimization | NOTICE_UPDATE_CC | SELECT_CC_MODE |
Condition Code object
In the new model, condition codes are represented by one or several ordinary hard registers, which can be fixed or subject to register allocation, call-used or call-preserved, etc. Values are represented in one or several machine modes of class MODE_CC, the default one being CCmode.
Implicit clobbering of Condition Code
In the old model, any RTL instruction that doesn't set cc0_rtx implicitly clobbers it, which means that patterns in the MD files don't need explicit CLOBBERs of cc0_rtx. By contrast, there is no implicit clobbering in the new model, so patterns in the MD files for instructions that change the condition codes must have an explicit SET or CLOBBER of a CC register, usually by means of a PARALLEL.
Separation of CC setters and users
In the old model, the RTL subsystem guarantees that instructions setting cc0_rtx and instructions using it are never separated once they are emitted in the RTL stream, i.e. they are always consecutive (modulo inactive RTL instructions such as NOTEs). This guarantee no longer holds in the new model, which means that RTL passes are free to reorder instructions setting or using condition codes, as long as the usual data dependencies expressed in the patterns are preserved.
Condition Code vs Reload issue
The above guarantee provided by the old model is universal and the Reload pass of the register allocator abides by it. Things are fundamentally different in the new model: since there can be any number of instructions between a set and a use of the condition codes, e.g. a compare and a branch, the Reload pass may need to emit a move, a load, a store or even an instruction computing an address between them. This creates the following dichotomy: either the architecture has move, load, store and simple integer arithmetic instructions that do not clobber the condition codes, and Reload is well-behaved (case #1, e.g. SPARC), or it lacks such instructions and Reload is problematic (case #2, e.g. Visium).
Optimization
The compiler can eliminate redundant compare instructions initially emitted in the RTL stream. In the old model, that's an ad-hoc optimization executed during the final RTL pass and parameterized by the NOTICE_UPDATE_CC macro. In the new model, it's either done by the combine pass (case #1 above) or by the post-reload compare elimination pass (case #2 above), both parameterized by the SELECT_CC_MODE macro.
Conversion from CC0 to MODE_CC representation
Preparation
Some planning ahead of the actual conversion is probably required to avoid painting oneself into a corner. Here's a list of the main choices to be made:
- number and properties of the CC hard register(s): the straightforward case is the x86 architecture, where the condition codes are implemented as a single set of flags, so a single fixed hard register is sufficient; a more complex case is SPARC, where integer and floating-point condition codes are separate and the latter can use 4 different slots, so 1 fixed and 4 call-used hard registers are declared in total.
- number of MODE_CC modes: CCmode is always defined but other modes may be required, for example if signed and unsigned integer compares are distinct instructions, as on PowerPC. The mode of a CC register encodes the kind of comparison that was used to set the register; it ought to be the same in the setter and the user instructions, e.g. a compare and a branch, when they are initially emitted in the RTL stream.
- identification of the case for the Condition Code vs Reload issue: this is fundamental since the differences between the two cases are significant. For most architectures, loads and stores don't affect the condition codes; that's also generally true for floating-point move instructions, so only integer move and addition/subtraction instructions really matter. CISC (e.g. x86) and most RISC architectures (e.g. PowerPC and SPARC) have variants of these that clobber the condition codes and variants that don't, but deeply embedded RISC architectures (e.g. Visium) may have only the former.
Implementation
- addition and parameterization of the new CC hard register(s); if some of them are not fixed, addition of operand predicate(s) in predicates.md and optionally constraint(s) in constraints.md.
- optional declaration of additional MODE_CC modes in <arch>-modes.def.
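As a minimal sketch of the optional declaration above, assuming the CCNZ mode name used by the addition examples on this page, the <arch>-modes.def file could contain a single CC_MODE entry. The file name and the choice of CCNZ are assumptions for illustration; a real port declares whatever modes its instructions need.

```c
/* Sketch of <arch>-modes.def.  CCmode is always defined, so only the
   additional modes are listed here.  CCNZ is assumed to mean that only
   the 'N' and 'Z' flags of the CC register are valid.  */
CC_MODE (CCNZ);
```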
case #1:
The Reload pass is well-behaved so the conversion is relatively straightforward, but a prerequisite is to verify and adjust, if need be, the instructions emitted by the default integer move and addition/subtraction patterns: they must not clobber the condition codes.
- replacement in the cbranch and cstore patterns (and other patterns if need be) of cc0_rtx by CC register(s) using appropriate MODE_CC modes: for the former patterns, CCmode is usually sufficient but, for the latter, if the architecture doesn't have real setcc instructions, special modes might be needed.
- same replacement in the instructions explicitly setting or using condition codes, e.g. compare instructions:

    ;; Fully-fledged compare: sets the CC register in CCmode.
    (define_insn "*comparesi"
      [(set (reg:CC CC_REG)
            (compare:CC (match_operand:SI 0 "register_operand" "r")
                        (match_operand:SI 1 "register_operand" "r")))]
      ""
      "cmp %0, %1")

- for every other instruction changing condition codes, addition of an explicit CLOBBER of a CC register by means of a PARALLEL.
- for optimization purposes, optional addition for every instruction just changed above of a twin pattern where the CLOBBER of the CC register in the PARALLEL is replaced by a SET of the same CC register in a specific MODE_CC mode; for historical reasons, the SET needs to be the first element in the PARALLEL. The typical example is an integer addition instruction setting the condition codes:

    ;; Addition whose effect on the flags is ignored.
    (define_insn "*addsi_clobber_flags"
      [(set (match_operand:SI 0 "register_operand" "=r")
            (plus:SI (match_operand:SI 1 "register_operand" "%r")
                     (match_operand:SI 2 "arith_operand" "rI")))
       (clobber (reg:CC CC_REG))]
      ""
      "addcc %0, %1, %2")

    ;; Twin pattern: same addition, but the flags result is used.
    (define_insn "*addsi_set_flags"
      [(set (reg:CCNZ CC_REG)
            (compare:CCNZ (plus:SI (match_operand:SI 1 "register_operand" "%r")
                                   (match_operand:SI 2 "arith_operand" "rI"))
                          (const_int 0)))
       (set (match_operand:SI 0 "register_operand" "=r")
            (plus:SI (match_dup 1) (match_dup 2)))]
      ""
      "addcc %0, %1, %2")

The MODE_CC mode of the register in the CLOBBER doesn't really matter so CCmode is used. By contrast, the mode of the register in the SET is significant and encodes the implicit comparison of the result of the addition with zero done by the instruction. Since the instruction is not a fully-fledged compare instruction, it doesn't set the CC register as a fully-fledged compare instruction would, so it cannot use CCmode but needs another MODE_CC mode (for example, CCNZmode here means that only the 'N' and 'Z' flags of the CC register are valid). Note that the process can be automated by means of the define_subst construct of the MD language.
- for optimization purposes, optional addition of the SELECT_CC_MODE macro for use by the combine pass if additional MODE_CC modes have been defined. It needs to return the appropriate MODE_CC mode given the operands of a comparison; for example, given the above patterns, it needs to return CCNZmode if operand #0 is a PLUS and operand #1 is zero.
- for optimization purposes, optional definition of the TARGET_FIXED_CONDITION_CODE_REGS hook.
- optional adjustment of the instructions using the condition codes, e.g. branches, to the MODE_CC mode of the CC register; different variants of the instruction may be needed depending on the mode.
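The SELECT_CC_MODE rule stated in the list above (CCNZmode if operand #0 is a PLUS and operand #1 is zero, CCmode otherwise) can be modelled in plain C as follows. This is only a sketch under loud assumptions: the enum and struct types are hypothetical stand-ins for GCC's rtx and machine_mode, and the real macro also receives the comparison code, which this rule doesn't look at. A real port would probably want to inspect that code as well, since not every comparison can be decided from the 'N' and 'Z' flags alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's rtx codes and MODE_CC modes; a real
   port would work on rtx objects and return a machine_mode instead.  */
enum sketch_code { SK_REG, SK_CONST_INT, SK_PLUS };
enum sketch_cc_mode { SK_CCmode, SK_CCNZmode };

struct sketch_operand {
  enum sketch_code code;
  long value;               /* only meaningful for SK_CONST_INT */
};

/* Return true if X is the integer constant zero.  */
static bool
sketch_is_zero (const struct sketch_operand *x)
{
  return x->code == SK_CONST_INT && x->value == 0;
}

/* Model of the mode selection for a comparison of X with Y, restricted
   to the single rule described in the text.  */
enum sketch_cc_mode
sketch_select_cc_mode (const struct sketch_operand *x,
                       const struct sketch_operand *y)
{
  assert (x && y);

  /* Flags set as a side effect of an addition compared with zero:
     only the 'N' and 'Z' flags are valid.  */
  if (x->code == SK_PLUS && sketch_is_zero (y))
    return SK_CCNZmode;

  /* Everything else requires a fully-fledged compare.  */
  return SK_CCmode;
}
```

With this model, comparing a PLUS with zero selects the CCNZ stand-in, which is what lets the combine pass match the *addsi_set_flags pattern instead of keeping a separate compare; any other pair of operands falls back to the CCmode stand-in.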
case #2:
The Reload pass is problematic because it can emit instructions clobbering the condition codes anywhere. That's why, until it has completed, there must not be any instruction setting or using the CC register(s) in the RTL stream; only CLOBBERs are permitted at this point.
- rewrite of the cbranch and cstore patterns (and other patterns if need be) so as not to expose the CC register(s) before Reload, and addition of post-reload splitters (splitters enabled only when the reload_completed condition is true) to expose the CC register(s) using appropriate MODE_CC modes: for the former patterns, CCmode is usually sufficient but, for the latter, if the architecture doesn't have real setcc instructions, special modes might be needed:

    (define_expand "cbranchsi4"
      [(set (pc)
            (if_then_else (match_operator 0 "ordered_comparison_operator"
                            [(match_operand:SI 1 "register_operand")
                             (match_operand:SI 2 "register_operand")])
                          (label_ref (match_operand 3 ""))
                          (pc)))]
      "")

    ;; Compare-and-branch kept as one insn until Reload has completed,
    ;; then split into a compare and a branch on the CC register.
    (define_insn_and_split "*cbranchsi4_insn"
      [(set (pc)
            (if_then_else (match_operator 0 "ordered_comparison_operator"
                            [(match_operand:SI 1 "register_operand" "r")
                             (match_operand:SI 2 "register_operand" "r")])
                          (label_ref (match_operand 3 ""))
                          (pc)))]
      ""
      "#"
      "reload_completed"
      [(set (reg:CC CC_REG)
            (compare:CC (match_dup 1) (match_dup 2)))
       (set (pc)
            (if_then_else (match_op_dup 0 [(reg:CC CC_REG) (const_int 0)])
                          (label_ref (match_dup 3))
                          (pc)))]
      "")

- replacement of cc0_rtx by CC register(s) using appropriate MODE_CC modes in the instructions explicitly setting or using condition codes, e.g. compare instructions, and addition of the reload_completed condition:

    ;; Only matched once Reload has completed.
    (define_insn "*comparesi"
      [(set (reg:CC CC_REG)
            (compare:CC (match_operand:SI 0 "register_operand" "r")
                        (match_operand:SI 1 "register_operand" "r")))]
      "reload_completed"
      "cmp %0, %1")

- for every other instruction changing condition codes, addition of an explicit CLOBBER of a CC register by means of a PARALLEL or addition of a post-reload splitter that adds such a CLOBBER after Reload; if most instructions clobber the condition codes, it might be better to choose the latter approach so as to leave more leeway to the RTL optimization passes running before Reload.
- for optimization purposes, optional addition for every instruction just changed above of a twin pattern where the CLOBBER of the CC register in the PARALLEL is replaced by a SET of the same CC register in a specific MODE_CC mode; for historical reasons, the SET needs to be the first element in the PARALLEL. The typical example is an integer addition instruction setting the condition codes:

    ;; Addition whose effect on the flags is ignored.
    (define_insn "*addsi_clobber_flags"
      [(set (match_operand:SI 0 "register_operand" "=r")
            (plus:SI (match_operand:SI 1 "register_operand" "%r")
                     (match_operand:SI 2 "arith_operand" "rI")))
       (clobber (reg:CC CC_REG))]
      "reload_completed"
      "addcc %0, %1, %2")

    ;; Twin pattern: same addition, but the flags result is used.
    (define_insn "*addsi_set_flags"
      [(set (reg:CCNZ CC_REG)
            (compare:CCNZ (plus:SI (match_operand:SI 1 "register_operand" "%r")
                                   (match_operand:SI 2 "arith_operand" "rI"))
                          (const_int 0)))
       (set (match_operand:SI 0 "register_operand" "=r")
            (plus:SI (match_dup 1) (match_dup 2)))]
      "reload_completed"
      "addcc %0, %1, %2")

The MODE_CC mode of the register in the CLOBBER doesn't really matter so CCmode is used. By contrast, the mode of the register in the SET is significant and encodes the implicit comparison of the result of the addition with zero done by the instruction. Since the instruction is not a fully-fledged compare instruction, it doesn't set the CC register as a fully-fledged compare instruction would, so it cannot use CCmode but needs another MODE_CC mode (for example, CCNZmode here means that only the 'N' and 'Z' flags of the CC register are valid). Note that the process can be automated by means of the define_subst construct of the MD language.
- for optimization purposes, optional addition of the SELECT_CC_MODE macro for use by the post-reload compare elimination pass if additional MODE_CC modes have been defined. It needs to return the appropriate MODE_CC mode given the operands of a comparison; for example, given the above patterns, it needs to return CCNZmode if operand #0 is a PLUS and operand #1 is zero.
- for optimization purposes, optional definition of the TARGET_FLAGS_REGNUM hook to activate the post-reload compare elimination pass.
- optional adjustment of the instructions using the condition codes, e.g. branches, to the MODE_CC mode of the CC register; different variants of the instruction may be needed depending on the mode.
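The TARGET_FLAGS_REGNUM hook mentioned in the list above is a plain register number, so defining it is usually a two-line change among the other hook definitions of the port. The fragment below is a sketch that assumes CC_REG is the constant naming the CC hard register, as in the patterns on this page; the file it goes into and the actual constant name depend on the port.

```c
/* Sketch for the target's main source file, next to the other hook
   definitions.  CC_REG is assumed to be the register number constant
   used by the MD patterns; substitute the port's own name.  */
#undef TARGET_FLAGS_REGNUM
#define TARGET_FLAGS_REGNUM CC_REG
```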
Finalization
- comparison of the code quality, typically at -O2, before and after the conversion: there should be only very rare cases of pessimization (for example, a redundant comparison not eliminated) and overall better scheduling of the instructions.
- optional addition or adjustment of the TARGET_MD_ASM_ADJUST hook to automatically add a CLOBBER of the CC register(s) to the user inline asm instructions for the sake of compatibility with the CC0 representation.
If you have any questions, please contact me: Eric Botcazou ebotcazou@adacore.com