Target-specific back-end cleanups

Old back ends often still reflect the state of the compiler and the recommended practice from the time when they were first committed. Some of these practices are now considered obsolete, but old back ends are unfortunately not cleaned up often enough. As a result, back ends may suffer from bit rot, and the progress of GCC may be held up when an old back end has to remain in a working state while nobody is willing or able (this happens for old targets!) to bring it up to date with current recommended practice. A good example of such a back end is the m68k back end, but there are others.

Below is a list of specific cleanup projects to resync old back ends with the currently preferred style of coding back ends. There is also information on which back ends require these cleanups.

Convert text peepholes to RTL peepholes.

GCC has two forms of peephole optimization: the old style that edited the text assembly output as it was being generated, and the new style that transforms RTL to RTL. The new form is conceptually cleaner, allows the second scheduling pass to schedule the peepholes, and requires less gunk in the implementation.
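
To make the difference concrete, here is a minimal C sketch of what an RTL-to-RTL peephole does. It is a toy model and not GCC code: the toy_insn type and the peephole_redundant_move function are invented for this illustration. The point is that the rewrite happens on the insn stream itself, before any assembly text exists, so a later pass such as the scheduler still sees the result.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an RTL move insn: dest <- src (register numbers).
   Hypothetical type, not part of GCC.  */
struct toy_insn { int dest; int src; };

/* Delete the second insn of every "rA <- rB; rB <- rA" pair, since the
   second move stores back a value rB already holds.  Works in place and
   returns the new insn count.  */
static size_t
peephole_redundant_move (struct toy_insn *insns, size_t n)
{
  size_t out = 0;

  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (out > 0
          && insns[out - 1].dest == insns[i].src
          && insns[out - 1].src == insns[i].dest)
        continue;               /* Redundant copy back: drop it.  */
      insns[out++] = insns[i];
    }
  return out;
}
```

Given the sequence r1 <- r2; r2 <- r1; r3 <- r1, the function keeps the first and third moves and returns 2. In a real port the same idea would be written as a peephole pattern in the .md file rather than as C.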

Convert text prologues/epilogues to RTL prologues/epilogues.

As above, this involves defining special patterns that emit the function prologue and epilogue as RTL, instead of using the target macros TARGET_ASM_FUNCTION_PROLOGUE and TARGET_ASM_FUNCTION_EPILOGUE to print them as text.

Convert back-ends to use define_constants

Find magic numbers in .md files and replace them with constants defined with define_constants. A few targets do not use define_constants at all. It is most useful for things like fixed register numbers. Constants defined with it are also visible to C code via the insn-codes.h header.

Convert md files that use CC0 so they don't anymore.

This is hard, but would be a great improvement to the compiler if it were done for all existing targets. The basic idea is that

 (insn ### {cmpsi} (set (cc0) (compare (reg:SI A) (reg:SI B))))
 (insn ### {bgt} (set (pc) (if_then_else
                      (gt (cc0) (const_int 0))
                      (label_ref 23)
                      (pc))))

becomes

 (insn ### {bsicc} (set (pc) (if_then_else
                        (gt:SI (reg:SI A) (reg:SI B))
                        (label_ref 23)
                        (pc))))

Unfortunately, the technique is very poorly documented and may need extending to other conditional operations (setcc, movcc) as well. The cbranch pattern may be your friend.
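
The rewrite shown above can be modelled in a few lines of C. This is only a sketch on a toy insn array, assuming the compare and the branch are adjacent; every toy_ name is invented for the illustration and none of it is GCC's actual RTL API. It fuses a condition-code setter and its user into one compare-and-branch insn that carries the operands, the condition and the label.

```c
#include <assert.h>
#include <stddef.h>

/* Toy insn kinds and conditions; hypothetical, not GCC's RTL codes.  */
enum toy_kind { TOY_OTHER, TOY_COMPARE, TOY_BRANCH_CC, TOY_CBRANCH };
enum toy_cond { TOY_NONE, TOY_EQ, TOY_GT };

struct toy_insn
{
  enum toy_kind kind;
  int op_a, op_b;       /* Compared registers (COMPARE, CBRANCH).  */
  enum toy_cond cond;   /* Condition tested (BRANCH_CC, CBRANCH).  */
  int label;            /* Branch target (BRANCH_CC, CBRANCH).  */
};

/* Fuse each COMPARE that is immediately followed by a BRANCH_CC into a
   single CBRANCH.  Works in place and returns the new insn count.  */
static size_t
merge_compare_and_branch (struct toy_insn *insns, size_t n)
{
  size_t out = 0;

  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      struct toy_insn insn = insns[i];

      if (insn.kind == TOY_COMPARE
          && i + 1 < n
          && insns[i + 1].kind == TOY_BRANCH_CC)
        {
          /* Move the operands into the branch; the CC setter vanishes.  */
          insn.kind = TOY_CBRANCH;
          insn.cond = insns[i + 1].cond;
          insn.label = insns[i + 1].label;
          i++;                  /* Skip the branch just absorbed.  */
        }
      insns[out++] = insn;
    }
  return out;
}
```

A compare that is not directly followed by a branch is left alone, which mirrors the remark that other users of the condition codes (setcc, movcc) need their own treatment.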

Some more thoughts on what is involved and how cc0 could be attacked less aggressively: to eliminate cc0 completely, compare and branch insns must be merged. That leads to wanting operate-and-compare-and-branch type instructions, which may mean branch instructions with output reloads, which in turn means reload changes. Do not go there; it cannot be done by mortals. However, if you have a load-store architecture, there may well be instructions that can store values without affecting the condition codes; with care it may be possible to construct operate-and-cbranch patterns that can handle their own output reloads directly. This has been done in the Thumb port of the ARM back end: see the movsi_cbranchsi4, andsi3_cbranch and bicsi3_cbranch patterns in config/arm/arm.md for examples. The trick used in this case is to ensure that, before the reload pass starts, the predicates for the output operands accept only registers (this is relaxed once reload starts, so that the patterns continue to match); the constraints for the patterns are also arranged so that the register allocator does not see memory as a possible target either. This is sufficient to ensure that only spilled registers need the memory alternatives, and these have memory addresses simple enough to be handled by input reloads alone.

Something that probably *would* be worthwhile is deferring cc0 until after reload:

  1. use compare-and-branch insns before reload (without operate-and-compare);
  2. split them into separate compare and branch insns after reload;
  3. use peephole2 patterns instead of NOTICE_UPDATE_CC for operate-and-compare.

This should be pretty easy, really, except for writing the peephole patterns, and even that would not have to be done in one step. The only question is which pass would eliminate redundant compare instructions. CSE can do this now, but it does not run post-reload, so you would need peepholes. Many peepholes. GCC also has to be very careful about manipulating the insn chain after compare-and-branch insns have been split; fortunately there is not *that* much insn manipulation during late compilation. But peepholes may not be able to do as good a job as the current cc0 support does, and compare-and-branch may not be the only use of cc0 (does anyone know of such a target?). It is not just compare-and-branch that matters; compare-and-store-flag does too. It may also turn out to be cumbersome, with the peephole logic duplicated in every cc0 port.

A macro-like approach, similar to cond_exec, could be used to avoid the pattern explosion. If we had something similar to create peepholes between (cc setter, cc user) pairs, it would become reasonable to convert cc0-using ports to something that does not use cc0 while still keeping them maintainable and generating efficient code.

Here is a possible approach in which macros are used in the MD file readers to avoid the pattern explosion. This approach is intended to minimize changes to the insn patterns in the existing backends, and to generally make it easier to write backends for processors in which most instructions affect the condition codes.

  1. Modify the programs which read the MD files to look for an attribute named cc0 and a constant named CC_REG. The cc0 attribute indicates how the condition codes are affected by the instruction. It might take values like "unchanged" or "clobber", or other values to indicate specific ways in which the condition codes are set. When the MD file reader sees an instruction which changes the condition codes, it automatically generates two versions of the instruction: one which adds (clobber (reg:CC CC_REG)) to the instruction, and one which adds a set of CC_REG to an unspec. (The second version might set CC_REG to a real value based on the setting of the cc0 attribute.)

  2. Modify the programs which read the MD files to look for instructions which set cc0 and instructions which use cc0. For each such instruction:

    1. Rewrite the instruction to change cc0 to (reg:CC CC_REG).

    2. Rewrite the condition so that the instruction is only recognized before the cc0 collapse pass and after reload.

  3. For each pair of instructions in which one sets cc0 and the other uses cc0:

    1. Define a new instruction which combines the cc0 set and the cc0 use into a single combined instruction which does not refer to cc0 or CC_REG at all. This is done by mechanically replacing cc0 in the cc0-using instruction by the value to which cc0 is set in the cc0-setting instruction. (Cases in which cc0 is used more than once, or is not simply set to a value, will most likely require manual intervention.) This combined instruction will only be recognized after (and during) the cc0 collapse pass and before (and during) reload.

    2. Define a splitter for the new combined instruction, to be run after reload, which splits the combined instruction into the two original instructions.

  4. Write the cc0 collapse pass. This pass is run immediately after RTL expansion. It walks the instruction stream looking for instructions which set and use CC_REG. At this point the instructions will always be adjacent. Each such pair of instructions is combined into the combined instruction generated above.

  5. For each target which uses cc0:

    1. Define the cc0 attribute for each instruction. The simplest approach will be to set it to "clobber" by default, and to set it to "unchanged" specifically for instructions which do not affect the condition codes.

    2. For instructions in which some alternatives affect the condition codes and some do not, split the instruction after reload into one instruction which affects the condition codes and one instruction which does not. Or just write different patterns which are only recognized after reload.

    3. Define CC_REG as a new register which accepts CCmode values, and update the appropriate target macros and hooks.

    • At this point we have eliminated cc0 for the target. The MD file still refers to it, but all such references are translated to refer to CC_REG in the code which the compiler sees. After reload, the combined instructions are split into instructions which set CC_REG and instructions which use CC_REG. These instructions will be kept as close together as they need to be, because most other instructions will clobber CC_REG. The generated code should be correct. However, the generated code will not be as good, because there will be unnecessary comparison instructions.

  6. Write a new optimization pass enabled on targets which define NOTICE_UPDATE_CC. Probably this pass would be run just before machine dependent reorg, although perhaps there is a better place for it. Walk through the instructions, calling NOTICE_UPDATE_CC on each one. When we find an instruction which sets CC_REG, check the source of the set against the current CC status, just as final_scan_insn does now. If the current CC status is the same, delete the instruction which sets CC_REG and, if necessary, replace the earlier instruction with the variant which sets CC_REG rather than clobbering it.

At this point, the code quality should be approximately the same as when the target used cc0.
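
The final optimization pass in the list above can be sketched as follows. This is a toy model under loud assumptions: it handles straight-line code only (a real pass would have to reset its status at labels), it covers only the simplest case of deleting a compare whose result CC_REG already holds, and it does not model replacing an earlier clobbering instruction with its CC-setting variant. The toy_ names are invented; the three attribute values echo the "unchanged" and "clobber" values suggested for the cc0 attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* How a toy insn affects the condition codes.  Hypothetical names.  */
enum toy_cc { TOY_CC_UNCHANGED, TOY_CC_CLOBBER, TOY_CC_COMPARE };

struct toy_insn
{
  enum toy_cc cc;
  int dest;         /* Register written, or -1 if none.  */
  int op_a, op_b;   /* Registers compared when cc == TOY_CC_COMPARE.  */
};

/* Walk the insns while tracking which comparison CC_REG currently holds,
   and delete any compare that would recompute the same thing.  Works in
   place and returns the new insn count.  */
static size_t
delete_redundant_compares (struct toy_insn *insns, size_t n)
{
  bool valid = false;   /* Does CC_REG hold a known comparison?  */
  int cur_a = -1, cur_b = -1;
  size_t out = 0;

  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      const struct toy_insn insn = insns[i];

      if (insn.cc == TOY_CC_COMPARE)
        {
          if (valid && insn.op_a == cur_a && insn.op_b == cur_b)
            continue;           /* CC_REG is already up to date.  */
          valid = true;
          cur_a = insn.op_a;
          cur_b = insn.op_b;
        }
      else if (insn.cc == TOY_CC_CLOBBER)
        valid = false;

      /* Overwriting a compared register makes the status stale.  */
      if (valid && insn.dest >= 0
          && (insn.dest == cur_a || insn.dest == cur_b))
        valid = false;

      insns[out++] = insn;
    }
  return out;
}
```

The more insns a port can mark "unchanged", the more compares survive long enough to be recognized as redundant, which is why the default of "clobber" is safe but pessimistic.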

Clean up the TARGET_SETUP_INCOMING_VARARGS interface

This function and its interface are set up to handle both <varargs.h> and <stdarg.h>. Since we no longer support the former, we can simplify things a bit. At present, this function is called in the middle of processing the argument list (necessary for varargs), with an additional adjustment to be made if the function actually uses stdarg (which is now always the case). The goal is to remove this additional adjustment from each of the back ends and to simplify the actual function interface.

  1. Move the invocation of assign_parms_setup_varargs out of the middle of the loop in assign_parms. This will cause us to see the results of the FUNCTION_ARG_ADVANCE from the last named argument, which means that the back ends will no longer have to apply that increment manually.

  2. Visit each back end and remove the manual argument increment. Careful! Some back ends have folded the increment into the rest of the address arithmetic.
  3. Revise the interface to be int TARGET_SETUP_INCOMING_STDARG (CUMULATIVE_ARGS *ca), because after the removal of the manual argument increment the mode and type arguments should be unused. The no_rtl argument is already always false. Transform the pretend_arg_size pointer argument into the return value of the function.
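
The effect of the three steps above can be illustrated with a small C model. It is a sketch only: the CUMULATIVE_ARGS stand-in, the toy_ functions and the target (four argument registers of four bytes each, one register per named argument) are all assumptions made for the example, and the old-style function is reduced to the two arguments that matter here.

```c
#include <assert.h>

/* Toy stand-in for GCC's CUMULATIVE_ARGS; the real type is defined by
   each back end.  */
typedef struct { int regs_used; } CUMULATIVE_ARGS;

enum { TOY_ARG_REGS = 4, TOY_WORD_BYTES = 4 };  /* Hypothetical target.  */

/* Old shape: called before the last named argument has been counted, so
   the back end adjusts by hand and reports the size through a pointer.  */
static void
toy_setup_incoming_varargs (CUMULATIVE_ARGS *ca, int *pretend_size)
{
  int used = ca->regs_used + 1;         /* Manual increment.  */

  *pretend_size = used < TOY_ARG_REGS
                  ? (TOY_ARG_REGS - used) * TOY_WORD_BYTES : 0;
}

/* Proposed shape: CA already includes the last named argument, and the
   pretend size is simply the return value.  */
static int
toy_setup_incoming_stdarg (CUMULATIVE_ARGS *ca)
{
  assert (ca->regs_used >= 0);
  return ca->regs_used < TOY_ARG_REGS
         ? (TOY_ARG_REGS - ca->regs_used) * TOY_WORD_BYTES : 0;
}
```

For a function with two named arguments, the old function is called with regs_used equal to 1 and the new one with regs_used equal to 2; both report 8 bytes of pretend arguments, but only the old one needs to know about the pending increment.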

Convert back-ends to use the secondary_reload target hook

Targets that use SECONDARY_RELOAD_CLASS, SECONDARY_INPUT_RELOAD_CLASS and/or SECONDARY_OUTPUT_RELOAD_CLASS should be converted to use the secondary_reload target hook instead, particularly if they also use reload{in,out}<mode> patterns.
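
As a rough illustration of why one hook is tidier than separate input and output macros, the toy C model below folds both directions into a single function keyed on an in_p flag. All toy_ names, the register classes and the target rule are invented for the example; the real hook also receives the mode and a secondary_reload_info pointer, both of which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy register classes and operand kinds; hypothetical.  */
enum toy_reg_class { TOY_NO_REGS, TOY_GENERAL_REGS, TOY_FP_REGS };
enum toy_operand { TOY_OP_REG, TOY_OP_MEM, TOY_OP_CONST };

/* Return the class of intermediate register needed to copy X into
   (in_p) or out of (!in_p) a register of RELOAD_CLASS, or TOY_NO_REGS
   if the copy can be done directly.  Assumed target rule: constants
   cannot be loaded into FP registers directly, and FP registers cannot
   be stored to memory directly.  */
static enum toy_reg_class
toy_secondary_reload (bool in_p, enum toy_operand x,
                      enum toy_reg_class reload_class)
{
  assert (reload_class != TOY_NO_REGS);
  if (reload_class != TOY_FP_REGS)
    return TOY_NO_REGS;
  if (in_p)
    /* X is copied into an FP register.  */
    return x == TOY_OP_CONST ? TOY_GENERAL_REGS : TOY_NO_REGS;
  /* An FP register is copied out to X.  */
  return x == TOY_OP_MEM ? TOY_GENERAL_REGS : TOY_NO_REGS;
}
```

With separate macros, the two halves of this rule would live in two different definitions; here the asymmetry between loads and stores is visible in one place.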

None: general_backend_cleanup (last edited 2011-10-31 16:41:34 by 67)