This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Looping pattern docs



Yonks ago I submitted the following documentation on looping patterns.
Is it OK to commit?

Michael.

Index: md.texi
===================================================================
RCS file: /cvs/gcc/egcs/gcc/md.texi,v
retrieving revision 1.52
diff -c -3 -p -r1.52 md.texi
*** md.texi	2000/12/04 18:42:59	1.52
--- md.texi	2000/12/17 00:10:10
*************** See the next chapter for information on 
*** 32,37 ****
--- 32,38 ----
  * Pattern Ordering::    When the order of patterns makes a difference.
  * Dependent Patterns::  Having one pattern may make you need another.
  * Jump Patterns::       Special considerations for patterns for jump insns.
+ * Looping Patterns::    How to define patterns for special looping insns.
  * Insn Canonicalizations::Canonicalization of Instructions
  * Expander Definitions::Generating a sequence of several RTL insns
                            for a standard operation.
*************** table it uses.  Its assembler code norma
*** 2597,2602 ****
--- 2598,2644 ----
  second operand, but you should incorporate it in the RTL pattern so
  that the jump optimizer will not delete the table as unreachable code.
  
+ 
+ @cindex @code{decrement_and_branch_until_zero} instruction pattern
+ @item @samp{decrement_and_branch_until_zero}
+ Conditional branch instruction that decrements a register and
+ jumps if the register is non-zero.  Operand 0 is the register to
+ decrement and test; operand 1 is the label to jump to if the
+ register is non-zero.  @xref{Looping Patterns}.
+ 
+ This optional instruction pattern is only used by the combiner,
+ typically for loops reversed by the loop optimizer when strength
+ reduction is enabled.
+ 
+ @cindex @code{doloop_end} instruction pattern
+ @item @samp{doloop_end}
+ Conditional branch instruction that decrements a register and jumps if
+ the register is non-zero.  This instruction takes five operands: operand
+ 0 is the register to decrement and test; operand 1 is the number of loop
+ iterations as a @code{const_int}, or @code{const0_rtx} if this cannot be
+ determined until run-time; operand 2 is the actual or estimated maximum
+ number of iterations as a @code{const_int}; operand 3 is the number of
+ enclosed loops as a @code{const_int} (an innermost loop has a value of
+ 1); operand 4 is the label to jump to if the register is non-zero.
+ @xref{Looping Patterns}.
+ 
+ This optional instruction pattern should be defined for machines with
+ low-overhead looping instructions as the loop optimizer will try to
+ modify suitable loops to utilize it.  If nested low-overhead looping is
+ not supported, use a @code{define_expand} (@pxref{Expander Definitions})
+ and make the pattern fail if operand 3 is not @code{const1_rtx}.
+ Similarly, if the actual or estimated maximum number of iterations is
+ too large for this instruction, make it fail.
+ 
+ @cindex @code{doloop_begin} instruction pattern
+ @item @samp{doloop_begin}
+ Companion instruction to @code{doloop_end} required for machines that
+ need to perform some initialisation, such as loading special registers
+ used by a low-overhead looping instruction.  If initialisation insns do
+ not always need to be emitted, use a @code{define_expand}
+ (@pxref{Expander Definitions}) and make it fail.
+ 
+ 
  @cindex @code{canonicalize_funcptr_for_compare} instruction pattern
  @item @samp{canonicalize_funcptr_for_compare}
  Canonicalize the function pointer in operand 1 and store the result
*************** discussed above, we have the pattern
*** 3075,3080 ****
--- 3117,3227 ----
  
  The @code{SELECT_CC_MODE} macro on the Sparc returns @code{CC_NOOVmode}
  for comparisons whose argument is a @code{plus}.
+ 
+ @node Looping Patterns
+ @section Defining Looping Instruction Patterns
+ @cindex looping instruction patterns
+ @cindex defining looping instruction patterns
+ 
+ Some machines have special jump instructions that can be utilised to
+ make loops more efficient.  A common example is the 68000 @samp{dbra}
+ instruction, which decrements a register and branches if the result is
+ greater than or equal to zero.  Other machines, in particular digital
+ signal processors (DSPs), have special block repeat instructions to
+ provide low-overhead loop support.  For example, the TI TMS320C3x/C4x
+ DSPs have a block repeat instruction that loads special registers to
+ mark the top and end of a loop and to count the number of loop
+ iterations.  This avoids the need for fetching and executing a
+ @samp{dbra}-like instruction and avoids pipeline stalls associated with
+ the jump.
+ 
+ GNU CC has three special named patterns to support low-overhead looping,
+ @samp{decrement_and_branch_until_zero}, @samp{doloop_begin}, and
+ @samp{doloop_end}.  The first pattern,
+ @samp{decrement_and_branch_until_zero}, is not emitted during RTL
+ generation but may be emitted during the instruction combination phase.
+ This requires the assistance of the loop optimizer, using information
+ collected during strength reduction, to reverse a loop to count down to
+ zero.  Some targets also require the loop optimizer to add a
+ @code{REG_NONNEG} note to indicate that the iteration count is always
+ positive.  This is needed if the target performs a signed loop
+ termination test.  For example, the 68000 uses a pattern similar to the
+ following for its @code{dbra} instruction:
+ 
+ @smallexample
+ @group
+ (define_insn "decrement_and_branch_until_zero"
+   [(set (pc)
+ 	(if_then_else
+ 	  (ge (plus:SI (match_operand:SI 0 "general_operand" "+d*am")
+ 		       (const_int -1))
+ 	      (const_int 0))
+ 	  (label_ref (match_operand 1 "" ""))
+ 	  (pc)))
+    (set (match_dup 0)
+ 	(plus:SI (match_dup 0)
+ 		 (const_int -1)))]
+   "find_reg_note (insn, REG_NONNEG, 0)"
+   "...")
+ @end group
+ @end smallexample
+ 
+ Note that since the insn is both a jump insn and has an output, it must
+ deal with its own reloads, hence the @samp{m} constraints.  Also note
+ that since this insn is generated by the instruction combination phase
+ combining two sequential insns together into an implicit parallel insn,
+ the iteration counter needs to be biased by the same amount as the
+ decrement operation, in this case @minus{}1.  Note that the following
+ similar pattern will not be matched by the combiner.
+ 
+ @smallexample
+ @group
+ (define_insn "decrement_and_branch_until_zero"
+   [(set (pc)
+ 	(if_then_else
+ 	  (ge (match_operand:SI 0 "general_operand" "+d*am")
+ 	      (const_int 1))
+ 	  (label_ref (match_operand 1 "" ""))
+ 	  (pc)))
+    (set (match_dup 0)
+ 	(plus:SI (match_dup 0)
+ 		 (const_int -1)))]
+   "find_reg_note (insn, REG_NONNEG, 0)"
+   "...")
+ @end group
+ @end smallexample
+ 
+ The other two special looping patterns, @samp{doloop_begin} and
+ @samp{doloop_end}, are emitted by the loop optimizer for certain
+ well-behaved loops with a finite number of loop iterations using
+ information collected during strength reduction.
+ 
+ The @samp{doloop_end} pattern describes the actual looping instruction
+ (or the implicit looping operation) and the @samp{doloop_begin} pattern
+ is an optional companion pattern that can be used for initialisation
+ needed for some low-overhead looping instructions.
+ 
+ Note that some machines require the actual looping instruction to be
+ emitted at the top of the loop (e.g., the TMS320C3x/C4x DSPs).  Emitting
+ the true RTL for a looping instruction at the top of the loop can cause
+ problems with flow analysis.  So instead, a dummy @code{doloop} insn is
+ emitted at the end of the loop.  The machine dependent reorg pass checks
+ for the presence of this @code{doloop} insn and then searches back to
+ the top of the loop, where it inserts the true looping insn (provided
+ there are no instructions in the loop which would cause problems).  Any
+ additional labels can be emitted at this point.  In addition, if the
+ desired special iteration counter register was not allocated, this
+ machine dependent reorg pass could emit a traditional compare and jump
+ instruction pair.
+ 
+ The essential difference between the
+ @samp{decrement_and_branch_until_zero} and the @samp{doloop_end}
+ patterns is that the loop optimizer allocates an additional pseudo
+ register for the latter as an iteration counter.  This pseudo register
+ cannot be used within the loop (i.e., general induction variables cannot
+ be derived from it); however, in many cases the loop induction variable
+ may become redundant and be removed by the flow pass.
+ 
  
  @node Insn Canonicalizations
  @section Canonicalization of Instructions

