This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: Questions about peephole2



>On Wed, Apr 17, 2002 at 04:22:08PM -0400, Peter Barada wrote:
>> 	fmove.d (%a1),%fp1	| 36	movdf_v4e/1
>> 	fadd.d %fp1,%fp0	| 20	adddf3_v4e
>[..]
>> Insn 36 could be folded into insn 20 if it is known that %fp1 dies at
>> the end of instruction 20, producing the code:
>
>If adddf3_v4e ought to match here, I encourage you *not* to add random
>peepholes here to work around a failing in the register allocator.

My definition of adddf3_v4e is:

(define_insn "adddf3_v4e"
  [(set (match_operand:DF 0 "nonimmediate_operand" "=f")
	(plus:DF (match_operand:DF 1 "general_operand" "%0")
		 (match_operand:DF 2 "general_operand" "f<Q>U")))]
  "TARGET_CFV4E"
  "fadd%.d %2,%0")

where "f<Q>U" allows an FPU register or a pre-decrement, post-increment,
register-indirect, or register-plus-offset memory operand.

Since the ColdFire FPU can't deal with a constant or a symbolic
address as an operand, I had to modify SECONDARY_RELOAD_CLASS to
be:

enum reg_class
secondary_reload_class (class, mode, in)
     enum reg_class class;
     enum machine_mode mode;
     rtx in;
{
  int regno = -1;
  enum rtx_code code = GET_CODE (in);

  if (!TARGET_CFV4
      || (mode != SFmode && mode != DFmode))
    return NO_REGS;

  if (! CONSTANT_P (in))
    {
      regno = true_regnum (in);

      /* A pseudo is the same as memory.  */
      if (regno == -1 || regno >= FIRST_PSEUDO_REGISTER)
	code = MEM;
    }

  /* If we want to move a CONST_DOUBLE or a symbolic memory operand to/from
     a FP_REG, then use ADDR_REGS as the intermediary */
  if (class == FP_REGS &&
      (code == CONST_DOUBLE || (code == MEM && (symbolic_operand(in, mode)))))
    return ADDR_REGS;

  return NO_REGS;
}
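
To make the intent of that function easier to check, here is a minimal
standalone sketch of the same decision, outside of GCC.  Everything
prefixed toy_/TOY_ is hypothetical and only stands in for the real
register classes and RTX codes; it models what the comment in the
function says it wants (a CONST_DOUBLE or a symbolic memory operand
headed for FP_REGS goes through ADDR_REGS), not the exact behaviour of
symbolic_operand().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's register classes.  */
enum toy_reg_class { TOY_NO_REGS, TOY_ADDR_REGS, TOY_FP_REGS, TOY_DATA_REGS };

/* Hypothetical classification of the operand being reloaded.  */
enum toy_operand_kind {
  TOY_HARD_REG,      /* already lives in a hard register */
  TOY_PSEUDO,        /* unallocated pseudo, treated like plain memory */
  TOY_MEM_INDIRECT,  /* memory reached through an address register */
  TOY_MEM_SYMBOLIC,  /* memory addressed by a symbol such as .LC0 */
  TOY_CONST_DOUBLE   /* floating-point constant */
};

/* Return the class of the intermediate register needed to move IN
   to or from a register of class RCLASS, or TOY_NO_REGS if the move
   can be done directly.  */
enum toy_reg_class
toy_secondary_reload_class (bool target_has_fpu, enum toy_reg_class rclass,
                            bool is_float_mode, enum toy_operand_kind in)
{
  assert (in <= TOY_CONST_DOUBLE);

  /* Only float moves on the FPU-equipped target need special care.  */
  if (!target_has_fpu || !is_float_mode)
    return TOY_NO_REGS;

  /* The FPU cannot take a constant or a symbolic address directly,
     so the address has to be formed in an address register first.  */
  if (rclass == TOY_FP_REGS
      && (in == TOY_CONST_DOUBLE || in == TOY_MEM_SYMBOLIC))
    return TOY_ADDR_REGS;

  return TOY_NO_REGS;
}
```

In this model a constant going to TOY_FP_REGS asks for TOY_ADDR_REGS,
while a register-indirect operand or a pseudo asks for nothing, which
matches the lea/fmove.d pair that shows up after reload.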


I also created reload patterns following those in pa.md, and cloned
emit_move_sequence() from pa/pa.c:

(define_expand "reload_indf"
  [(set (match_operand:DF 0 "nonimmediate_operand" "=f")
	(match_operand:DF 1 "general_operand" "mf"))
   (clobber (match_operand:SI 2 "register_operand" "=&a"))]
  "TARGET_CFV4E"
  "
{
  if (emit_move_sequence (operands, DFmode, operands[2]))
    DONE;

  /* We don't want the clobber emitted, so handle this ourselves. */
  emit_insn (gen_rtx_SET (VOIDmode, operands[0], operands[1]));
  DONE;
}")


Here are the results, before and after reload.  From xx.c.19.lreg:

(insn 20 103 21 (set (reg:DF 34)
        (plus:DF (reg:DF 34)
            (const_double:DF (const_int 0 [0x0]) 0 [0x0] 0 [0x0] 1075806208 [0x401f8000]))) 141 {adddf3_v4e} (nil)
    (nil))

To xx.c.20.greg:

(insn 105 103 106 (set:DF (reg:SI 9 %a1)
        (symbol_ref/u:SI ("*.LC0"))) 41 {movsi_cfv4} (nil)
    (nil))

(insn 106 105 20 (set:DF (reg:DF 18 %fp2)
        (mem:DF (reg:SI 9 %a1) 0)) 57 {movdf_v4e} (nil)
    (nil))

(insn 20 106 21 (set (reg:DF 16 %fp0 [34])
        (plus:DF (reg:DF 16 %fp0 [34])
            (reg:DF 18 %fp2))) 141 {adddf3_v4e} (nil)
    (nil))

emit_move_sequence() creates the two-insn sequence:

(gdb) call debug_rtx_list(cfun->emit->x_first_insn, 100)

(insn 105 0 106 (set:DF (reg:SI 9 %a1)
        (symbol_ref/u:SI ("*.LC0"))) -1 (nil)
    (nil))

(insn 106 105 0 (set:DF (reg:DF 18 %fp2)
        (mem:DF (reg:SI 9 %a1) 0)) -1 (nil)
    (nil))
(gdb) 

And the resultant assembler:

	lea .LC0,%a1	| 105	movsi_cfv4/1
	fmove.d (%a1),%fp2	| 106	movdf_v4e/1
	fadd.d %fp2,%fp0	| 20	adddf3_v4e


With all that, is there anything I'm missing that prevents the add
(insn 20) from subsuming the indirect load (insn 106) created by the
reload pass?

If the combiner ran again at this point, it should be able to put insns
106 and 20 together, since it's just a memory-indirect load at this
point.

>Instead, help with the new register allocator implementation on
>new-regalloc-branch.

I'd love to, but I can't access CVS repositories through Motorola's
firewall.

>I encourage you to use predicates above such as
>
>  m68k_address_register_operand
>
>etc.  It's much much clearer.

Thanks for the suggestions.  I've modified it to be:

(define_peephole2
  [(set (match_operand:DF 0 "m68k_freg_operand" "")
        (mem:DF (match_operand:SI 1 "m68k_areg_operand" "")))
   (set (match_operand:DF 2 "m68k_freg_operand" "")
        (match_operator:DF 3 "math_operator"
                 [(match_dup 2)
                 (match_dup 0)]))]
  "TARGET_CFV4E && peep2_reg_dead_p (2, operands[0])
   && REGNO (operands[0]) != REGNO (operands[2])"
  ;; Reuse the matched operator so that non-add operations are preserved.
  [(set (match_dup 2)
        (match_op_dup:DF 3 [(match_dup 2)
                            (mem:DF (match_dup 1))]))]
  "")

where m68k_freg_operand() is:

int
m68k_freg_operand (x, mode)
     rtx x;
     enum machine_mode mode;
{
  if (!register_operand (x, mode) || !FP_REG_P (x))
    return 0;
  return 1;
}
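
For what it's worth, the transformation the peephole is meant to
perform can be sketched as a small standalone C model.  This is only
an illustration under assumptions: the toy_ types and the
tmp_dead_after flag are hypothetical stand-ins for the matched insns
and for peep2_reg_dead_p(), and register numbers are plain ints.  The
sketch carries the matched operator through to the fused insn
unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operator set standing in for "math_operator".  */
enum toy_op { TOY_ADD, TOY_SUB, TOY_MUL, TOY_DIV };

/* First insn:  fp_dest = mem[addr_reg]  */
struct toy_load { int fp_dest; int addr_reg; };

/* Second insn: fp_dest = fp_dest OP fp_src  */
struct toy_arith { enum toy_op op; int fp_dest; int fp_src; };

/* Fused insn:  fp_dest = fp_dest OP mem[addr_reg]  */
struct toy_fused { enum toy_op op; int fp_dest; int addr_reg; };

/* Try to fold LD into AR.  TMP_DEAD_AFTER says whether the loaded
   register is dead once AR has executed.  On success fill in *OUT
   and return true; otherwise leave *OUT alone and return false.  */
bool
toy_fold_load (const struct toy_load *ld, const struct toy_arith *ar,
               bool tmp_dead_after, struct toy_fused *out)
{
  assert (ld != NULL && ar != NULL && out != NULL);

  /* The arithmetic insn must consume the register that was loaded.  */
  if (ar->fp_src != ld->fp_dest)
    return false;

  /* The temporary must not also be the accumulator.  */
  if (ar->fp_dest == ld->fp_dest)
    return false;

  /* Dropping the load is only safe if nothing reads the temporary later.  */
  if (!tmp_dead_after)
    return false;

  out->op = ar->op;            /* keep the operator that was matched */
  out->fp_dest = ar->fp_dest;
  out->addr_reg = ld->addr_reg;
  return true;
}
```

With the registers from the dump above (%a1 = 9, %fp0 = 16, %fp2 = 18),
folding the load of %fp2 into the add succeeds only when %fp2 is dead
afterwards, and the result is an add of (%a1) into %fp0.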

-- 
Peter Barada                                   Peter.Barada@motorola.com
Wizard                                         781-852-2768 (direct)
WaveMark Solutions(wholly owned by Motorola)   781-270-0193 (fax)

