This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Review, suggestions

I think you've made a mistake by adding the R constraint. You've done it in order to split memory alternatives into those appropriate for a delay slot, and those that aren't. However, there is a better way than replicating alternatives like that. Instead, use computed rather than literal attributes.

First, I'd drop all the "no_delay_*" members of attr "type". In addition, icmp isn't used and darith doesn't seem to be useful. We will need to add arith1 for arith operations with only a single rtx input.

That of course means that define_delay has to change. That's fine:

(define_delay (eq_attr "type" "branch,call,jump")
  [(cond [(eq_attr "type" "branch,call,jump,multi")
            (const_int 0)
          (eq_attr "type" "fadd,frsub,fmul,fdiv,fcmp,store,load")
            (and (eq_attr "length" "4")
                 (symbol_ref "microblaze_no_unsafe_delay")))
         (eq_attr "length" "4"))
   (nil) (nil)])

Aside from the more readable type filtering, we require that any instruction to be put in a delay slot be 4 bytes long.

Next, adjust length computation:

(define_attr "length" ""
  (cond [(eq_attr "type" "load")
            (if_then_else (match_operand 1 "double_memory_operand" "")
              (const_int 8)
              (const_int 4))
         (eq_attr "type" "store")
            (if_then_else (match_operand 0 "double_memory_operand" "")
              (const_int 8)
              (const_int 4))
         (eq_attr "type" "arith")
            (if_then_else (match_operand 2 "long_constant_operand" "")
              (const_int 8)
              (const_int 4))

         ;; Require "multi" insns to have a length specified.
	 (eq_attr "type" "multi")
           (symbol_ref "(gcc_unreachable(), 0)")

	 (eq_attr "type" "call")

          ;; ... any other special types I've forgotten
        (const_int 4)))

This is, of course, the *default*, which can be overridden on a case-by-case basis by the individual instructions. However, I think that I've just covered 95% of the patterns that aren't obviously multiple insns.

Which means that, for example,

(define_insn "zero_extendhisi2"
  [(set (match_operand:SI 0 "register_operand" "=d,d,d")
        (zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" "d,R,m")))]
  [(set_attr "type"     "no_delay_arith,load,no_delay_load")
  (set_attr "mode"      "SI,SI,SI")
  (set_attr "length"    "8,4,8")])

reduces to

(define_insn "zero_extendhisi2"
  [(set (match_operand:SI 0 "register_operand" "=d,d")
          (match_operand:HI 1 "nonimmediate_operand" "d,m")))
  [(set_attr "type" "arith,load")
   (set_attr "length" "8,*")])

Note that the length of the first alternative is specified explicitly since there's nothing in the rtl itself that could determine that this is an AND with a long immediate. The length of the second alternative is explicitly set to default, so the above computation will come into effect.

> (define_insn "*one_cmpldi2"
> (define_insn "anddi3"
> (define_insn "iordi3"
> (define_insn "xordi3"

Drop all of your double-word logical operations and associated splitters; they do just what the generic expander code would do.

> (define_insn "extendsidi2"
> [(set (match_operand:DI 0 "register_operand" "=d,d,d")
> (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "d,R,m")))]
> ""
> {
> if (which_alternative == 0)
> output_asm_insn ("addk\t%D0,r0,%1", operands);
> else
> output_asm_insn ("lw%i1\t%D0,%1", operands);
> output_asm_insn ("add\t%0,%D0,%D0", operands);
> output_asm_insn ("addc\t%0,r0,r0", operands);
> output_asm_insn ("beqi\t%0,.+8", operands);
> return "addi\t%0,r0,0xffffffff";
> }
> [(set_attr "type" "multi,multi,multi")
> (set_attr "mode" "DI")
> (set_attr "length" "20,20,20")])

Ug. Surely we can do better than this.

I suspect the best way to approach this problem is to let the generic code do most of the work. If we don't implement extendsidi2, then convert_move will call emit_store_flag, which will try (x >> 31). So all we have to do is make sure that's optimized well.

If TARGET_BARREL_SHIFT, then we're done. Otherwise,

(define_insn "*ashrsi3_by31"
  [(set (match_operand:SI 0 "register_operand" "=d")
        (ashiftrt:SI (match_operand:SI 1 "register_operand" "d")
                     (const_int 31)))]

That's already one instruction shorter than your sequence above, and it'll also be used for other things.

I think you can implement both arithmetic and logical right shift via rotate, similar to ashlsi3_with_rotate. E.g.

(define_insn "*ashrsi3_with_rotate"
  [(set (match_operand:SI 0 "register_operand" "=d")
        (ashiftrt:SI (match_operand:SI 1 "register_operand" "d")
                     (match_operand:SI 2 "const_int_operand" "n")))]
  "INTVAL (operands[2]) >= 17"
  int i, nshift = INTVAL (operands[2]);

operands[3] = gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM);

  /* The first bit is special; we want to replicate the sign bit to
     all bits of the destination.  All other bits shift in normally.  */
  output_asm_insn ("add\t%3,%1,%1", operands);
  output_asm_insn ("addkc\t%0,r0,r0", operands);
  output_asm_insn ("rsubk\t%0,%0,r0", operands);

  for (i = 31; i > nshift; --i)
      output_asm_insn ("add\t%3,%3,%3", operands);
      output_asm_insn ("addkc\t%0,%0,r0, operands);

Similarly for logical right shift.

> (define_insn "*ashlsi3_with_mul_delay"
>   [(set (match_operand:SI 0 "register_operand" "=d")
>         (ashift:SI (match_operand:SI 1 "register_operand"  "d")
>                    (match_operand:SI 2 "immediate_operand" "I")))]


> (define_insn "*ashlsi3_with_mul_nodelay"
>   [(set (match_operand:SI 0 "register_operand" "=d")
>         (ashift:SI (match_operand:SI 1 "register_operand"  "d")
>                    (match_operand:SI 2 "immediate_operand" "I")))]

Err, this either needs to be merged with the previous, or you need to use more specific predicates. So, either

  const_0_14_operand	-- for ashlsi3_with_mul_delay
  const_int_operand	-- for ashlsi3_with_mul_nodelay

or choose a letter constraint for 0..14 and then you can use 2 alternatives.

> (define_insn "*ashlsi3_with_rotate"

You can remove the final ANDI from by using MB_ABI_ASM_TEMP_REGNUM as an intermediate.

More to come...


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]