This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



RFC: Using (set (if_then_else ...)) on IA64 for division instructions


I have been looking at how we generate code to do division on IA64 to
see if I could get it to be scheduled better.  Currently IA64 generates
a sequence of instructions using cond_exec.  The problem with this is
that cond_exec cannot be expanded until after reload, which results in
the division instruction sequence being poorly scheduled.

For those not familiar with IA64, the division sequence is an frcpa
instruction (floating point reciprocal approximation) followed by a
series of add/mult/minus instructions that are predicated on a
predicate register set by frcpa.

One thought I had was to generate an explicit if_then_else around the
predicated instructions so that cond_exec was not needed.  This works
but the instruction scheduling is not much better.  I think the fact
that it adds explicit control flow to a previously straight-line piece
of code hampers the instruction scheduler.

Then I looked at how smaxdi3 and some other instructions were
implemented on alpha with an if_then_else inside a set instruction and
tried that.  This appears to be giving me the kind of code that I want,
but before I proceed much further I thought I would see if anyone
had any comments or critiques of this approach.

The division code sequences already use special instructions (using an
_alts suffix) in order to specify the use of a different floating point
status register, so my change would be to modify the _alts instructions
to also take a specific BImode predicate register and use that in an
if_then_else inside the set.

So, to use an XF multiplication instruction as an example, currently we
have:

(define_insn "*mulxf3_alts"
  [(set (match_operand:XF 0 "fr_register_operand" "=f")
        (mult:XF (match_operand:XF 1 "xfreg_or_fp01_operand" "fG")
                 (match_operand:XF 2 "xfreg_or_fp01_operand" "fG")))
   (use (match_operand:SI 3 "const_int_operand" ""))]
  ""
  "fmpy.s%3 %0 = %F1, %F2"
  [(set_attr "itanium_class" "fmac")])


and we use that instruction with a cond_exec to help implement the
division code.  I would change this to:


(define_insn "mulxf3_alts"
  [(set (match_operand:XF 0 "fr_register_operand" "=f")
        (if_then_else:XF (ne (match_operand:BI 1 "register_operand" "")
                             (const_int 0))
          (mult:XF
            (match_operand:XF 2 "xfreg_or_fp01_operand" "fG")
            (match_operand:XF 3 "xfreg_or_fp01_operand" "fG"))
          (match_dup 0)))
   (use (match_operand:SI 4 "const_int_operand" ""))]
  ""
  "(%1) fmpy.s%4 %0 = %F2, %F3"
  [(set_attr "itanium_class" "fmac")
   (set_attr "predicable" "no")])

so that I can use the instruction without a cond_exec and can now
express the division code sequence with a define_expand.  I haven't done
any performance measurements but I tried this with divsf3 and it did
give me better instruction scheduling.  With something like '(a/b) -
(c/d)', the current code does the two division sequences one after the
other.  With my changes the two sequences are completely interleaved.
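
To make the intended semantics concrete, here is a rough C model of the
idea.  It is only a sketch under stated assumptions: model_frcpa is a
simplified stand-in for frcpa (not the architected behaviour), the
refinement is a generic Newton-Raphson iteration rather than the exact
sequence the IA64 port emits, and all of the names (model_frcpa, mul_if,
madd_if, model_div) are made up for illustration.  The point is that
every step is a plain assignment of the form "dest = p ? op : dest",
which is what (set (if_then_else ...)) with (match_dup 0) in the else
arm expresses, so the sequence stays straight-line code.

```c
#include <float.h>
#include <stdbool.h>

/* Hypothetical stand-in for frcpa.  Assumption: for special operands
   (zero, infinity, NaN) it yields the final quotient and a false
   predicate; otherwise it yields a coarse estimate of 1/b and a true
   predicate.  Relies on IEEE 754 division semantics for a / b.  */
typedef struct { double value; bool pred; } frcpa_model;

static bool is_finite(double x)
{
    return x >= -DBL_MAX && x <= DBL_MAX;   /* false for NaN and +-inf */
}

static frcpa_model model_frcpa(double a, double b)
{
    frcpa_model r;
    if (!is_finite(a) || !is_finite(b) || a == 0.0 || b == 0.0) {
        r.value = a / b;                          /* final answer */
        r.pred = false;
    } else {
        r.value = (1.0 / b) * (1.0 + 1.0 / 512);  /* deliberately coarse */
        r.pred = true;
    }
    return r;
}

/* Models (set d (if_then_else (ne p 0) (mult a b) d)).  */
static double mul_if(bool p, double d, double a, double b)
{
    return p ? a * b : d;
}

/* Same shape for multiply-add; real hardware would fuse this.  */
static double madd_if(bool p, double d, double a, double b, double c)
{
    return p ? a * b + c : d;
}

/* Straight-line division: no branches, only predicated assignments.  */
static double model_div(double a, double b)
{
    frcpa_model f = model_frcpa(a, b);
    bool p = f.pred;
    double res = f.value;       /* 1/b estimate, or final quotient if !p */
    double y = 0.0, e = 0.0, t = 0.0, r = 0.0;

    e = madd_if(p, e, -b, res, 1.0);   /* e  = 1 - b*y0  */
    y = madd_if(p, y, res, e, res);    /* y1 = y0 + y0*e */
    e = madd_if(p, e, -b, y, 1.0);
    y = madd_if(p, y, y, e, y);        /* y2 */
    e = madd_if(p, e, -b, y, 1.0);
    y = madd_if(p, y, y, e, y);        /* y3 */
    t = mul_if(p, t, a, y);            /* q0 = a*y3      */
    r = madd_if(p, r, -b, t, a);       /* r  = a - b*q0  */
    res = madd_if(p, res, r, y, t);    /* q  = q0 + r*y3 */
    return res;                        /* untouched when p is false */
}
```

With this shape, two calls such as model_div(a, b) and model_div(c, d)
form two independent dependency chains with no control flow between
them, which is the property that lets the scheduler interleave them.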

Comments?  Am I missing any obvious problems?  Are there any special
problems with using if_then_else inside a set?

Steve Ellcey
sje@cup.hp.com

